How to be successful at Hackathons and Datathons for Data Scientists

Regardless of your skill level, you can only benefit. Photo Credit: TAMU Datathon organizers.


  1. Introduction
  2. Approach
  3. Types of Challenges
  4. Teams
  5. Technical Knowledge
  6. Presentation
  7. Summary

Introduction

In the last couple of weeks, I attended two Hackathons and one Datathon. I received a lot of feedback on my endeavours and I want to share my experience with you. As an aspiring Data Scientist, you may have heard of such events and may be curious what they are all about. First off, the two kinds of events are very similar in format; they differ mainly in the tasks they present you with. They are usually hosted by university departments and organized by their students for other students. They are typically held on weekends in spring and fall and run for 24 hours, from Saturday morning until the same time on Sunday.

Hackathon events have now been around for about ten years and often incorporate Data Science and Machine Learning aspects. But within the last year a new type of -thon was created that focuses specifically on Data Science challenges: the DATAthon.


Google Trends for search term “datathon”. Takeoff in 2017.


Google Trends for event “Hackathon”. Takeoff in 2010.

Approach

First of all, do not think that you HAVE to win this competition, or that you need to be an expert to even be considered when you apply, especially if it is your first time. All backgrounds and experience levels are accepted and will be present, from first-semester Bachelor’s students to final-year PhD candidates. You will benefit a lot in any case, so just make the jump.

As an (aspiring) Data Scientist, you likely have a diverse background but not necessarily the pure coding experience that many Computer Science majors have. That is not an issue, since many challenges require specialists who know the types of data very intimately. And someone needs to know how to interpret the input and what questions to ask! You can form a team with people who complement your knowledge.

In order to prepare:

  • Apply for the Hackathon/Datathon before the deadline and do not forget to RSVP when you are approved.
  • Make it a habit to read up on Data Science news, for example on Towards Data Science or Medium.
  • Follow Data Science and Data Engineering influencers who write blog articles and make vlogs for YouTube.
  • Practice working with real-world data sets and participate in data science competitions. For example, you could work on Kaggle.
  • Look at how other people are solving problems, for example on blogs.
  • Get help from Stack Overflow.
  • Go to local data analysis and machine learning workshops and meetups.
  • Work with new algorithms you don’t know yet.
  • Create a GitHub repository: showcase your solutions and projects and fork popular teaching repositories (employers will often look into your GitHub and want to see an active portfolio).
  • Practice at home, over and over.


Photo Credit: TAMU Datathon organizers

Types of Challenges

Hackathons have relatively open-ended tasks and give you a lot of freedom in what exactly you solve. Hackathon problems require creativity to come up with a solution. This means you have to be cognizant of problems that society and individuals may have, and have an idea of how to solve them with technology and software.

However, Datathon and Data Science challenges are different. Often, sponsoring companies contribute their own data sets that pertain to their everyday business. Find out which companies are participating and you can estimate what kind of problems they want you to solve. For example, if a company works with business contracts and a lot of text in general, you might expect a Natural Language Processing challenge. An engineering company, on the other hand, might present a problem involving predictive maintenance and data received from IoT sensors.

In any case, all challenges are hard enough that you will have time for only a single one. So choose wisely!

Teams

Hackathons and Datathons allow you to form teams of up to four members. Few participants work alone, and teamwork helps to train your organizational skills, work division, and communication. Make use of this opportunity and communicate with other participants in advance. If you already have someone for your team, good. If not, don’t worry either. You will have chances to find team members on the day of the event, between registration and the start of hacking. However, if you have formed your team in advance, it will save you time on the competition day that you may want to use for actual work. Commonly, the organizers will set up a Slack channel where members can communicate starting right after acceptance into the event. Use it to find a team that is balanced in experience and specialty.


Photo Credit: TAMU Datathon organizers

Technical Knowledge

I reiterate: It does not matter how much or how little experience you have. You will learn new things regardless and you will be a worthwhile addition to your team. Joining a team that does NOT work on what you already know is actually an opportunity to expand your horizons. If you are not enrolled in a newly minted Data Science program at a university and are starting out from zero, I can recommend the following:

  • Learn coding in Python and/or R, the two most popular Data Science languages. Consider introductory and mostly free online courses on edX, Coursera, Udacity or other (in person) bootcamps. See also: PCWorld Best Python Courses. You do not need to become a wizard to wield powerful tools.
  • Get a deeper understanding of statistics than your undergrad 101 course. Data set exploration benefits from simple statistical tests and machine learning model evaluation relies heavily on statistics.
  • Continue with getting to know specific Data Science, Machine Learning, and then Deep Learning libraries for Python and R. For example, some popular ones in Python are Pandas, Numpy, Scipy, SciKit-Learn, SciKit-Image, TensorFlow, and Keras. Have a look at this list: Best Machine Learning and Deep Learning Courses.
  • Learn the workflows from data set exploration, modeling, evaluation, and how to iterate towards a better model.
  • Start using APIs to get access to real-world data. Many web services offer their own APIs to request large amounts of data with little effort. Look into the developer section of their website. For example, Twitter, Foursquare, and Wikipedia offer API access, just to name a few.
  • Learn about graph plotting and how to show your message visually. For example, plot with Python’s matplotlib and seaborn and R’s ggplot2. If you are bold, learn to create dashboards that let you manipulate graphs with sliders.
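
To make the workflow from exploration to modeling, evaluation, and iteration a bit more concrete, here is a minimal sketch. It uses only Python's standard library so that it runs anywhere; at an actual event you would more likely reach for Pandas and SciKit-Learn. The synthetic data set, all function names, and the choice of mean absolute error as the metric are assumptions made for illustration, not a prescribed method.

```python
import random
import statistics


def make_dataset(n=200, seed=0):
    """Synthetic stand-in for a sponsor data set: y is linear in x plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in xs]
    return xs, ys


def train_test_split(xs, ys, test_fraction=0.25, seed=0):
    """Shuffle the row indices and hold out a fraction for evaluation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return statistics.fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


xs, ys = make_dataset()
x_tr, y_tr, x_te, y_te = train_test_split(xs, ys)

# Step 1: explore. Look at simple summary statistics before modeling.
print("mean of y:", round(statistics.fmean(y_tr), 2))
print("stdev of y:", round(statistics.stdev(y_tr), 2))

# Step 2: baseline model. Always predict the training mean.
baseline = statistics.fmean(y_tr)
baseline_mae = mean_absolute_error(y_te, [baseline] * len(y_te))

# Step 3: iterate. Fit a straight line and evaluate on the same held-out set.
slope, intercept = fit_line(x_tr, y_tr)
line_mae = mean_absolute_error(y_te, [slope * x + intercept for x in x_te])

print("baseline MAE:", round(baseline_mae, 2))
print("line MAE:", round(line_mae, 2))
```

The point of the baseline is that every later model has something to beat on the same held-out data. If a more complex model does not improve on it, that is a finding worth presenting too.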

Presentation

When judging time comes, your team will have to present its project to a judge from the hosting university and sometimes to professional mentors. You will have 4-5 minutes to present what you did. Summarize the business question your project tackled and why it is relevant. Why is your solution innovative and how does it benefit the organization? Learn how to make effective (PowerPoint) slides and sum up a project without getting bogged down in details. Judges may be less interested in the exact technical details and will focus on results and interpretation. It is likely that the organizers will announce the exact judging criteria on the event website or Slack. Discuss which team member presents what, and communicate confidently. This is where graphing skills come in handy, because they distill your findings into something easy to grasp. If your team did not succeed in finishing the project, that is okay. Explain what your approaches were, how far you got, and what you found challenging.


Photo Credit: Christian Haller

Summary

Hackathons are a great place to learn and get your feet wet.

Data Science is by no means accessible only to Computer Science majors; it actually requires people with domain knowledge to pose the right questions and interpret data. That is true for your team as well. A Hackathon or Datathon is a chance to build this new skill set and learn new approaches. You do not have to be an expert to have your application approved. A Hackathon or Datathon may just be your way to embark on the journey towards Data Science.

If you have more questions, feel free to send me an email or a message on LinkedIn.

