Timeline
| Activity | Duration |
| --- | --- |
| Mini-Discussion by Micha | 15 minutes |
| Class Discussion | 30 minutes |
| Hackathon | 90 minutes |
| Present and Reflect | 10 minutes |
| Project Questions | 5 minutes |
| Total | 150 minutes |
Datasets
You should already have both of these datasets. Refer to Challenge Week 12 for the weather data and to the Yelp Hackathon for the Yelp data.
Repository
Submit your answers here
Teams
Get into teams of 4 or 5 for this hackathon. You select your team. Do it ASAP so we can get started :)
Small Group Discussion
We are going to give you 15 minutes to discuss the following questions as a group. Don’t linger too long on any one question: reason through a simple answer and move on. We’ll discuss these as a class before starting the hackathon:
- Data questions
  - How could the weather be affecting what you find in the Yelp dataset? (Explain)
  - Could the Yelp dataset have any effect on the weather dataset?
  - What insights would you be able to get from the Yelp dataset? How could these be enhanced by connecting them to the weather dataset?
- Methods questions
  - What variables may contain useful cross-correlations between the two datasets?
  - What if you used the text of the reviews instead of just the ratings? How could this change what you analyze?
- Bias
  - What biases may exist in the datasets?
  - What if we told you that only Yelp reviews with ratings >3 were in the dataset? How would this change what you could learn from it?
  - How would we work around the kinds of biases listed in the above two questions?
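To make the ">3" question concrete, here is a minimal sketch of what that kind of filter does to a set of ratings. The star values are made up for illustration (they are not from the Yelp dataset), and the helper names `summarize` and `keep_above` are our own.

```python
from statistics import mean, pvariance


def summarize(stars):
    """Return the count, mean, and population variance of a list of ratings."""
    return {"n": len(stars), "mean": mean(stars), "variance": pvariance(stars)}


def keep_above(stars, threshold=3):
    """Keep only ratings strictly greater than the threshold."""
    return [s for s in stars if s > threshold]


# Made-up ratings, for illustration only.
toy_stars = [1, 2, 2, 3, 3, 4, 4, 4, 5, 5]

print("all reviews:  ", summarize(toy_stars))
print("only >3 stars:", summarize(keep_above(toy_stars)))
```

Compare the mean and the variance before and after filtering. With less spread left in the ratings, think about how much room remains for weather (or anything else) to show up as a difference.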
Hackathon Objectives
- Try to analyze how weather might affect Yelp reviews
- Consider visualizations for showing this relationship
- Try to tell a story with your findings
- Look for another relationship in this dataset
Objective 1: Analyze
Start by choosing a location to analyze, and choose 5 days that were rainy and 5 days that were sunny:
- Pull the weather data for those days
- Pull the Yelp reviews for those days
- Is the average review rating on each day correlated with the precipitation?
- Is it correlated with the temperature?
- Hypothesize how you’d be able to make your conclusions stronger. Tell us about the potential biases and weaknesses of the current analysis.
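One possible way to set up the steps above, using only the Python standard library. This is a sketch, not the required approach: it assumes you have already reduced the Yelp reviews to `(date, stars)` pairs and the weather data to a `{date: precipitation}` mapping for your chosen location, and the function names and toy values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def daily_mean_rating(reviews):
    """Average star rating per date from an iterable of (date, stars) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # date -> [sum of stars, count]
    for date, stars in reviews:
        totals[date][0] += stars
        totals[date][1] += 1
    return {date: total / count for date, (total, count) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def rating_vs_weather(reviews, weather):
    """Correlate daily mean rating with a weather value on the shared dates."""
    means = daily_mean_rating(reviews)
    days = sorted(set(means) & set(weather))
    return pearson([weather[d] for d in days], [means[d] for d in days])


# Made-up values, for illustration only.
toy_reviews = [("2015-03-01", 5), ("2015-03-01", 3), ("2015-03-02", 2),
               ("2015-03-03", 4), ("2015-03-04", 3)]
toy_precip = {"2015-03-01": 0.0, "2015-03-02": 12.5,
              "2015-03-03": 0.0, "2015-03-04": 4.0}

print(rating_vs_weather(toy_reviews, toy_precip))
```

Swapping the precipitation mapping for a `{date: temperature}` mapping gives the temperature version. With only 10 days, a correlation coefficient can swing a lot from a single unusual day, which is worth keeping in mind for the last question above.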
Objective 2: Visualize
Think about how you’d visualize this relationship:
- What visualizations would be helpful for showing this correlation?
- Which would NOT be useful?
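As one candidate to argue for or against, here is a minimal scatter plot sketch: one point per day, precipitation on the x-axis and mean rating on the y-axis. It assumes matplotlib is installed and that you already have `{date: value}` dictionaries for both quantities. The function names and the output filename are hypothetical.

```python
def scatter_points(daily_rating, daily_precip):
    """Pair (precipitation, mean rating) for the dates present in both dicts."""
    days = sorted(set(daily_rating) & set(daily_precip))
    return [(daily_precip[d], daily_rating[d]) for d in days]


def plot_scatter(points, path="rating_vs_precip.png"):
    """Save a scatter plot of the points to an image file and return its path."""
    # Imported here so the data-preparation step works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    xs = [precip for precip, _ in points]
    ys = [rating for _, rating in points]
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel("Daily precipitation (units as in your weather data)")
    ax.set_ylabel("Mean Yelp rating (stars)")
    ax.set_title("Mean daily rating vs. precipitation")
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_scatter(scatter_points(daily_rating, daily_precip))` writes the image. Whether a scatter plot is the right choice for only 10 days is part of what your group should discuss.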
Objective 3: Tell a story
Take the analysis you did in Objective 1 and implement a visualization from Objective 2. Use these statistics and graphics to write up a small story that would convince a reader of what you’ve found. Imagine you’re being asked to do a (small) writeup for a newspaper or magazine.
Objective 4: Dig Deeper
With whatever time you have left, go through this workflow again, but look for a different correlation: either one your group has already noticed or one you want to test for. Start by telling us why you chose this correlation.
No need to submit everything for Objectives 1 and 2. Do your analysis and skip straight to choosing a visualization and writing up a story.