Micha


Timeline

Mini-Discussion by Micha 15 minutes
Class Discussion 30 minutes
Hackathon 90 minutes
Present and Reflect 10 minutes
Project Questions 5 minutes
Total 150 minutes

Datasets

You should already have both of these datasets. Refer to Challenge Week 12 for Weather Data and the Yelp Hackathon.

Repository

Submit your answers here

Teams

Get into teams of 4 or 5 for this hackathon. You select your team. Do it ASAP so we can get started :)

Small Group Discussion

You will have 15 minutes to discuss the following questions in your small group. Don’t linger too long on any one question; reason through a simple answer and move on. We’ll discuss these as a class before starting the hackathon:

  • Data questions
    • How could the weather be affecting what you find in the Yelp dataset? (Explain)
    • Could the Yelp dataset have any effect on the weather dataset?
    • What insights would you be able to get from the Yelp dataset? How could these be enhanced by connecting them to the weather dataset?
  • Methods Questions
    • What variables may contain useful cross-correlations between the two datasets?
    • What if you used the texts of the reviews instead of just the ratings? How could this change what you analyze?
  • Bias
    • What bias may exist in the datasets?
    • What if we told you that only Yelp reviews with ratings above 3 were in the dataset? How would this change what you could learn from it?
    • How would we work around the kinds of biases raised in the two questions above?

Hackathon Objectives

  1. Try to analyze how weather might affect Yelp reviews
  2. Consider visualizations for showing this relationship
  3. Try to tell a story with your findings
  4. Look for another relationship in these datasets

Objective 1: Analyze

Start by choosing a location to analyze, and choose 5 days that were rainy and 5 days that were sunny:

  • Pull the weather data for those days
  • Pull the Yelp reviews for those days
  • Is the average review rating on each day correlated with the precipitation?
  • Is it correlated with the temperature?
  • Hypothesize how you’d be able to make your conclusions stronger. Tell us about the potential biases and weaknesses given the current analysis.
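
The steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a prescribed method: it assumes you have already reduced the reviews to `(date, stars)` pairs and the weather to a `date -> precipitation` mapping, and the function names are hypothetical. With only 10 days, the resulting coefficient is a rough indication at best.

```python
from collections import defaultdict
from math import sqrt


def daily_mean_ratings(reviews):
    """Average star rating per day, given (date, stars) pairs (assumed layout)."""
    by_day = defaultdict(list)
    for day, stars in reviews:
        by_day[day].append(stars)
    return {day: sum(values) / len(values) for day, values in by_day.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rating_vs_weather(reviews, weather):
    """Correlate daily mean rating with a weather variable.

    `weather` maps date -> value (e.g. precipitation); only days present
    in both inputs are used.
    """
    means = daily_mean_ratings(reviews)
    days = sorted(set(means) & set(weather))
    return pearson([weather[d] for d in days], [means[d] for d in days])
```

Passing a `date -> temperature` mapping instead of precipitation answers the temperature question with the same function.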

Objective 2: Visualize

Think about how you’d visualize this relationship:

  • What visualizations would be helpful for showing this correlation?
  • Which would NOT be useful?

Objective 3: Tell a story

Take the analysis you did in Objective 1 and implement a visualization from Objective 2. Use these statistics and graphics to write up a small story that would convince a reader of what you’ve found. Imagine you’re being asked to do a (small) writeup for a newspaper or magazine.
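
As one possible starting point, the sketch below renders a dependency-free text bar chart comparing group averages. In practice you would likely use a plotting library; this version only illustrates the idea. The function name, the group labels, and the 5-star scale are assumptions for illustration.

```python
def text_bar_chart(group_means, scale=5.0, width=30):
    """Render one horizontal bar per group.

    `group_means` maps a label (e.g. "rainy") to a mean rating;
    `scale` is the assumed maximum rating, drawn as `width` characters.
    """
    label_width = max(len(label) for label in group_means)
    lines = []
    for label, value in group_means.items():
        bar = "#" * round(width * value / scale)
        lines.append(f"{label.ljust(label_width)} | {bar} {value:.2f}")
    return "\n".join(lines)
```

Call it with the mean ratings you computed for your rainy and sunny days, then print the returned string.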

Objective 4: Dig Deeper

With whatever time you have left, go through this workflow again, but look for a different correlation: either one your group has already noticed or one you want to test for. Start by telling us why you chose this correlation.

No need to submit everything for Objectives 1 and 2. Do your analysis and skip straight to choosing a visualization and writing up a story.