Timeline
Show & Tell | 10 minutes
Project | 10 minutes
Hackathon | 110 minutes
Presentation | 10 minutes
Total | 150 minutes
Dataset
Yelp Data Challenge
Click the big red download button to start downloading.
Repository
https://github.com/CSCI-4830-002-2014/hackathon-yelp
Objectives
- Ingest Yelp data into a MongoDB database hosted on MongoLab [https://mongolab.com/]
- Query the database to ask interesting questions
- Serve results of interesting queries using express.js
- Create a D3 visualization of the results of interesting queries
Prerequisites
Team
The class will be divided into four teams. Team assignment will be facilitated by the teaching staff.
Objective 1: Ingest
Download the Yelp data, which has five datasets: business, tip, user, checkin, and review. Use MongoLab to create a MongoDB database to store the data. The free tier of MongoLab provides 500MB, which is sufficient for storing the business, tip, user, and checkin datasets (but not review). Create four collections, one per dataset.
Ingest the data into your database. The “tools” tab of MongoLab’s control panel provides some tips on how to do this.
If successful, you should see the four collections listed in the “Collections” tab.
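If you end up loading the files yourself, it helps to know their shape. Assuming each Yelp file stores one JSON object per line, a minimal Node.js sketch for turning a file's text into documents might look like the following. The helper name `parseJsonLines` and the sample records are made up for illustration and are not part of the hackathon skeleton.

```javascript
// Hypothetical helper: parse newline-delimited JSON into an array of documents.
// Assumes one JSON object per line; blank lines are skipped.
function parseJsonLines(text) {
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .map(function (line) { return JSON.parse(line); });
}

// Made-up sample in the rough shape of two "tip" records (not real Yelp data).
var sampleText =
  '{"type": "tip", "text": "Great coffee", "likes": 2}\n' +
  '{"type": "tip", "text": "Cash only", "likes": 0}\n';

var parsedDocs = parseJsonLines(sampleText);
console.log(parsedDocs.length);   // 2
console.log(parsedDocs[1].text);  // Cash only
```

The import commands suggested in MongoLab's control panel are probably the simpler route in practice; this sketch only illustrates the file format you are feeding them.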
Then, you should be able to connect to the database from a local terminal.
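The same username, password, and host are needed again later by the web server. Assuming the standard MongoDB connection string format (`mongodb://user:password@host:port/database`), they can be combined as sketched below. The helper name and all values are placeholders; use the ones MongoLab shows for your own database.

```javascript
// Hypothetical helper: build a standard MongoDB connection string.
// encodeURIComponent guards against characters such as '@' or ':' in credentials.
function buildMongoUri(user, password, host, port, dbName) {
  return 'mongodb://' +
    encodeURIComponent(user) + ':' + encodeURIComponent(password) +
    '@' + host + ':' + port + '/' + dbName;
}

// Placeholder values only (not a real host or account).
var uri = buildMongoUri('team1', 'p@ss:word', 'ds000000.example.com', 12345, 'yelp');
console.log(uri);
// mongodb://team1:p%40ss%3Aword@ds000000.example.com:12345/yelp
```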
List all the collections. Then try a simple query to find restaurants in the city of Middleton that are good for kids, like this:
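A sketch of such a query is shown here. It assumes the collection is named `business` and that each record has a `city` string, a `categories` array, and an `attributes` sub-document with a `Good for Kids` key; check one of your own documents to confirm the field names. The small `matches` function is only an in-memory stand-in that mimics how the query is matched, so the example runs without a database.

```javascript
// Assumed field names; verify against a document in your own collection.
var query = {
  city: 'Middleton',
  categories: 'Restaurants',
  'attributes.Good for Kids': true
};

// In the mongo shell, this would be run roughly as (collection name assumed):
//   show collections
//   db.business.find(query)

// Follow a dotted path such as 'attributes.Good for Kids' into a document.
function getPath(doc, path) {
  return path.split('.').reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, doc);
}

// Illustrative stand-in for find(): a scalar matches an array field
// if the array contains it; otherwise values must be equal.
function matches(doc, q) {
  return Object.keys(q).every(function (path) {
    var actual = getPath(doc, path);
    return Array.isArray(actual)
      ? actual.indexOf(q[path]) !== -1
      : actual === q[path];
  });
}

// Made-up documents, not real Yelp data.
var businesses = [
  { name: 'A', city: 'Middleton', categories: ['Restaurants'], attributes: { 'Good for Kids': true } },
  { name: 'B', city: 'Middleton', categories: ['Bars'], attributes: { 'Good for Kids': false } },
  { name: 'C', city: 'Madison', categories: ['Restaurants'], attributes: { 'Good for Kids': true } }
];

var kidFriendly = businesses.filter(function (b) { return matches(b, query); });
console.log(kidFriendly.map(function (b) { return b.name; })); // [ 'A' ]
```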
Submit a screenshot of your terminal output, similar to the above, to demonstrate that your team has accomplished the data ingestion step.
Objective 2: Query
For this objective, we will do something similar to the “bird strike” hackathon. Each person will contribute ONE interesting question and post it in the hackathon repository as an issue. To keep the questions reasonable, you should have a rough idea of how you would answer your own question.
After questions are posted, work as a team to tackle the questions raised by other teams. Post your answer as a comment. The answer should include (1) the MongoDB queries, (2) the MongoDB output, and (3) a one-sentence explanation of the output.
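To illustrate the three-part answer format, here is a sketch for a hypothetical question, “Which cities have the most businesses?”. It assumes a `business` collection with a `city` field. The pipeline uses the standard `$group`, `$sort`, and `$limit` aggregation stages; the `topCities` function is an in-memory equivalent, included only so the example runs without a database, and the input is made up.

```javascript
// (1) Query, as it might be typed in the mongo shell (collection name assumed):
//   db.business.aggregate(pipeline)
var pipeline = [
  { $group: { _id: '$city', count: { $sum: 1 } } },
  { $sort: { count: -1 } },
  { $limit: 2 }
];

// In-memory equivalent of the pipeline above, for illustration only.
function topCities(docs, limit) {
  var counts = {};
  docs.forEach(function (d) {
    counts[d.city] = (counts[d.city] || 0) + 1;
  });
  return Object.keys(counts)
    .map(function (city) { return { _id: city, count: counts[city] }; })
    .sort(function (a, b) { return b.count - a.count; })
    .slice(0, limit);
}

// Made-up input, not real Yelp data.
var sampleBusinesses = [
  { city: 'Madison' }, { city: 'Middleton' }, { city: 'Madison' },
  { city: 'Verona' }, { city: 'Middleton' }, { city: 'Madison' }
];

// (2) Output:
var top = topCities(sampleBusinesses, pipeline[2].$limit);
console.log(top);
// [ { _id: 'Madison', count: 3 }, { _id: 'Middleton', count: 2 } ]

// (3) Explanation: in this sample, Madison has the most businesses (3),
// followed by Middleton (2).
```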
Objective 3: Serve
Write a simple web server to serve the results of interesting queries. You are provided with skeleton code as an example you can build upon. Clone the hackathon repository. Get the server code to run on localhost. You will need to enter the username, password, and URL to access the database on MongoLab. Open a browser and point it to http://localhost:3000. You should be able to see something like this:
Open the json link under Good for Kids in Middleton. You should be able to see:
Open the html link under Good for Kids in Middleton. You should be able to see:
In app.js, modify the code for q1 and q2 to serve two of the questions your team came up with (which were answered by other teams). The lines marked by TODO suggest where you may need to make changes.
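One possible way to structure the q1 and q2 changes is to keep the query separate from the code that sends the response. The sketch below is not the skeleton's actual code: `makeJsonHandler`, `findDocs`, and the route path are hypothetical names, and the fake database call and response object exist only so the example runs without express or MongoDB. The `res.json` and `res.status` calls are the usual express response methods.

```javascript
// Hypothetical handler factory. `findDocs(query, callback)` stands in for
// whatever database call the skeleton's app.js actually makes.
function makeJsonHandler(findDocs, query) {
  return function (req, res) {
    findDocs(query, function (err, docs) {
      if (err) {
        res.status(500).json({ error: String(err) });
        return;
      }
      res.json(docs);
    });
  };
}

// With express, the handler would be registered roughly like this
// (the route path is an assumption, not taken from the skeleton):
//   app.get('/q1/json', makeJsonHandler(findDocs, { city: 'Middleton' }));

// Stand-ins so the sketch runs without express or a database.
function fakeFindDocs(query, callback) {
  callback(null, [{ name: 'Example Cafe', city: query.city }]);
}

var sentBody = null;
var fakeRes = {
  status: function (code) { this.statusCode = code; return this; },
  json: function (body) { sentBody = body; }
};

makeJsonHandler(fakeFindDocs, { city: 'Middleton' })({}, fakeRes);
console.log(JSON.stringify(sentBody));
// [{"name":"Example Cafe","city":"Middleton"}]
```

Serving a different question then only requires passing a different query (or a different `findDocs`) for q1 and q2.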
After you are done, commit your changes to app.js.
Submit the content of the web pages served by these links as screenshots:
Objective 4: Visualize
Create a custom D3 visualization for an interesting query that returns lots of data points. Once you have the skeleton code running on localhost, open the d3 link under the example question. You should see something like below. Use Chrome to examine the ‘dataset’ variable in the JavaScript console. Make sure the data is properly loaded. There should be three items.
Also, verify that there are three g elements corresponding to the three data items.
As a warm-up exercise, try to display ALL businesses in Middleton. Figure out what you need to modify in app.js to produce a visualization like below.
Design a custom visualization for the interesting query you’ve picked. Draw a sketch first. Implement the visualization by modifying views/custom.html.
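Before drawing, it often helps to reduce each document to just the values the visualization needs. The sketch below maps business records to circle attributes, assuming each record has `name`, `longitude`, `latitude`, and `stars` fields. The hand-written `linearScale` is a stand-in for a D3 linear scale so that the example runs outside the browser; in views/custom.html you would likely use D3's own scale instead. The data is made up.

```javascript
// Minimal stand-in for a D3 linear scale: maps [d0, d1] onto [r0, r1].
function linearScale(d0, d1, r0, r1) {
  return function (value) {
    if (d1 === d0) return (r0 + r1) / 2; // degenerate domain: use the midpoint
    var t = (value - d0) / (d1 - d0);    // position within the domain, 0 to 1
    return r0 + t * (r1 - r0);
  };
}

// Turn business documents into circle attributes for a width x height canvas.
function toCircles(items, width, height) {
  var lons = items.map(function (d) { return d.longitude; });
  var lats = items.map(function (d) { return d.latitude; });
  var x = linearScale(Math.min.apply(null, lons), Math.max.apply(null, lons), 0, width);
  // SVG y grows downward, so the range is flipped to keep north at the top.
  var y = linearScale(Math.min.apply(null, lats), Math.max.apply(null, lats), height, 0);
  return items.map(function (d) {
    return { name: d.name, cx: x(d.longitude), cy: y(d.latitude), r: 2 * d.stars };
  });
}

// Made-up data, not real Yelp records.
var sampleDataset = [
  { name: 'A', longitude: -89.6, latitude: 43.0, stars: 4 },
  { name: 'B', longitude: -89.4, latitude: 43.2, stars: 3 },
  { name: 'C', longitude: -89.5, latitude: 43.1, stars: 5 }
];

var circles = toCircles(sampleDataset, 400, 300);
console.log(circles);
```

Here the westernmost, southernmost business (A) lands at the bottom-left corner and the easternmost, northernmost one (B) at the top-right, with the radius proportional to the star rating.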
When you are done, commit your changes. Then, in the template (README.md), submit (1) a photo of your hand-drawn sketch, and (2) a screenshot of the visualization shown in a web browser.