Timeline
Show & Tell | 20 minutes |
Project | 10 minutes |
Challenges | 10 minutes |
Hackathon | 100 minutes |
Presentation | 10 minutes |
Total | 150 minutes |
Dataset
Data about this course’s GitHub organization.
https://github.com/CSCI-4830-002-2014
Objective
- Write a Node.js script to grab all the events about our class’s repository.
- Redo all the analyses in Splunk III, Challenge 2 on the newer data.
- Conduct new analyses on the events related to week 3 (challenge, project, hackathon) using Splunk.
Prerequisites
Team
Form a team of four or five with classmates who are already sitting at your table. The teaching staff will walk around to help teams form. Introduce yourselves if you do not already know each other.
Objective 1: Node.js program for grabbing events
Problem: We want to retrieve all the events about the course from GitHub and save them in a JSON file that can be imported into Splunk. However, GitHub returns only one page of 30 events at a time.
Desired Solution: A Node.js program that makes multiple calls to GitHub’s API to retrieve events page by page and combines the results into a single list (i.e., a JSON array).
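The page-by-page idea can be sketched without touching the network. The snippet below is only an illustration under stated assumptions: `fakeFetchPage` is a made-up stand-in for a real GitHub call (it serves 70 fake events, 30 per page), and `fetchAllPages` is a hypothetical helper name, not part of any library. The loop keeps requesting the next page until a page comes back empty or a page limit is hit.

```javascript
// Hypothetical page source standing in for a GitHub API call.
// It serves 70 fake events in pages of 30, then an empty page.
function fakeFetchPage(page, callback) {
  const perPage = 30;
  const total = 70;
  const start = (page - 1) * perPage;
  const events = [];
  for (let i = start; i < Math.min(start + perPage, total); i++) {
    events.push({ id: i, type: 'PushEvent' });
  }
  callback(null, events);
}

// Request pages 1, 2, 3, ... until a page comes back empty or maxPages
// is reached, appending each page's events to one combined array.
function fetchAllPages(fetchPage, maxPages, done) {
  const all = [];
  function next(page) {
    if (page > maxPages) return done(null, all);
    fetchPage(page, function (err, events) {
      if (err) return done(err);
      if (events.length === 0) return done(null, all);
      Array.prototype.push.apply(all, events);
      next(page + 1);
    });
  }
  next(1);
}

fetchAllPages(fakeFetchPage, 10, function (err, events) {
  if (err) throw err;
  console.log(events.length + ' events collected');
});
```

In a real program, the stand-in would be replaced by a function that asks GitHub for the given page and passes the parsed events to the callback.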
Installation
You will need to install two Node packages:
Example
Below is a Node.js program for grabbing all forks of the repos for the three weeks’ challenges. Your task is to adapt this code to grab all events about our course’s repository (or as many as GitHub allows).
Download it and try running it from a terminal:
node getforks.js > forks.json
Open forks.json and you should see a list of forks as a JSON array.
Template
You can use this Gist as a template to develop your program.
Useful References
Github API | node-github |
GET /users/:username/followers | user#getFollowers |
GET /orgs/:org/events | events#getFromOrg |
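For comparison, the `GET /orgs/:org/events` endpoint can also be reached with Node's built-in `https` module instead of node-github. This is a rough sketch, not the course's method: `eventsRequestOptions` and `getEventsPage` are hypothetical helper names, and the `User-Agent` value is a placeholder (GitHub's API rejects requests that lack this header). The snippet sends no request when run; it only prints the path it would use.

```javascript
const https = require('https');

// Build request options for one page of an organization's public events.
// The User-Agent value is a made-up placeholder.
function eventsRequestOptions(org, page) {
  return {
    hostname: 'api.github.com',
    path: '/orgs/' + encodeURIComponent(org) + '/events?page=' + page,
    headers: { 'User-Agent': 'hackathon-events-script' }
  };
}

// Fetch one page and hand the parsed JSON array to callback(err, events).
function getEventsPage(org, page, callback) {
  https.get(eventsRequestOptions(org, page), function (res) {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      if (res.statusCode !== 200) {
        return callback(new Error('GitHub responded with status ' + res.statusCode));
      }
      let events;
      try {
        events = JSON.parse(body);
      } catch (e) {
        return callback(e);
      }
      callback(null, events);
    });
  }).on('error', callback);
}

// No request is sent here; this only prints the path that would be used.
console.log(eventsRequestOptions('CSCI-4830-002-2014', 2).path);
```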
Technical Discussion
- Why code in Node.js? Why not Postman?
- Which parameters must be set to get GitHub’s API to return data beyond the most recent 30 events?
- Which parameters of JSON.stringify are responsible for printing to the console in a pretty way?
- What is the role of the callback function in the call to github.repos.getForks?
- Why is an asynchronous library needed?
- Why does results need to be flattened?
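The flattening and pretty-printing questions concern plain JavaScript, so they can be explored in isolation. A small experiment, assuming `results` holds one inner array per page (the sample data here is made up):

```javascript
// Made-up sample: one inner array per page of results.
const results = [
  [{ id: 1 }, { id: 2 }],
  [{ id: 3 }]
];

// Flatten one level so the output is a single array of items
// rather than an array of pages.
const flat = [].concat.apply([], results);

// The third argument of JSON.stringify sets the indentation (2 spaces here);
// the second argument (a replacer) is left as null.
const pretty = JSON.stringify(flat, null, 2);
console.log(pretty);
```

Comparing this output with that of `JSON.stringify(results)` shows both effects at once: nested pages versus one list, and single-line versus indented text.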
Objective 2: Redo Analyses
2.a. What is the distribution of push requests over GitHub accounts?
2.b. What are different event types compared over time for the whole class?
2.c. Who had the most pull request events?
2.d. How many different kinds of pull request actions were made?
2.e. What is the distribution of opened pull requests over GitHub accounts?
2.f. What is the submission pattern (i.e., pull requests) of the “Week 2 challenge” over time?
Objective 3: Analyze Week 3
Come up with THREE interesting questions about week 3 (challenge, project, hackathon). For each question, compose a Splunk query and present a chart or table as an answer. Write one or two sentences to discuss your answer.
Presentation
Your team will give a 3-5 minute presentation to the entire class about the findings from your analyses.
Submission
Use a GitHub repository to submit your work. Follow this link to find the template repository for this hackathon. Fork, modify, commit, push, and make a pull request, just as you did for your other homework submissions.