Project Description

An analysis of NYC taxicab trip data using Python and Spark.

(Incomplete) Instructions

Download the full dataset from http://www.andresmh.com/nyctaxitrips/, or use the subset in data/

Download the weather data using python/get_weather_data.py (fill in your forecast.io API key first)

Fix hardcoded paths in python/generate-models.py to point to the correct data and python directories

Run locally with spark-submit
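
As a rough illustration of the last step, a local run could be scripted from Python as sketched below. This assumes spark-submit is on your PATH and that the hardcoded paths in python/generate-models.py have already been fixed. The helper names build_submit_command and run_locally are hypothetical and not part of this repository; --master local[*] is the standard spark-submit option for running on all local cores.

```python
import subprocess


def build_submit_command(script="python/generate-models.py", master="local[*]"):
    """Return the spark-submit argument list for a local run (hypothetical helper)."""
    return ["spark-submit", "--master", master, script]


def run_locally(script="python/generate-models.py"):
    """Invoke spark-submit; assumes it is installed and available on PATH."""
    # check=True raises CalledProcessError if spark-submit exits with an error.
    return subprocess.run(build_submit_command(script), check=True)
```

Calling run_locally() from the repository root would then launch the job; running the equivalent spark-submit command directly in a terminal works just as well.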

ToDo: Clean up hardcoded paths
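
One possible way to tackle this ToDo is to read the data and python directories from the command line instead of hardcoding them. The sketch below is only an assumption about how that could look: the flag names --data-dir and --python-dir and the function parse_paths are hypothetical, and this is not what python/generate-models.py currently does.

```python
import argparse
from pathlib import Path


def parse_paths(argv=None):
    """Parse the data and python directory locations (hypothetical flags)."""
    parser = argparse.ArgumentParser(description="Locate input directories")
    parser.add_argument("--data-dir", type=Path, default=Path("data"),
                        help="directory holding the taxi trip data")
    parser.add_argument("--python-dir", type=Path, default=Path("python"),
                        help="directory holding the project's Python modules")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

With defaults pointing at the repository's own data/ and python/ directories, the script would run unchanged from the repository root, and other locations could be supplied per run.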

NOTE: This is still a work in progress; the model developed here is for illustration only