zanete/udacity-da-p3

OpenStreetMap Data Wrangling Project

Zanete Ence

This directory contains a set of scripts that audit and clean OpenStreetMap XML data and convert it to JSON for MongoDB. It contains the following files:

  • map.txt: contains the URL to the full download of OpenStreetMap XML for the area of Crawley, West Sussex, United Kingdom
  • crawley.osm: a random sample of the original dataset
  • Python scripts:
    • audit.py: initial pass over the dataset to confirm schema assumptions and produce an overview report
    • audit_tags.py: secondary pass over the data, focused on the contents of the tag elements; produces a CSV with all key-value pairs encountered
    • clean.py: cleans and shapes the original XML data and transforms it into a JSON file containing a list of map entities

If the code has been run:

  • tag_audit_report.csv: an export of all the key-value pairs encountered and their counts (produced by audit_tags.py)
  • clean_data.json: an export of map entities in JSON format (produced by clean.py)
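To illustrate the kind of shaping step described for clean.py, here is a minimal sketch, not the script's actual code. The inline sample data, the function names `shape_element` and `shape_file`, and the output field names (`type`, `pos`, `tags`, `node_refs`) are all assumptions made for this example.

```python
import io
import json
import xml.etree.ElementTree as ET

# Tiny inline sample standing in for crawley.osm (hypothetical data).
SAMPLE_OSM = b"""<osm>
  <node id="1" lat="51.11" lon="-0.18">
    <tag k="amenity" v="pub"/>
    <tag k="addr:street" v="High Street"/>
  </node>
  <way id="2">
    <nd ref="1"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>"""


def shape_element(element):
    """Turn a node or way element into a dict; return None for anything else."""
    if element.tag not in ("node", "way"):
        return None
    entity = {"type": element.tag, "id": element.get("id")}
    # Nodes carry coordinates; ways do not.
    if element.get("lat") is not None and element.get("lon") is not None:
        entity["pos"] = [float(element.get("lat")), float(element.get("lon"))]
    # Collect child <tag k="..." v="..."/> elements into a plain dict.
    tags = {tag.get("k"): tag.get("v") for tag in element.iter("tag")}
    if tags:
        entity["tags"] = tags
    # Ways reference their member nodes through <nd ref="..."/> children.
    refs = [nd.get("ref") for nd in element.iter("nd")]
    if refs:
        entity["node_refs"] = refs
    return entity


def shape_file(source):
    """Stream the XML and return a list of shaped map entities."""
    entities = []
    for _, element in ET.iterparse(source, events=("end",)):
        shaped = shape_element(element)
        if shaped is not None:
            entities.append(shaped)
    return entities


entities = shape_file(io.BytesIO(SAMPLE_OSM))
print(json.dumps(entities, indent=2))
```

For a real run, `shape_file` could be given the path to crawley.osm instead of the in-memory sample, and the resulting list written out with `json.dump`.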

Libraries used:

  • xml
  • pprint
  • pandas
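As a rough sketch of how a tag audit like the one described for audit_tags.py might count key-value pairs, the example below streams the XML and tallies each pair. It is an illustration only: the sample data, the function names `count_tag_pairs` and `write_report`, and the column names are assumptions. It also uses the standard library's `csv` and `collections.Counter` in place of pandas so that it runs without extra dependencies.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter
from pprint import pprint

# Tiny inline sample standing in for crawley.osm (hypothetical data).
SAMPLE_OSM = b"""<osm>
  <node id="1" lat="51.11" lon="-0.18">
    <tag k="amenity" v="pub"/>
  </node>
  <way id="2">
    <tag k="highway" v="residential"/>
  </way>
  <way id="3">
    <tag k="highway" v="residential"/>
  </way>
</osm>"""


def count_tag_pairs(source):
    """Count every (key, value) pair found on <tag> elements."""
    counts = Counter()
    for _, element in ET.iterparse(source, events=("end",)):
        if element.tag == "tag":
            counts[(element.get("k"), element.get("v"))] += 1
    return counts


def write_report(counts, out):
    """Write the counts as CSV rows, most frequent pair first."""
    writer = csv.writer(out)
    writer.writerow(["key", "value", "count"])  # assumed column names
    for (key, value), count in counts.most_common():
        writer.writerow([key, value, count])


pair_counts = count_tag_pairs(io.BytesIO(SAMPLE_OSM))
pprint(dict(pair_counts))

report = io.StringIO()
write_report(pair_counts, report)
print(report.getvalue())
```

With pandas available, the same counts could instead be loaded into a DataFrame and exported with `to_csv`.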
