Skip to content

asheone/udacity_data_eng_project1

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This is a repository for the Udacity's 'Data Modeling with Postgres' project made by Lukasz Aszyk.

It contains 6 files: etl.ipynb, test.ipynb, sql_queries.py, create_tables.py, etl.py, and README.md.

The first Jupyter notebook file etl.ipynb, contains the code prepared for further use in the etl.py file for the whole ETL process.

The second test.ipynb file contains the validation code and checks if all the tables were created and contains the required data.

The third file, sql_queries.py contains all the SQL statements, for tables dropping, creation, and inserting data into tables.

The create_tables.py file contains the execution python code for connecting to hte database and running the SQL queries.

The etl.py file contains 4 functions: process_song_file, process_log_file, process_data, and main. All the functions have the descriptions provided.

For running the project, first download the packages pip install goodtables, pprint, open the terminal and run the create_tables.py first and next run the etl.py file.

About

Udacity's Data Engineering Nanodegree repository

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published