The goal of this project is to perform an extract, transform, and load (ETL) process that migrates data into a local Apache Spark cluster.
- Language: Python
- Technologies: Spark, GPG Encryption
- Decrypt the local GPG-encrypted CSV files (see the decryption sketch below)
- Load the CSV tabular data into Spark DataFrames (see the loading and Parquet sketch below)
- Save the DataFrame data to Parquet files (see the loading and Parquet sketch below)
- Write a query to determine the average age (see the query sketch below)
- Write a query to determine the age at the 75th percentile (see the query sketch below)
- Ask questions to get clarification
- Install Apache Spark (note: if setting up Spark locally proves too troublesome, you may use DuckDB instead; see the DuckDB sketch below)
- Write code in Python using the data files and GPG keys stored in this repo
- Commit your code to your repo and share the link
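
A minimal sketch of the decryption step using the python-gnupg package. The key file path `keys/private.key`, the `data/*.csv.gpg` layout, and the passphrase are all assumptions; adjust them to the actual repo contents.

```python
# Decrypt GPG-encrypted CSV files with python-gnupg.
# Paths, key file name, and passphrase below are illustrative assumptions.
from pathlib import Path

import gnupg

GNUPG_HOME = Path(".gnupg_home")          # isolated keyring for this project (assumption)
GNUPG_HOME.mkdir(exist_ok=True)
gpg = gnupg.GPG(gnupghome=str(GNUPG_HOME))

# Import the private key stored in the repo (file name is an assumption).
key_data = Path("keys/private.key").read_text()
gpg.import_keys(key_data)

# Decrypt every *.csv.gpg file into a plain CSV next to it.
for encrypted in Path("data").glob("*.csv.gpg"):
    output = encrypted.with_suffix("")    # strips the trailing .gpg
    with encrypted.open("rb") as fh:
        result = gpg.decrypt_file(
            fh,
            passphrase="CHANGE_ME",       # assumption: key is passphrase-protected
            output=str(output),
        )
    if not result.ok:
        raise RuntimeError(f"Failed to decrypt {encrypted}: {result.status}")
```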
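
A possible shape for the loading and Parquet step with PySpark. The file name `data/people.csv` and the `output/` directory are assumptions.

```python
# Load the decrypted CSV into a Spark DataFrame and persist it as Parquet.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl").getOrCreate()

# header/inferSchema assume the CSV has a header row (assumption).
df = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("data/people.csv")          # assumed file name
)
df.printSchema()

# Write the DataFrame out as Parquet; overwrite keeps reruns idempotent.
df.write.mode("overwrite").parquet("output/people.parquet")
```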
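
One way to express the two required queries, assuming the dataset exposes a numeric `age` column (an assumption). `percentile_approx` is used here because exact percentiles are expensive in Spark.

```python
# Average age and age at the 75th percentile over the Parquet output.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-queries").getOrCreate()
df = spark.read.parquet("output/people.parquet")   # path from the previous sketch

# Average age.
avg_age = df.agg(F.avg("age").alias("avg_age")).collect()[0]["avg_age"]

# Age at the 75th percentile (approximate).
p75_age = df.agg(
    F.percentile_approx("age", 0.75).alias("p75_age")
).collect()[0]["p75_age"]

print(f"average age: {avg_age:.1f}, 75th percentile age: {p75_age}")
```

The same queries could also be written as Spark SQL against a temporary view; the DataFrame API is shown only because it keeps the example self-contained.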
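
If Spark proves too troublesome to set up locally, the note above allows DuckDB instead. A sketch of the same pipeline in DuckDB follows; file and column names are again assumptions.

```python
# DuckDB fallback: CSV -> Parquet, then the two queries.
import duckdb

con = duckdb.connect()

# DuckDB can read the decrypted CSV directly and write Parquet.
con.execute("""
    COPY (SELECT * FROM read_csv_auto('data/people.csv'))
    TO 'output/people.parquet' (FORMAT PARQUET)
""")

# Average age and age at the 75th percentile against the Parquet file.
avg_age, p75_age = con.execute("""
    SELECT avg(age), quantile_cont(age, 0.75)
    FROM read_parquet('output/people.parquet')
""").fetchone()
print(f"average age: {avg_age:.1f}, 75th percentile age: {p75_age}")
```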