For the Code Louisville data analysis final project, I wanted to extend my work from the last cohort. This desire drove me to investigate Facebook.
In my data breach analysis, I found that Facebook had experienced the most reported data breaches. To pursue this further, I downloaded a dataset of Facebook's stock history from Kaggle. I wanted to see whether the breaches had any visible effect from a financial perspective.
Note: I started working on this project using Anaconda, which used Python 3.9.13. After careful consideration and a failed update, I decided to try Python 3.10.9.
Special Instructions: The pre-release version of Jupyter needs to be installed within Visual Studio Code.
Use the following syntax to install the required packages:
pip install <package>
- pandas
- seaborn
- matplotlib
- statistics (part of the Python standard library, so it does not need to be installed with pip)
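Before opening the notebooks, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the project itself; the helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# Packages named in the list above. statistics ships with Python itself.
REQUIRED = ["pandas", "seaborn", "matplotlib", "statistics"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", missing or "none")
```

Any name it reports can then be installed with `pip install <package>`.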
- Data Breach Analysis
  - This notebook is a modified version of the original analysis. It has been extended to build a pivot table of the annual averages of compromised records pertaining to Facebook, and it writes that pivot table to a CSV file.
- Facebook Stock Analysis
  - This notebook builds a pivot table of monthly stock averages and writes it to a CSV file.
- Effects on Facebook Stocks
  - This notebook merges the pivot tables built by the Data Breach Analysis and Facebook Stock Analysis notebooks, then writes the merged data to a CSV file.
- Looker Studio
  - The visualization tool I used to build an interactive dashboard of the data. There is a link under the data section.
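The monthly pivot table described for the Facebook Stock Analysis notebook could look roughly like the sketch below. The sample rows are invented for illustration, and the choice of `Close` and `Volume` as the averaged columns is an assumption; the notebook itself reads the Kaggle CSV and may average other columns.

```python
import pandas as pd

# Hypothetical sample rows; the real notebook reads the Kaggle stock CSV.
stock = pd.DataFrame({
    "Date": ["01/03/2022", "01/04/2022", "02/01/2022", "02/02/2022"],
    "Close": [100.0, 110.0, 120.0, 140.0],
    "Volume": [1000, 3000, 2000, 4000],
})

# Split the MM/DD/YYYY date into Year and Month columns.
dates = pd.to_datetime(stock["Date"], format="%m/%d/%Y")
stock["Year"] = dates.dt.year
stock["Month"] = dates.dt.month

# Average the chosen columns for each (Year, Month) pair.
monthly = pd.pivot_table(
    stock,
    index=["Year", "Month"],
    values=["Close", "Volume"],
    aggfunc="mean",
)

# Print the table as CSV text; passing a file path to to_csv() writes a file instead.
print(monthly.to_csv())
```

The annual pivot table in the Data Breach Analysis notebook would follow the same pattern with `index=["Year"]`.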
Please build and run the Jupyter notebooks in the following order:
- Data Breach Analysis
  - Made a pivot table.
  - Used Jupyter's markdown cells to explain my thought process and code.
- Facebook Stock Analysis
  - Made a pivot table.
  - Used Jupyter's markdown cells to explain my thought process and code.
- Effects on Facebook Stocks
  - Performed a pandas merge with two data sets, then calculated some new values based on the new data set.
  - Used Jupyter's markdown cells to explain my thought process and code.
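The merge step in the last notebook might be sketched as follows. The data is invented, and the join key (`Year`), the left join, and the `YYYY-MM` layout of `Formatted Date` are all assumptions; the project only states that a pandas merge was performed and new values were calculated.

```python
import pandas as pd

# Hypothetical stand-ins for the two pivot-table CSV files.
breaches = pd.DataFrame({
    "Year": [2019, 2021],
    "Records": [400_000_000, 500_000_000],
})
stock = pd.DataFrame({
    "Year": [2019, 2019, 2020, 2021],
    "Month": [1, 2, 1, 1],
    "Close": [100.0, 110.0, 120.0, 140.0],
})

# A left merge keeps every stock row; years without a breach get NaN in Records.
merged = stock.merge(breaches, on="Year", how="left")

# New value: 1 if a breach was recorded for that year, otherwise 0.
merged["Breach"] = merged["Records"].notna().astype(int)

# New value: Year and Month combined into one label (format is assumed).
merged["Formatted Date"] = (
    merged["Year"].astype(str) + "-" + merged["Month"].astype(str).str.zfill(2)
)

print(merged)
```

A left join is used here so that months without a breach stay in the result, which makes it possible to compare breach and non-breach periods.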
While exploring the dataset, I settled on a few questions to ask of it:
- What are the highest and lowest stock prices?
- What are the highest and lowest numbers of shares traded?
- Is there a relationship between the price of the stock and the number of shares traded?
- Read two data files (JSON, CSV, Excel, etc.)
- Performed a pandas merge with two data sets, then calculated some new values based on the new data set.
- Made a dashboard to display the data.
- Made a pivot table.
- Built a custom data dictionary.
- Used Jupyter's markdown cells to explain my thought process and code.
Sources
- https://www.kaggle.com/datasets/hishaamarmghan/list-of-top-data-breaches-2004-2021
- https://www.kaggle.com/datasets/kalilurrahman/facebook-stock-data-live-and-latest
Facebook Data Dictionary
Column | Description | Data Type | Field Type |
---|---|---|---|
Date | Time Format: MM/DD/YYYY | object | Origin |
Year | Component of the Date | int32 | Origin |
Month | Component of the Date | int64 | Origin |
Formatted Date | Combination of Year and Month | object | Formulated |
Breach | Indication that a breach occurred | int64 | Formulated |
Records | The number of records that were compromised in a breach | object | Origin |
Open | The opening price of stock | float64 | Origin |
High | The highest price of stock | float64 | Origin |
Low | The lowest price of stock | float64 | Origin |
Close | The closing price of stock | float64 | Origin |
Volume | The number of shares traded | float64 | Origin |