This project aims to design, create, and utilize a database to efficiently manage financial data, including information on Bitcoin, Gold, and the S&P 500. The goal is to analyze how these assets behave in relation to various economic factors such as inflation, interest rates, and market volatility (VIX). Through this project, we demonstrate the process of database design, data insertion, advanced querying, and data visualization.
The data used in this project originates from a previous project where we consolidated financial data from multiple sources, resulting in the df_combined.csv file. This file includes:
- Assets Data: Historical prices of Bitcoin, Gold, and the S&P 500.
- Economic Factors: Information on market volatility (VIX), interest rates, CPI, and inflation.
- Data Integrity: Ensuring the consistency and accuracy of data transferred from the CSV to the database. Our data comes from the previous project multi-asset-financial-analysis
- Efficient Queries: Optimizing SQL queries for speed and accuracy, especially with large datasets.
-
How to design an efficient database to manage dates, assets, and key economic metrics?
- Solution: Implement a relational database model with well-defined tables and relationships, leveraging foreign keys to maintain data integrity.
-
How to insert data securely and efficiently into the database?
- Solution: Use Python scripts to automate the data insertion process, ensuring that data is inserted securely and efficiently, with integrity constraints enforced.
-
How to perform queries to retrieve relevant information from the database?
- Solution: Utilize SQL features like JOIN, GROUP BY, ORDER BY, CASE, and subqueries to extract and analyze the data. Use indexes to optimize query performance.
The project began with defining the problem: efficiently managing and analyzing financial data. Hypotheses were formulated on how to design the database structure to facilitate this analysis, focusing on optimizing the database for financial queries.
The database was designed using a relational model with three main tables:
dates
: Stores unique dates and their correspondingid_date
.assets
: Contains data on the prices and changes of Bitcoin, Gold, and the S&P 500, linked toid_date
.economic_factors
: Stores economic indicators such as VIX, interest rates, CPI, and inflation, also linked toid_date
.
The SQL scripts for creating these tables are included in the create_database.sql
file.
Data from df_combined.csv was inserted into the database using Python and SQLAlchemy. The insertion process was handled with care to maintain data integrity, utilizing foreign keys and ensuring correct relationships between tables.
Five advanced SQL queries were developed to analyze the data:
- Annual Growth Analysis: Calculated the yearly growth rates of Bitcoin, Gold, and the S&P 500.
- Monthly Average and Volatility: Analyzed the monthly average prices and volatility of the assets.
- Impact of Interest Rates on Bitcoin: Investigated how low and high interest rates affect Bitcoin's price and volatility.
- S&P 500 Growth and Inflation: Explored the relationship between the growth of the S&P 500 and inflation rates.
- Bitcoin Performance under High Volatility and Inflation: Analyzed Bitcoin's average performance and volatility during periods of high market volatility (VIX > 30) and inflation (inflation > 3%).
Two detailed visualizations were created to illustrate the results of the SQL queries:
- Annual Growth Visualization: A line graph showing the annual growth of Bitcoin, Gold, and the S&P 500.
- S&P 500 Growth vs. Inflation: A dual-axis line chart comparing the S&P 500’s growth rate with the average inflation rate over the years.
These visualizations provide a clear and comprehensive view of the financial data, helping to understand the trends and relationships between the variables.
- Efficient Database Design: A well-structured relational database is crucial for managing and analyzing financial data effectively.
- Secure and Efficient Data Insertion: Using Python and SQLAlchemy for data insertion ensures data integrity and efficiency.
- Powerful Querying Capabilities: Advanced SQL queries, when properly optimized, can yield deep insights into financial data.
- Visualization for Insight: Visualizing the results of queries is essential for interpreting and communicating the data's story.
- Scalability: Investigate ways to scale the database to handle even larger datasets, ensuring continued performance efficiency.
- Enhanced Security: Explore additional security measures to protect sensitive financial data, such as encryption and advanced access controls.
- Presentation: Project Presentation
- Database Creation Script: create_database.sql
- Query History: query_history.sql
This README provides an overview of the project, detailing the methodology, analysis, and key findings. For more information, please refer to the provided resources.