marilenalaz/Big-Data-Analytics-Health-insurance

Health insurance marketplace

Over the last decade, the cost of healthcare has risen gradually, a trend that has attracted substantial attention from economists, data scientists, and political circles. The increase intensified in the wake of the prolonged restrictions imposed during the COVID-19 pandemic. The two-year period of intermittent confinement and public health restrictions raised global awareness of the complexities and vulnerabilities inherent in healthcare systems, and of the growing role of private insurance companies as financial intermediaries between the state and citizens.

Based on the theoretical findings on the use of Big Data in healthcare, I used the Apache Spark ecosystem to implement exploratory analysis and predictive analytics.

The goal of the project is to understand the United States health insurance marketplace for the years 2014-2016 through visualization. The input data are published by the U.S. Department of Health and Human Services (https://www.kaggle.com/datasets/hhs/health-insurance-marketplace). The geographic data used to create the maps come from the U.S. Census Bureau (https://www.census.gov/cgi-bin/geo/shapefiles/index.php). The code is written in Python, using the Apache Spark framework for its ELT (extract, load, transform) capabilities. Each data subset that is generated is saved in the NoSQL database Apache Cassandra.
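The workflow described above (load the data with Spark, derive a subset, save it to Cassandra) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the dataset's `Rate.csv` file with columns `BusinessYear`, `StateCode`, and `IndividualRate`. The keyspace and table names are hypothetical, the Cassandra table is assumed to exist already with matching lowercase column names, and the Spark session is assumed to be started with the DataStax Spark Cassandra Connector on its classpath.

```python
# Sketch of a Spark -> Cassandra step. Column names, keyspace, and table
# names are assumptions made for illustration only.

def cassandra_write_options(keyspace, table):
    """Build the options expected by the Spark Cassandra Connector."""
    return {"keyspace": keyspace, "table": table}


def load_rates(spark, path):
    """Read the raw rates CSV (assumed file: Rate.csv) into a DataFrame."""
    return spark.read.csv(path, header=True, inferSchema=True)


def average_rate_by_state(rates_df):
    """Derive one subset: the mean individual rate per year and state."""
    # Imported lazily so the module can be loaded without PySpark installed.
    from pyspark.sql import functions as F

    return (
        rates_df
        .groupBy(
            F.col("BusinessYear").alias("business_year"),
            F.col("StateCode").alias("state_code"),
        )
        .agg(F.avg("IndividualRate").alias("avg_individual_rate"))
    )


def save_subset(df, keyspace, table):
    """Append a derived subset to an existing Cassandra table."""
    (
        df.write
        .format("org.apache.spark.sql.cassandra")
        .options(**cassandra_write_options(keyspace, table))
        .mode("append")
        .save()
    )
```

With a running `SparkSession` named `spark`, a call sequence such as `save_subset(average_rate_by_state(load_rates(spark, "Rate.csv")), "health", "avg_rate_by_state")` would write the aggregated subset, where `health` and `avg_rate_by_state` are placeholder names.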
