Stack Overflow is a professional community for developers. This repository analyzes three years of the Stack Overflow Developer Survey, visualizes the results, and predicts future salaries for data scientists.

recodehive/Stackoverflow-Analysis

Stackoverflow Analysis Guidelines

This is the all-in-one place for documentation and help regarding the postman challenge.

👨‍💻 Demo Video

Final.video.mp4

👇 Prerequisites

Before installation, please ensure you have the following tools installed:

🛠️ Installation Steps

  1. Fork the project: Fork the sanjay-kv/Stackoverflow-Analysis repository. Follow these instructions on how to fork a repository.
  2. Clone the project: git clone git@github.com:your-username/Stackoverflow-Analysis.git
  3. Download the original data from the drive link.
  4. Place the downloaded file in the project folder, then open the notebook in Jupyter Notebook. Make sure the file path used in the notebook points to that file.
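
A quick way to confirm that the path from step 4 is correct is to read only the header row before loading the whole file. This is a minimal sketch, not the project's own loader; the `data` folder and the file name `survey_results_public.csv` are assumptions for illustration, so substitute the name of the file you actually downloaded.

```python
import csv
from pathlib import Path


def read_header(csv_path):
    """Return the column names of a CSV file, failing early if the path is wrong."""
    path = Path(csv_path)
    if not path.is_file():
        # Show the absolute path so a wrong working directory is easy to spot.
        raise FileNotFoundError(f"No survey file at {path.resolve()}")
    # newline="" is the setting the csv module recommends for files it reads.
    with path.open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle))


# Hypothetical location; adjust to where the downloaded file was placed.
DATA_FILE = Path("data") / "survey_results_public.csv"
```

In a notebook cell, `read_header(DATA_FILE)` then either lists the survey columns or raises an error that names the absolute path it tried.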

Development

We welcome contributions from all levels of experience. If you think the community would benefit from being walked through the steps you're going through, please share! ❤️

Finding Insights from Stack Overflow Developer Survey

Objective:

Analyze three years of Stack Overflow Developer Survey data to gain insights.

Goals:

  • Analyze the impact of higher education on the salary of the surveyed developers.
  • Investigate the impact of education/experience/responsibilities on gender inequalities.
  • Examine how participation rates differ across ethnicities.
  • Determine if there is a difference between men's and women's incomes.
  • Analyze how developers' interest in a language in the previous year relates to its increase in popularity in the current year.
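
The salary-related goals above come down to grouping respondents and comparing a summary statistic per group. The following is a minimal sketch under stated assumptions: the column names `Gender`, `Education`, and `Salary` are hypothetical stand-ins for the real survey columns, and the sample rows are made up purely to show the input shape. In a notebook the same grouping would more commonly be done with pandas; the standard library is used here only to keep the sketch dependency-free.

```python
import csv
import io
import math
from collections import defaultdict
from statistics import median


def median_salary_by(rows, group_col, salary_col="Salary"):
    """Group rows by group_col and return {group: median salary}.

    Rows whose salary is missing or not a number are skipped.
    """
    groups = defaultdict(list)
    for row in rows:
        try:
            salary = float(row[salary_col])
        except (KeyError, TypeError, ValueError):
            continue  # blank, "NA", or absent salary
        if math.isnan(salary):
            continue  # float("NaN") parses, so filter it explicitly
        groups[row.get(group_col) or "Unknown"].append(salary)
    return {group: median(values) for group, values in groups.items()}


# Made-up sample rows, not real survey data.
SAMPLE = """Gender,Education,Salary
Male,Bachelor,60000
Female,Bachelor,55000
Female,Master,70000
Male,Master,72000
Male,Bachelor,NA
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_education = median_salary_by(rows, "Education")
by_gender = median_salary_by(rows, "Gender")
print(by_education)
print(by_gender)
```

The median is used rather than the mean because salary distributions tend to be skewed by a small number of very high values.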

Stack Overflow is a professional community for developers that conducts an annual survey. Analyzing the survey data with modern tools lets us answer real-world questions about developers. The dataset covers 275 questions in total.

Project Goals:

  1. Analyze the last three years of Stack Overflow survey data to extract insights.
  2. Analyze the impact of higher education, experience, and responsibilities on salary and gender inequalities.
  3. Investigate participation rates based on ethnicity and differences in income between men and women.
  4. Explore the popularity of programming languages and predict their growth based on survey responses.
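
For the language-popularity goal, one possible starting point is to compare the share of respondents who wanted to work with a language in one year against the share who reported using it the following year. The sketch below assumes that multi-select answers are stored as semicolon-separated strings and that missing answers appear as blanks or `NA`; both are assumptions to verify against the downloaded files. The function names are illustrative and the sample answers are made up.

```python
from collections import Counter


def language_share(responses):
    """Return {language: fraction of answering respondents who listed it}.

    Each response is one respondent's multi-select answer, e.g. "Python;SQL".
    Blank answers and "NA" are ignored.
    """
    counts = Counter()
    answered = 0
    for answer in responses:
        if not answer or answer.strip() in ("", "NA"):
            continue
        answered += 1
        # A set avoids counting a language twice for the same respondent.
        counts.update({lang.strip() for lang in answer.split(";") if lang.strip()})
    if answered == 0:
        return {}
    return {lang: n / answered for lang, n in counts.items()}


def share_change(previous, current):
    """Percentage-point change in share per language between two years."""
    languages = set(previous) | set(current)
    return {
        lang: 100 * (current.get(lang, 0.0) - previous.get(lang, 0.0))
        for lang in languages
    }


# Made-up answers: "wanted next year" from one survey, "used" from the next.
wanted_previous_year = ["Python;Rust", "Python", "Java;Python", "NA"]
used_current_year = ["Python;Java", "Python", "Rust;Python", "Java"]

change = share_change(
    language_share(wanted_previous_year),
    language_share(used_current_year),
)
for lang, delta in sorted(change.items()):
    print(f"{lang}: {delta:+.1f} percentage points")
```

A comparison like this only describes how the two shares line up; it does not by itself show that earlier interest caused later adoption.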

Data Source and Background

The dataset comes from the annual Stack Overflow developer survey, covering responses from developers in 180 countries. The data are available as CSV files ranging from 40 to 150 MB, with responses from about 150,000 (1.5 lakh) survey participants.

Data Format

The data is in CSV format, with 252,199 observations and 62 variables.

Expected Work

Data wrangling tasks include handling null values and converting data types for analysis. Machine learning algorithms and data visualization will then be applied to the cleaned data.
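
The two wrangling tasks named above, handling null values and converting data, can be sketched as follows. The set of missing-value markers and the column names (`Country`, `Salary`, `YearsCoding`) are assumptions for illustration, and the sample rows are made up; inspect the real files to see which markers and columns they actually use.

```python
import csv
import io

# Strings treated as "no answer"; an assumed list, extend it as needed.
MISSING = {"", "NA", "N/A", "nan", "NaN", "null"}


def clean_row(row, numeric_cols):
    """Return a copy of row with missing markers as None and numeric columns as float."""
    cleaned = {}
    for key, value in row.items():
        # Non-string values (e.g. None for short rows) count as missing.
        text = value.strip() if isinstance(value, str) else ""
        if text in MISSING:
            cleaned[key] = None
        elif key in numeric_cols:
            try:
                # Drop thousands separators such as "1,200" before converting.
                cleaned[key] = float(text.replace(",", ""))
            except ValueError:
                cleaned[key] = None  # unparseable text is treated as missing
        else:
            cleaned[key] = text
    return cleaned


def null_counts(rows):
    """Count missing (None) values per column across cleaned rows."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


# Made-up sample rows, not real survey data.
SAMPLE = 'Country,Salary,YearsCoding\nIndia,"1,200",3\nGermany,NA,\n'

cleaned_rows = [
    clean_row(row, numeric_cols={"Salary", "YearsCoding"})
    for row in csv.DictReader(io.StringIO(SAMPLE))
]
print(cleaned_rows)
print(null_counts(cleaned_rows))
```

Counting nulls per column before modeling helps decide whether a column should be dropped, imputed, or kept as is.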

👨‍💻 Contributing

🛡️ License

This project is licensed under the MIT License - see the LICENSE file for details.

💪 Thanks to all Contributors

Thanks to all contributors for helping this project grow! 🍻

🙏 Support

Don't forget to leave a star ⭐️ for this project!

Crafted with ♥ by @sanjay-kv.

Here's a link to the project wiki: Stackoverflow Analysis Wiki