KeithGalli/disney-data-science-tasks

Disney Dataset Creation & Analysis

In this video we walk through a series of data science tasks to create a dataset on Disney movies and analyze it, using Python with BeautifulSoup, requests, and several other libraries along the way.

Setup

To access all of the files, I recommend that you fork this repo and then clone it locally. Instructions on how to do this can be found here: https://help.github.com/en/github/getting-started-with-github/fork-a-repo

The other option is to click the green "clone or download" button and then click "Download ZIP". You should then extract all of the files to the location where you want to edit your code.

Installing Jupyter Notebook: https://jupyter.readthedocs.io/en/latest/install.html

Background Information

This repo goes along with my video "Solving real world data science tasks with Python BeautifulSoup!"

In this video we scrape Wikipedia pages to create a dataset on Disney movies.

The video is formatted with tasks for you to try to solve on your own throughout. For the best learning experience, at each task you should pause the video, try the task on your own, and then resume when you want to see how I would solve it.

We cover a wide range of Python & data science topics in this video. They include:

  • Web scraping with BeautifulSoup
  • Cleaning data
  • Testing code with pytest
  • Pattern matching with regular expressions (re library)
  • Working with dates (datetime library)
  • Saving & loading data with the pickle library
  • Accessing data from an API using the requests library
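
To give a feel for the cleaning, regular expression, and date topics listed above, here is a minimal sketch using only the standard library. The function names and the sample strings are hypothetical and are not taken from the notebooks; they only illustrate the kind of conversion that scraped infobox values often need.

```python
import re
from datetime import datetime


def minutes_to_int(running_time):
    """Return the first integer found in a string such as '85 minutes'."""
    match = re.search(r"\d+", running_time)
    return int(match.group()) if match else None


def parse_release_date(text):
    """Parse a date string into a datetime, or return None if it does not match."""
    # Drop bracketed footnotes like [1] and parenthesised notes like (United States).
    cleaned = re.sub(r"\[.*?\]|\(.*?\)", "", text).strip()
    # Try the two date layouts this sketch assumes; real data may need more.
    for fmt in ("%B %d, %Y", "%d %B %Y"):
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    return None
```

Returning None for values that cannot be parsed keeps bad rows visible instead of raising partway through a long scrape, which is one reasonable design choice among several.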

To see the steps to create the dataset, check out dataset-creation.ipynb.
In a future video we will analyze the dataset in dataset-analysis.ipynb.

Save/Load the Datasets

  • If you want to jump into a specific task, feel free to utilize the dataset checkpoints.
  • To load these files you can look at the functions found in this file.
  • If you want to just do analysis on the final dataset, check out this folder.
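
As a rough idea of how a pickle checkpoint can be written and read back, here is a minimal sketch. The helper names `save_data` and `load_data`, the file name, and the sample record are assumptions made for illustration; the repo's own loading functions may differ.

```python
import os
import pickle
import tempfile


def save_data(path, data):
    """Serialize any picklable Python object to the given file path."""
    with open(path, "wb") as f:
        pickle.dump(data, f)


def load_data(path):
    """Load a previously pickled object from the given file path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Made-up record used only to demonstrate the round trip.
movies = [{"title": "Example Movie", "running_time": 90}]

# Write to a temporary directory so the demo does not touch the repo files.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pickle")
save_data(path, movies)
restored = load_data(path)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.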

About

Creation of a Disney Movie Dataset & Analysis using Python
