nitanu32/Movie-Classification

Movie Classification

This dataset is derived from the Cornell Movie-Dialogs Corpus from Cornell University (http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html). The Data 8 team transformed the original data (e.g., converting the words to lowercase, removing the naughty words, and converting the counts to frequencies) to produce a new dataset containing the frequency of 5000 common words in each movie. This is my attempt to build a classifier that guesses whether a movie is a comedy or a thriller, using only how often words appear in the movie's screenplay. The project shows my ability to build a k-nearest-neighbors classifier and test it on data. It also involves exploratory data analysis using linear regression.
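To illustrate the general idea, here is a minimal sketch of a k-nearest-neighbors classifier over word-frequency vectors, written with plain NumPy. It is not the notebook's actual code: the function names (`distances`, `classify`, `accuracy`), the choice of Euclidean distance, and the default `k=3` are assumptions made for illustration.

```python
from collections import Counter

import numpy as np


def distances(train_features, point):
    """Euclidean distance from `point` to every row of `train_features`."""
    train_features = np.asarray(train_features, dtype=float)
    point = np.asarray(point, dtype=float)
    return np.sqrt(np.sum((train_features - point) ** 2, axis=1))


def classify(train_features, train_labels, point, k=3):
    """Return the majority label among the k training rows nearest to `point`."""
    nearest = np.argsort(distances(train_features, point))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_features, train_labels, test_features, test_labels, k=3):
    """Proportion of test rows whose predicted label matches the true label."""
    predictions = [
        classify(train_features, train_labels, row, k) for row in test_features
    ]
    return float(np.mean(np.array(predictions) == np.array(test_labels)))
```

Each row of `train_features` would hold one movie's word frequencies, and `train_labels` the matching genre ("comedy" or "thriller"). Using an odd `k` avoids tied votes when there are only two genres, and `accuracy` should be computed on movies that were not used for training.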

Tools: Jupyter Notebook, Python, NumPy, Matplotlib

Created as part of the Data 8 class at UC Berkeley.
