This repository contains slides, notebooks, and datasets for the Machine Learning University (MLU) Decision Trees and Ensemble Methods class. Our mission is to make Machine Learning accessible to everyone. We have courses available across many topics of machine learning and believe knowledge of ML can be a key enabler for success. This class is designed to help you get started with tree-based models, learn about widely used Machine Learning techniques, and apply them to real-world problems.
Watch all class video recordings in this YouTube playlist from our YouTube channel.
There are five lectures, one final project and five assignments for this class.
Lecture 1
title | studio lab |
---|---|
Decision Trees | - |
Impurity Functions | - |
CART Algorithm | - |
Regularization | |
Lecture 2
title | studio lab |
---|---|
Bias-variance trade-off | - |
Error Decomposition | - |
Extra Trees Algorithm | |
Bias-variance and Randomized Ensembles | - |
Lecture 3
title | studio lab |
---|---|
Bootstrapping | |
Bagging | Bagging tree correlation |
Random Forests | |
Lecture 4
title | studio lab |
---|---|
Random Forest Proximities | - |
Some use cases for Proximities | - |
Feature Importance in Trees | |
Feature Importance in Random Forests | |
Lecture 5
title | studio lab |
---|---|
Boosting | |
Gradient Boosting | - |
XGBoost, LightGBM and CatBoost | CatBoost, LightGBM |
Final Project
title | studio lab |
---|---|
Final Project | |
Final Project: Practice working with a "real-world" computer vision dataset. The final project dataset is in the data/final_project folder. For more details, check out this notebook.
Interested in visual, interactive explanations of core machine learning concepts? Check out our MLU-Explain articles to learn at your own pace!
Articles relevant to this course include Decision Trees, Random Forest, and the Bias Variance Tradeoff.
If you would like to contribute to the project, see CONTRIBUTING for more information.
The license for this repository depends on the section. The data set for the course is provided to you by permission of Amazon and is subject to the terms of the Amazon License and Access. You are expressly prohibited from copying, modifying, selling, exporting or using this data set in any way other than for the purpose of completing this course. The lecture slides are released under the CC-BY-SA-4.0 License. The code examples are released under the MIT-0 License. See each section's LICENSE file for details.