Disclaimer: This repository is a sketchbook for learning the background of decision tree algorithms. It is neither clean nor readable. For a clean implementation, please see the Chefboost repository.
This is the repository of the Decision Trees for Machine Learning online course published on Udemy. The course covers the algorithms listed below. The whole project is developed in Python (3.6.4), and no out-of-the-box library or framework is used to build the decision trees.
1- ID3
2- C4.5
3- CART (Classification And Regression Trees)
4- Regression Trees (CART for regression)
5- Random Forest
6- Gradient Boosting Decision Trees for Regression
7- Gradient Boosting Decision Trees for Classification
8- Adaboost
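Since the trees are built without a ready-made library, it may help to see what the core computation can look like. The following is a minimal sketch of the entropy and information gain measures that ID3 relies on to choose a split. It is not the code from this repository: the function names, the toy rows and the class labels are invented for illustration, and only the Decision column name is borrowed from the data set example in this README.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, feature, target="Decision"):
    """Entropy reduction obtained by splitting rows (a list of dicts) on one feature."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value the feature takes in each row
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    # Weight each group's entropy by its share of the rows
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - weighted


# Hypothetical toy data, invented for illustration only
rows = [
    {"safety": "high", "doors": "2", "Decision": "acc"},
    {"safety": "high", "doors": "4", "Decision": "acc"},
    {"safety": "low", "doors": "2", "Decision": "unacc"},
    {"safety": "low", "doors": "4", "Decision": "unacc"},
]

gains = {f: information_gain(rows, f) for f in ("safety", "doors")}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

In this toy example the safety feature separates the classes perfectly, so it has the highest gain and would be picked as the root of the tree.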
Run the decision.py file to start the program. To change the algorithm that runs, set the algorithm variable.
algorithm = "ID3" #Please set this variable to ID3, C4.5, CART or Regression
You can also apply random forest by setting the following variable to True.
enableRandomForest = False
Furthermore, you can apply gradient boosting regression trees.
enableGradientBoosting = True
Adaptive boosting (AdaBoost) can be enabled in the same way.
enableAdaboost = True
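Because the algorithm is selected with a plain string, a typo might go unnoticed until later in the run. The sketch below shows one possible way to check the value early. The check_algorithm helper and the SUPPORTED_ALGORITHMS set are hypothetical and are not part of decision.py; the accepted values are the ones listed in the comment next to the algorithm variable.

```python
# Values taken from the comment next to the algorithm variable
SUPPORTED_ALGORITHMS = {"ID3", "C4.5", "CART", "Regression"}


def check_algorithm(name):
    """Hypothetical helper: raise early if the algorithm name is not recognised."""
    if name not in SUPPORTED_ALGORITHMS:
        raise ValueError(
            "Unknown algorithm %r, expected one of %s"
            % (name, ", ".join(sorted(SUPPORTED_ALGORITHMS)))
        )
    return name


algorithm = check_algorithm("ID3")
```

The comparison is case sensitive, so a value such as "id3" would be rejected under this assumption.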
Finally, you can change the data set to build different decision trees. Pass the file name, and also the column names if the file does not contain a header row.
df = pd.read_csv("car.data"
#column names can either be defined in the source file or names parameter in read_csv command
,names=["buying","maint","doors","persons","lug_boot","safety","Decision"]
)
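To try the names parameter without downloading a file, the sketch below reads a small in-memory CSV with no header row. The two data rows are made up for illustration and only stand in for a file such as car.data; the column names are the ones used in the example above.

```python
import io

import pandas as pd

# Hypothetical rows standing in for a header-less file such as car.data
raw = io.StringIO(
    "vhigh,vhigh,2,2,small,low,unacc\n"
    "low,low,4,more,big,high,vgood\n"
)

columns = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "Decision"]

# When names is passed, pandas treats the first line as data rather than as a header
df = pd.read_csv(raw, names=columns)

print(df.shape)
print(df.columns[-1])
```

If the file already has a header row, the names argument can be dropped and pandas will read the column names from the first line.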
The pandas and NumPy Python libraries are used to load data sets in this repository. If you are using them for the first time, you can install them with the following commands.
pip install pandas
pip install numpy
To stay up to date, you can check the posts about decision trees on my blog.
This repository is licensed under the MIT License - see LICENSE for more details.