This repository is part of the **100 Days of ML Code Challenge**.
Day 0
July 6, 2018: Simple Linear Regression. Link to work: Sample Example
Day 1
July 7, 2018: Support Vector Regression. Link to work: Sample Example
Day 2
July 9, 2018: Multiple Regression. Link to work: Sample Example
Day 3
July 12, 2018: Logistic Regression. Link to work: Sample Example
Day 4
July 14, 2018: SVM. Link to work: Sample Example
Day 5
July 15, 2018: KNN. Link to work: Sample Example
Day 6
July 16, 2018: Kernel SVM. Link to work: Sample Example
Day 7
July 17, 2018: Naive Bayes. Link to work: Sample Example
Day 8
July 18, 2018: Decision Tree. Link to work: Sample Example
Day 9
July 19, 2018: Random Forest. Link to work: Sample Example
Day 10
July 21, 2018: K-means Clustering. Link to work: Sample Example
Day 11
July 22, 2018: Clustering. Link to work: Sample Example
Day 12
July 23, 2018: Association Rule Learning. Link to work: Sample Example
Day 13
Upper Confidence Bound. Link to work: Sample Example
Day 14
Thompson Sampling. Link to work: Sample Example
How do I know which classification model to choose for my problem?
As with regression models, you first need to figure out whether your problem is linear or non-linear.
If your problem is linear, you should go for Logistic Regression or SVM.
If your problem is non-linear, you should go for K-NN, Naive Bayes, Decision Tree or Random Forest.
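The notes above do not say how to tell whether a problem is linear. One rough check, offered here only as a sketch, is to compare a linear model with a non-linear one on held-out data: if the non-linear model is clearly better, the problem is probably non-linear. The example below uses plain Python so it runs without extra packages. The toy data set, the perceptron (standing in for a linear model) and the 1-nearest-neighbour classifier (standing in for K-NN) are all assumptions made for illustration, not code from this repository.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def make_circles(n):
    """Toy non-linear data: class 0 inside the unit disk, class 1 on a ring around it."""
    data = []
    for i in range(n):
        label = i % 2
        angle = random.uniform(0, 2 * math.pi)
        radius = random.uniform(0.0, 1.0) if label == 0 else random.uniform(2.5, 3.5)
        data.append(((radius * math.cos(angle), radius * math.sin(angle)), label))
    return data


def train_perceptron(train, epochs=50, lr=0.1):
    """Linear classifier: predicts 1 when w . x + b > 0."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in train:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - pred  # 0 when correct, +1 or -1 when wrong
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return lambda p: 1 if w[0] * p[0] + w[1] * p[1] + b > 0 else 0


def nearest_neighbour(train):
    """Non-linear classifier: copies the label of the closest training point."""
    def predict(p):
        closest = min(train, key=lambda item: math.dist(item[0], p))
        return closest[1]
    return predict


def accuracy(predict, test):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in test) / len(test)


data = make_circles(300)
train, test = data[:200], data[200:]

linear_acc = accuracy(train_perceptron(train), test)
knn_acc = accuracy(nearest_neighbour(train), test)
print(f"linear (perceptron): {linear_acc:.2f}")
print(f"non-linear (1-NN):   {knn_acc:.2f}")
```

On this ring-shaped data no straight line separates the classes, so the linear model falls well short of the nearest-neighbour model. On real data the gap is usually less clear-cut, and cross-validation would give a more reliable comparison than a single split.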
Then, from a business point of view, you would rather use:
- Logistic Regression or Naive Bayes when you want to rank your predictions by their probability, for example to rank your customers from the highest to the lowest probability of buying a certain product. That ranking lets you target your marketing campaigns. For this type of business problem, use Logistic Regression if your problem is linear and Naive Bayes if it is non-linear.
- SVM when you want to predict which segment your customers belong to. Segments can be of any kind, for example market segments you identified earlier with clustering.
- Decision Tree when you want a clear interpretation of your model results.
- Random Forest when you are just looking for high performance with less need for interpretation.
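To make the "rank your predictions by their probability" case concrete, here is a minimal sketch of scoring customers with a one-feature logistic regression and sorting them by predicted purchase probability. The training data, the customer names and the hand-written gradient descent are hypothetical and chosen only to keep the example dependency-free. The work in this repository may well use a library implementation instead.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical training data: number of past purchases -> bought the product (1) or not (0).
past_purchases = [0, 1, 2, 3, 4, 5, 6, 7]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

# Fit a one-feature logistic regression with plain gradient descent on the log-loss.
w, b, lr = 0.0, 0.0, 0.1
n = len(past_purchases)
for _ in range(5000):
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(past_purchases, bought)) / n
    grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(past_purchases, bought)) / n
    w -= lr * grad_w
    b -= lr * grad_b

# Hypothetical new customers to score (name -> number of past purchases).
customers = {"A": 1, "B": 6, "C": 3}
scores = {name: sigmoid(w * x + b) for name, x in customers.items()}

# Rank from highest to lowest predicted purchase probability.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, prob in ranked:
    print(f"customer {name}: {prob:.2f}")
```

The ranked list could then drive a campaign by contacting customers from the top down until the budget runs out. The same ranking step works with any classifier that outputs probabilities, including Naive Bayes.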
NOTE:
All algorithms are implemented using Python.