Add files via upload #570

Open · wants to merge 2 commits into master
48 changes: 48 additions & 0 deletions 03-classification/10-training-log-reg.md
@@ -20,6 +20,54 @@ This video was about training a logistic regression model with Scikit-Learn, app

The entire code of this project is available in [this jupyter notebook](https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/chapter-03-churn-prediction/03-churn.ipynb).

## Notes on the `solver` parameter of `LogisticRegression` in scikit-learn

A solver is the optimization algorithm used to find the coefficients (or weights) that minimize the loss function (log loss, in the case of logistic regression). The solver adjusts the model's parameters during training to fit the data as well as possible.
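For concreteness, here is a minimal sketch of where the parameter goes (the synthetic data and all hyperparameter values are illustrative, not taken from the course notebook):

```python
# Minimal illustration of the solver parameter (illustrative values only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# 'lbfgs' is the default solver in recent scikit-learn versions;
# raise max_iter if the solver does not converge.
model = LogisticRegression(solver='lbfgs', C=1.0, max_iter=1000)
model.fit(X, y)
print(model.coef_.shape)  # one weight per feature: (1, 20)
```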

Here are the main solvers:
1. **lbfgs** (Limited-memory Broyden–Fletcher–Goldfarb–Shanno)

   - Type: quasi-Newton method.
   - Use case: good for small to medium datasets; handles the multinomial loss (for multiclass classification).
   - Pros: efficient for problems with large numbers of classes.
   - Cons: may struggle with very large datasets.

2. **liblinear**

   - Type: coordinate descent algorithm.
   - Use case: small to medium datasets; works well for binary classification and supports L1 and L2 regularization.
   - Pros: suitable for smaller datasets and simpler models.
   - Cons: does not handle multinomial classification directly (one-vs-rest is used instead).

3. **saga** (an extension of SAG, the Stochastic Average Gradient method)

   - Type: variance-reduced variant of stochastic gradient descent (SGD).
   - Use case: best for large datasets, sparse data, and models with L1 (lasso) regularization.
   - Pros: scales well to large datasets; supports L1, L2, and elastic-net regularization.
   - Cons: typically slower than liblinear on smaller datasets.

4. **newton-cg** (Newton’s Conjugate Gradient)

   - Type: Newton’s method with conjugate-gradient optimization.
   - Use case: suitable for large datasets and the multinomial loss.
   - Pros: can handle large datasets and problems with many classes.
   - Cons: more computationally expensive than lbfgs and saga.

5. **sag** (Stochastic Average Gradient)

   - Type: variant of stochastic gradient descent (SGD).
   - Use case: suitable for large datasets and models with L2 regularization.
   - Pros: fast on large datasets.
   - Cons: only supports L2 regularization.

### How to choose a solver

- If you’re working with a large dataset, try saga or sag.
- If you need multiclass classification, lbfgs, saga, and newton-cg are good choices.
- For small datasets, liblinear is often sufficient.
- If you need L1 (or elastic-net) regularization, or your data is sparse, saga is recommended.
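As a hedged example of the last recommendation, this sketch (synthetic data, illustrative hyperparameters) fits saga with an L1 penalty on a sparse matrix:

```python
# Sketch: saga + L1 on sparse input (synthetic data, illustrative values).
import scipy.sparse as sp
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MaxAbsScaler

X, y = make_classification(n_samples=5000, n_features=100, random_state=1)
X_sparse = sp.csr_matrix(X)                        # saga accepts sparse input
X_scaled = MaxAbsScaler().fit_transform(X_sparse)  # scaling helps sag/saga converge; keeps sparsity

clf = LogisticRegression(solver='saga', penalty='l1', C=0.5, max_iter=5000)
clf.fit(X_scaled, y)
print((clf.coef_ != 0).sum(), "non-zero weights")  # L1 zeroes out some weights
```

Note how the L1 penalty produces a sparse coefficient vector; liblinear can do this too, but saga scales better as the dataset grows.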



<table>
<tr>
<td>⚠️</td>
47 changes: 47 additions & 0 deletions 04-evaluation/Solver for Logistic Regression.txt
@@ -0,0 +1,47 @@
In the context of Logistic Regression in scikit-learn, a solver refers to the optimization algorithm used to find the coefficients (or weights) that minimize the loss function (in this case, the log-loss function for logistic regression). Solvers are responsible for adjusting the model's parameters during training to fit the data as well as possible.

Different solvers have different approaches and trade-offs between speed, memory usage, and convergence behavior. In logistic regression, the solvers optimize the likelihood function to find the best-fit parameters.

Here are the main solvers available in scikit-learn for LogisticRegression (a short illustrative sketch follows the list):
1. lbfgs (Limited-memory Broyden–Fletcher–Goldfarb–Shanno)

Type: Quasi-Newton method.
Use case: Good for small to medium datasets, handles multinomial loss (for multiclass classification).
Pros: Efficient for problems with large numbers of classes.
Cons: May struggle with very large datasets.

2. liblinear

Type: Coordinate descent algorithm.
Use case: Useful for small to medium datasets; it works well with binary classification and supports L1 and L2 regularization.
Pros: Suitable for smaller datasets and simpler models.
Cons: Does not handle multinomial classification directly (one-vs-rest is used instead).

3. saga (an extension of SAG, the Stochastic Average Gradient method)

Type: Variance-reduced variant of stochastic gradient descent (SGD).
Use case: Best for large datasets, sparse data, and models with L1 (lasso) regularization.
Pros: Works well with large datasets, supports L1, L2, and elastic-net regularization.
Cons: Typically slower than liblinear on smaller datasets.

4. newton-cg (Newton’s Conjugate Gradient)

Type: Newton’s method with conjugate gradient optimization.
Use case: Suitable for large datasets and multinomial loss.
Pros: Can handle large datasets and problems with many classes.
Cons: More computationally expensive than lbfgs and saga.

5. sag (Stochastic Average Gradient)

Type: Variant of stochastic gradient descent (SGD).
Use case: Suitable for large datasets and models with L2 regularization.
Pros: Fast on large datasets.
Cons: Only supports L2 regularization.
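
As promised above, a short illustrative sketch (synthetic data; solver names as in recent scikit-learn versions) that fits the same model with each solver:

    # Illustrative only: compare the five solvers on the same synthetic data.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    for solver in ['lbfgs', 'liblinear', 'saga', 'newton-cg', 'sag']:
        # the default L2 penalty is supported by all five solvers
        clf = LogisticRegression(solver=solver, max_iter=5000)
        clf.fit(X, y)
        print(solver, round(clf.score(X, y), 3))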

How to Choose a Solver:

If you’re working with a large dataset, try saga or sag.
If you need multiclass classification, lbfgs, saga, or newton-cg are good choices.
For small datasets, liblinear is often sufficient.
If you need L1 regularization, or your data is sparse, saga is recommended.
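
For the multiclass recommendation, a minimal sketch (3-class synthetic data; all values are illustrative):

    # lbfgs, saga and newton-cg optimize the multinomial loss directly.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=1500, n_features=10, n_informative=6,
                               n_classes=3, random_state=0)
    clf = LogisticRegression(solver='lbfgs', max_iter=2000)
    clf.fit(X, y)
    print(clf.coef_.shape)  # (3, 10): one weight vector per class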