3.6 Feature importance: Mutual information

Notes

Mutual information is a concept from information theory that measures how much we can learn about one variable if we know the value of another. In this project, it tells us how much we learn about churn from the value of a particular feature. It therefore serves as a measure of the importance of a categorical variable: the higher the mutual information, the more the feature tells us about churn.
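To make the idea concrete, here is a minimal sketch that computes mutual information directly from the joint and marginal frequencies and compares it with Scikit-Learn's `mutual_info_score`. The toy `churn`, `contract`, and `gender` values and the helper `mutual_info_by_hand` are made up for illustration and are not the course's dataset or code.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mutual_info_score

# Hypothetical toy data (not the course's churn dataset)
churn = pd.Series([1, 1, 0, 0, 1, 0])
contract = pd.Series(["monthly", "monthly", "yearly", "yearly", "monthly", "yearly"])
gender = pd.Series(["f", "m", "f", "m", "m", "f"])


def mutual_info_by_hand(a, b):
    """Sum of p(a, b) * log(p(a, b) / (p(a) * p(b))) over observed pairs, in nats."""
    joint = pd.crosstab(a, b, normalize=True).to_numpy()  # joint probabilities
    p_a = joint.sum(axis=1, keepdims=True)                # marginal of a (column vector)
    p_b = joint.sum(axis=0, keepdims=True)                # marginal of b (row vector)
    independent = p_a @ p_b                               # what the joint would be under independence
    observed = joint > 0                                  # skip empty cells to avoid log(0)
    return float((joint[observed] * np.log(joint[observed] / independent[observed])).sum())


mi_contract = mutual_info_score(churn, contract)
mi_gender = mutual_info_score(churn, gender)

print("contract:", mi_contract, mutual_info_by_hand(churn, contract))
print("gender:  ", mi_gender, mutual_info_by_hand(churn, gender))
```

In this toy data `contract` determines `churn` exactly, so its score is much higher than that of `gender`, which is only weakly related to churn.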

Classes, functions, and methods:

mutual_info_score(x, y) - Scikit-Learn function for calculating the mutual information between the x target variable and the y feature. The score is symmetric, so the order of the arguments does not change the result.
df[x].apply(y) - apply the y function to the x series of the df dataframe.
df.sort_values(ascending=False).to_frame(name='x') - sort the values in descending order and convert the result to a dataframe whose column is named x.
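The three calls above can be combined into a ranking of features, sketched below under stated assumptions: the toy dataframe, the `categorical` list, the helper name `mutual_info_churn_score`, and the column name `MI` are all hypothetical and chosen only for illustration.

```python
import pandas as pd
from sklearn.metrics import mutual_info_score

# Hypothetical toy data (not the course's churn dataset)
df = pd.DataFrame({
    "contract": ["monthly", "monthly", "yearly", "yearly", "monthly", "yearly"],
    "gender": ["f", "m", "f", "m", "m", "f"],
    "churn": [1, 1, 0, 0, 1, 0],
})

# Assumed list of categorical feature columns
categorical = ["contract", "gender"]


def mutual_info_churn_score(series):
    # Mutual information between one feature column and the churn target
    return mutual_info_score(series, df["churn"])


# Apply the helper to every categorical column; the result is a Series
# indexed by feature name
mi = df[categorical].apply(mutual_info_churn_score)

# Sort from most to least informative and turn the Series into a dataframe
mi_table = mi.sort_values(ascending=False).to_frame(name="MI")
print(mi_table)
```

Features at the top of `mi_table` carry the most information about churn in this toy data.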

The entire code of this project is available in this Jupyter notebook.

⚠️	The notes are written by the community. If you see an error here, please create a PR with a fix.

Notes from Peter Ernicke

Navigation

Machine Learning Zoomcamp course
Session 3: Machine Learning for Classification
Previous: Feature importance: Churn rate and risk ratio
Next: Feature importance: Correlation
