Implements sample_weight and optional permutation and SHAP importance, categorical features, boxplot #100

Open
wants to merge 24 commits into
base: master
from

ThomasBury

Hi,

It took me a while, but I finally found the time to follow up on the discussion in #77.

In summary:

  • No new hard dependencies are introduced: an import check is performed only if the user wants SHAP importance or the matplotlib boxplot
  • Permutation importance (sklearn) is also implemented, but it is optional and easy to switch off
  • Categorical features, if any, are encoded (optional)
  • sample_weight can now be passed to the fit method
  • A notebook illustrates the changes and compares the original Boruta_py with the new features
  • A note is added to the README
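
The optional-dependency check from the first bullet could look roughly like the sketch below. It is only an illustration, not the code in this PR: the helper names `require_optional` and `importance_backend` and the `kind` values are hypothetical. It relies solely on the standard library's `importlib`.

```python
import importlib
import importlib.util


def require_optional(module_name: str, purpose: str):
    """Import an optional package on demand, or fail with a clear message.

    Hypothetical helper for illustration; not taken from the PR.
    """
    # find_spec returns None when a top-level package is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for {purpose} but is not installed. "
            "Install it, or pick an option that does not need it."
        )
    return importlib.import_module(module_name)


def importance_backend(kind: str = "native"):
    """Return the extra module needed by the requested importance type.

    The 'kind' values are assumptions made for this sketch.
    """
    if kind == "shap":
        # Only users who ask for SHAP importance need the shap package.
        return require_optional("shap", "SHAP importance")
    # Native importance needs no extra package.
    return None
```

With this pattern, users who never request SHAP importance or the boxplot are not affected by a missing package; the error is raised only at the point where the optional feature is actually requested.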

@danielhomola
Collaborator

Hi Thomas,

Thanks for this! I will try to find time in the next few weeks to go through it (it's quite a lot). To start with, however, can we make sure that no .idea and .ipython-checkpoints files are committed? Thanks!

@ThomasBury
Author

> Hi Thomas,
>
> Thanks for this! I will try to find time in the next few weeks to go through it (it's quite a lot). To start with, however, can we make sure that no .idea and .ipython-checkpoints files are committed? Thanks!

Sorry, those were leftovers from the previous ignore file; the .idea and checkpoint files have been removed. I hope the notebook will be helpful. Do not hesitate to ask if you have any questions or remarks.

Thanks

@erikvdp
Contributor

erikvdp commented Nov 22, 2021

This seems like a really cool PR!
Is there any chance that it will get merged soon?

@ThomasBury
Author

> This seems like a really cool PR! Is there any chance that it will get merged soon?

Thanks @erikvdp. Meanwhile, you might have a look at https://github.com/ThomasBury/arfs, which implements those features and more (although I still think it would be best to integrate the Boruta-related changes into the official boruta_py ^^).

@MauritsDescamps

Any chance this will be merged? I would really like to try out Boruta with SHAP feature importance.

@ThomasBury
Author

> Any chance this will be merged? Would really like to try out Boruta with Shap feature importance

Hi @MauritsDescamps, I built the ARFS package to provide those features for Boruta (and much more). In the ARFS package, you'll find 3 different methods for performing all-relevant feature selection. I called the evolution of Boruta "Leshy", and it provides the features of this PR. There are notebooks that explain step by step how to use it and what the differences are.

You can try it by simply running pip install -U arfs; there is a brand-new release (version 1.0.2).


4 participants