Feature/shap utils #391
base: main
…rocedural approach.
…rimental searchspace representation.
…d plotting methods.
Hi @Alex6022, awesome that you gave it a shot 🎖️ I just returned from my vacation on the weekend. Let me have a thorough look at the code, exchange with @Scienfitz, and then we'll share with you our consolidated thoughts 👍🏼
…into feature/shap-utils
…into feature/shap-utils
Sorry for taking so long, busy days :) but we see your contribution 👍
I'm leaving some initial comments here (and there's also something for you in #357). This is only a first look: I'll discuss the structure question with the others, and more requests, especially regarding the main file, will definitely come later.
@@ -6,6 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]
### Added
- Added SHAP analysis within the new `diagnostics` package.
Suggested change:
- Added SHAP analysis within the new `diagnostics` package.
becomes
- `diagnostics` dependency group
- SHAP explanations
@@ -296,6 +296,7 @@ The available groups are:
- `mypy`: Required for static type checking.
- `onnx`: Required for using custom surrogate models in [ONNX format](https://onnx.ai).
- `polars`: Required for optimized search space construction via [Polars](https://docs.pola.rs/)
- `diagnostics`: Required for feature importance ranking via [SHAP](https://shap.readthedocs.io/)
Suggested change:
- `diagnostics`: Required for feature importance ranking via [SHAP](https://shap.readthedocs.io/)
becomes
- `diagnostics`: Required for built-in model and campaign analysis, e.g. [SHAP](https://shap.readthedocs.io/)
@@ -10,6 +10,8 @@ addopts =
--ignore=baybe/_optional
--ignore=baybe/utils/chemistry.py
--ignore=tests/simulate_telemetry.py
--ignore=baybe/utils/diagnostics.py
--ignore=tests/utils/test_diagnostics.py
Seems you're ignoring the tests you created? Is this on purpose?
baybe/utils/diagnostics.py
The `diagnostics` module should not be a submodule of `utils`. Can you move it to its own folder named `diagnostics`? I suggest renaming this file to `shap`. There also needs to be an `__init__` that imports the important objects to enable user-friendly imports a la `from baybe.diagnostics import explainer` while keeping the code structured into potentially separate files/blocks. For a recipe, take a look at `parameters`.
So we'd have
baybe/diagnostics
baybe/diagnostics/shap.py
baybe/diagnostics/__init__.py
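To make the layout request concrete, here is a minimal sketch of the re-export pattern. It builds a throwaway package in a temporary directory, so the names `demo_pkg` and the dummy `explainer` body are purely hypothetical stand-ins and not the actual BayBE code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in package, created on the fly for illustration only.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg" / "diagnostics"
pkg.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")

# diagnostics/shap.py holds the implementation (a dummy function here).
(pkg / "shap.py").write_text(
    "def explainer(data):\n"
    "    return f'explainer for {len(data)} rows'\n"
)

# diagnostics/__init__.py re-exports the public objects from the submodule.
(pkg / "__init__.py").write_text(
    "from .shap import explainer\n"
    "__all__ = ['explainer']\n"
)

# Make the temporary package importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Users import from the package, not from the file that defines the object.
from demo_pkg.diagnostics import explainer

result = explainer([1, 2, 3])
print(result)
```

With this structure the implementation can later be split across several files without changing the import path users rely on.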
Dear @Scienfitz and @AdrianSosic,
As you previously offered in PR #335, I have taken a shot at integrating SHAP analysis amongst other explainers provided by the SHAP package. Similar to the Polars integration, it is provided as an optional dependency. As suggested, I have included tests, ensured it works with hybrid spaces and molecular encodings, implemented it as another utility and decoupled the computation and plotting methods. It also uses the exposed surrogate implemented in PR #355.
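For readers unfamiliar with the optional-dependency arrangement mentioned above, one common way to guard such a feature looks roughly like this. It is a sketch under assumptions: the flag name `SHAP_INSTALLED`, the helper `require_shap`, and the install hint in the error message are hypothetical and may differ from what the PR actually does.

```python
from importlib.util import find_spec

# True only if the optional `shap` package can be found in this environment.
SHAP_INSTALLED = find_spec("shap") is not None


def require_shap() -> None:
    """Raise a helpful error if the optional dependency is missing."""
    if not SHAP_INSTALLED:
        raise ImportError(
            "SHAP functionality requires the optional `diagnostics` group. "
            # The exact install command below is an assumption.
            "Try: pip install 'baybe[diagnostics]'"
        )


print(f"shap available: {SHAP_INSTALLED}")
```

Calling `require_shap()` at the top of each public function keeps the rest of the package importable when SHAP is not installed.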
The Explanation object provided by the SHAP package can be created through
shap_kern = explanation(campaign)
This object can then be passed to the plotting functions wrapped from the SHAP package:
plot_beeswarm(shap_kern)
Additionally, this implementation allows users to select whether to use the computational or experimental representation of the search space. E.g., while the experimental representation can give a good overview of the important parameters, the computational representation can give more advanced users the option to understand which encodings specifically predict the target.
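To illustrate the difference between the two representations, here is a toy sketch. It does not use the SHAP package or BayBE: it computes exact Shapley values by brute force for a made-up linear model, and the parameter names, the one-hot `encode` helper, and the coefficients are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`."""
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance value, the rest the baseline.
        point = {k: instance[k] if k in subset else baseline[k] for k in names}
        return predict(point)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                chosen = set(subset)
                total += weight * (value(chosen | {name}) - value(chosen))
        result[name] = total
    return result


def model(comp):
    # Toy surrogate acting on the computational (encoded) representation.
    return (
        0.5 * comp["temperature"]
        + 1.0 * comp["solvent_water"]
        + 4.0 * comp["solvent_ethanol"]
    )


def encode(exp):
    # One-hot encoding from experimental to computational representation.
    return {
        "temperature": exp["temperature"],
        "solvent_water": 1.0 if exp["solvent"] == "water" else 0.0,
        "solvent_ethanol": 1.0 if exp["solvent"] == "ethanol" else 0.0,
    }


instance = {"temperature": 10.0, "solvent": "ethanol"}
baseline = {"temperature": 0.0, "solvent": "water"}

# Experimental view: one attribution per parameter.
experimental = shapley_values(lambda p: model(encode(p)), instance, baseline)
# Computational view: one attribution per encoded column.
computational = shapley_values(model, encode(instance), encode(baseline))

print(experimental)
print(computational)
```

In this toy case the single `solvent` attribution in the experimental view is split across the `solvent_water` and `solvent_ethanol` columns in the computational view, which is the kind of extra detail the computational representation exposes. Note that in the computational view some evaluated points are not valid encodings (e.g. both one-hot columns set to zero).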
Please let me know if you have any further input for improving this. Looking forward to it!