Feature/shap utils #391

Open
wants to merge 21 commits into
base: main

@Alex6022 Alex6022 commented Oct 4, 2024

Dear @Scienfitz and @AdrianSosic,

As you previously offered in PR #335, I have taken a shot at integrating SHAP analysis amongst other explainers provided by the SHAP package. Similar to the Polars integration, it is provided as an optional dependency. As suggested, I have included tests, ensured it works with hybrid spaces and molecular encodings, implemented it as another utility and decoupled the computation and plotting methods. It also uses the exposed surrogate implemented in PR #355.

The Explanation object provided by the SHAP package can be created through
shap_kern = explanation(campaign)

This object can then be passed to the plotting functions wrapped from the SHAP package:
plot_beeswarm(shap_kern)
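
For readers unfamiliar with what the values inside such an Explanation object represent, here is a minimal sketch of exact Shapley values for a toy model. It is not the SHAP package's implementation and not the `explanation` function from this PR; `shapley_values`, `toy_model`, and the all-zero baseline are hypothetical names and choices made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`, relative to `baseline`."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, all others from baseline.
        point = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(point)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(p):
    # Hypothetical surrogate: linear in p[0], interaction between p[1] and p[2].
    return 2.0 * p[0] + p[1] * p[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The attributions in `phi` sum to `toy_model(x) - toy_model(baseline)`, and the interaction term is split equally between the two features involved. Exact enumeration scales exponentially with the number of features, which is why approximate explainers are used in practice.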

Additionally, this implementation allows users to select whether to use the computational or the experimental representation of the search space. For example, the experimental representation gives a good overview of the important parameters:
image

The computational representation, in turn, gives more advanced users the option to understand which encodings specifically predict the target:
image
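
To illustrate the difference between the two representations, here is a small sketch that maps one experimental row to a computational row. One-hot encoding is only an assumption for illustration; the parameter names, the `to_computational` helper, and the category list are hypothetical, and a molecular encoding would produce descriptor columns instead.

```python
def to_computational(row, categories):
    """One-hot encode categorical entries; pass numeric entries through."""
    encoded = {}
    for name, value in row.items():
        if name in categories:
            # One column per category level, named "<parameter>_<level>".
            for level in categories[name]:
                encoded[f"{name}_{level}"] = 1.0 if value == level else 0.0
        else:
            encoded[name] = float(value)
    return encoded


# Experimental representation: one column per parameter, human-readable values.
experimental_row = {"solvent": "water", "temperature": 50}
categories = {"solvent": ["water", "ethanol", "toluene"]}

# Computational representation: one column per encoded feature.
computational_row = to_computational(experimental_row, categories)
```

An explanation computed on the experimental representation attributes importance to `solvent` as a whole, whereas one computed on the computational representation attributes it to each encoded column separately.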

Please let me know if you have any further input for improving this. Looking forward to it!

@AdrianSosic
Collaborator

Hi @Alex6022, awesome that you gave it a shot 🎖️ I just returned from my vacation on the weekend. Let me have a thorough look at the code, exchange with @Scienfitz, and then we'll share with you our consolidated thoughts 👍🏼

@AdrianSosic AdrianSosic mentioned this pull request Oct 7, 2024
Collaborator

@Scienfitz Scienfitz left a comment

Sorry for taking so long, busy days :) but we see your contribution 👍
I'm leaving some initial comments here now that I've taken a first look (there's also something for you in #357). I'll discuss the structure question with the others; more requests, especially regarding the main file, will definitely come later.

@@ -6,6 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]
### Added
- Added SHAP analysis within the new `diagnostics` package.

Suggested change
- Added SHAP analysis within the new `diagnostics` package.
- `diagnostics` dependency group
- SHAP explanations

@@ -296,6 +296,7 @@ The available groups are:
- `mypy`: Required for static type checking.
- `onnx`: Required for using custom surrogate models in [ONNX format](https://onnx.ai).
- `polars`: Required for optimized search space construction via [Polars](https://docs.pola.rs/)
- `diagnostics`: Required for feature importance ranking via [SHAP](https://shap.readthedocs.io/)

Suggested change
- `diagnostics`: Required for feature importance ranking via [SHAP](https://shap.readthedocs.io/)
- `diagnostics`: Required for built-in model and campaign analysis, e.g. [SHAP](https://shap.readthedocs.io/)

@@ -10,6 +10,8 @@ addopts =
--ignore=baybe/_optional
--ignore=baybe/utils/chemistry.py
--ignore=tests/simulate_telemetry.py
--ignore=baybe/utils/diagnostics.py
--ignore=tests/utils/test_diagnostics.py

Seems you're ignoring the tests you created? Is this on purpose?
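
If the ignore entries exist because `shap` is an optional dependency, one possible alternative is to let the test module skip itself when the package is missing. The following is a minimal sketch under that assumption, using only the standard library; the class and test names are hypothetical and not taken from this PR.

```python
import importlib.util
import unittest

# True only when the optional `shap` package can be found in this environment.
SHAP_INSTALLED = importlib.util.find_spec("shap") is not None


@unittest.skipUnless(SHAP_INSTALLED, "optional dependency 'shap' is not installed")
class TestShapDiagnostics(unittest.TestCase):
    def test_shap_exposes_explanation(self):
        # Imported lazily so that collecting this module never fails without shap.
        import shap

        self.assertTrue(hasattr(shap, "Explanation"))
```

With pytest, `pytest.importorskip("shap")` at the top of the test module achieves a similar effect, so the tests run wherever the dependency group is installed and are reported as skipped elsewhere.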

The diagnostics module should not be a submodule of `utils`. Can you move it to its own folder named `diagnostics`? I suggest renaming this file to `shap`. There also needs to be an `__init__` that imports the important objects, enabling user-friendly imports à la `from baybe.diagnostics import explainer` while keeping the code structured into potentially separate files/blocks. For a recipe, take a look at `parameters`.

So we'd have

  • baybe/diagnostics
  • baybe/diagnostics/shap.py
  • baybe/diagnostics/__init__.py
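
The re-export pattern behind this layout can be sketched as follows. To keep the example self-contained, it builds a throwaway package named `demo_pkg` in a temporary directory instead of touching `baybe`; the package name and the placeholder body of `explainer` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package mirroring the proposed layout (names are illustrative).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg" / "diagnostics"
pkg.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")

# diagnostics/shap.py holds the implementation (placeholder body).
(pkg / "shap.py").write_text(
    "def explainer(campaign):\n"
    "    return f'explainer for {campaign}'\n"
)

# diagnostics/__init__.py re-exports the public objects.
(pkg / "__init__.py").write_text(
    "from demo_pkg.diagnostics.shap import explainer\n"
    "\n"
    "__all__ = ['explainer']\n"
)

# Make the temporary package importable and use the short import path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
from demo_pkg.diagnostics import explainer

result = explainer("my_campaign")
```

Users then import from the package itself, while the implementation can later be split across several files without changing the public import path.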
