Feature/shap utils #391

Alex6022 · 2024-10-04T12:35:59Z

As you previously offered in PR #335, I have taken a shot at integrating SHAP analysis amongst other explainers provided by the SHAP package. Similar to the Polars integration, it is provided as an optional dependency. As suggested, I have included tests, ensured it works with hybrid spaces and molecular encodings, implemented it as another utility and decoupled the computation and plotting methods. It also uses the exposed surrogate implemented in PR #355.
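A minimal sketch of how an optional dependency like this is commonly guarded, assuming only the standard library. The flag name `SHAP_INSTALLED` and the helper `require_shap` are hypothetical and not taken from the PR; the actual import flag in the code base may look different.

```python
import importlib.util

# Hypothetical flag name; the PR's actual flag may be named differently.
# find_spec returns None when the top-level package cannot be found.
SHAP_INSTALLED = importlib.util.find_spec("shap") is not None


def require_shap() -> None:
    """Raise a helpful error if the optional SHAP dependency is missing."""
    if not SHAP_INSTALLED:
        raise ImportError(
            "SHAP functionality requires the optional 'shap' package. "
            "Install it, e.g. via the 'diagnostics' dependency group."
        )
```

Calling such a guard at the top of each SHAP-dependent function keeps the rest of the package importable when the extra is not installed.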

The Explanation object provided by the SHAP package can be created through
shap_kern = explanation(campaign)

This object can then be passed to the plotting functions wrapped from the SHAP package:
plot_beeswarm(shap_kern)
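For readers unfamiliar with what such an explanation contains, here is a pure-Python sketch of exact Shapley values for a toy model. It is not the PR's implementation, which delegates to the SHAP package; `exact_shapley` and `toy_model` are hypothetical names, and the toy model merely stands in for a surrogate's prediction. Kernel-based explainers approximate these values by sampling rather than enumerating every coalition.

```python
from itertools import combinations
from math import factorial


def exact_shapley(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features absent from a coalition are replaced by their baseline value.
    Cost grows as 2**n, so this is only feasible for a handful of features.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        point = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(point)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(p):
    # Hypothetical stand-in for a surrogate's mean prediction.
    return 2.0 * p[0] + p[1] * p[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that the beeswarm and related plots rely on.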

Additionally, this implementation allows users to select whether to use the computational or the experimental representation of the search space. For example, the experimental representation can give a good overview of the important parameters, while the computational representation gives more advanced users the option to understand which encodings specifically predict the target.
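To illustrate the difference between the two representations, here is a toy sketch assuming a search space with one categorical and one numerical parameter. The parameter names and the one-hot encoding are hypothetical; BayBE's actual encodings (for example molecular descriptors) are richer than this.

```python
# Hypothetical search space: one categorical and one numerical parameter.
SOLVENTS = ["water", "ethanol", "toluene"]


def to_computational(row: dict) -> dict:
    """One-hot encode the categorical parameter; pass numerical ones through."""
    comp = {f"Solvent_{s}": float(row["Solvent"] == s) for s in SOLVENTS}
    comp["Temperature"] = float(row["Temperature"])
    return comp


# Experimental representation: one entry per parameter.
experimental_row = {"Solvent": "ethanol", "Temperature": 80}
# Computational representation: one entry per encoded column.
computational_row = to_computational(experimental_row)
```

An explanation computed on the experimental representation would carry one attribution per parameter (two here), whereas one computed on the computational representation would carry one per encoded column (four here).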

Please let me know if you have any further input for improving this. Looking forward to it!

AdrianSosic · 2024-10-07T12:20:29Z

Hi @Alex6022, awesome that you gave it a shot 🎖️ I just returned from my vacation on the weekend. Let me have a thorough look at the code, exchange with @Scienfitz, and then we'll share with you our consolidated thoughts 👍🏼

Scienfitz

Sorry for taking so long, busy days :) But we see your contribution 👍
I'm leaving some initial comments here (there's also something for you in #357). Now that I've taken a first look, I'll discuss the structure question with the others; more requests, especially regarding the main file, will definitely come later.

Scienfitz · 2024-11-01T11:28:50Z

CHANGELOG.md

@@ -6,6 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]
 ### Added
+- Added SHAP analysis within the new `diagnostics` package.


Suggested change

- Added SHAP analysis within the new `diagnostics` package.

- `diagnostics` dependency group

- SHAP explanations

Scienfitz · 2024-11-01T11:30:47Z

README.md

@@ -296,6 +296,7 @@ The available groups are:
 - `mypy`: Required for static type checking.
 - `onnx`: Required for using custom surrogate models in [ONNX format](https://onnx.ai).
 - `polars`: Required for optimized search space construction via [Polars](https://docs.pola.rs/)
+- `diagnostics`: Required for feature importance ranking via [SHAP](https://shap.readthedocs.io/)


Suggested change

- `diagnostics`: Required for feature importance ranking via [SHAP](https://shap.readthedocs.io/)

- `diagnostics`: Required for built-in model and campaign analysis, e.g. [SHAP](https://shap.readthedocs.io/)

Scienfitz · 2024-11-01T11:31:43Z

pytest.ini

@@ -10,6 +10,8 @@ addopts =
    --ignore=baybe/_optional
    --ignore=baybe/utils/chemistry.py
    --ignore=tests/simulate_telemetry.py
+    --ignore=baybe/utils/diagnostics.py
+    --ignore=tests/utils/test_diagnostics.py


It seems you're ignoring the tests you created? Is this on purpose?

Scienfitz · 2024-11-01T15:59:04Z

baybe/utils/diagnostics.py

The diagnostics module should not be a submodule of `utils`. Can you move it to its own folder named `diagnostics`? I suggest renaming this file to `shap`. There also needs to be an `__init__` that imports the important objects to enable user-friendly imports à la `from baybe.diagnostics import explainer`, while keeping the code structured into potentially separate files/blocks. For a recipe, take a look at `parameters`.

So we'd have

baybe/diagnostics

baybe/diagnostics/shap.py

baybe/diagnostics/__init__.py
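The re-export pattern behind this layout can be sketched with a throwaway package. This is only an illustration under stated assumptions: `demo_pkg` stands in for `baybe`, and `explainer` is a placeholder function, not the PR's code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package mirroring the proposed layout. "demo_pkg" stands in
# for baybe, and `explainer` is a placeholder function, not the PR's code.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg" / "diagnostics"
pkg.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")
(pkg / "shap.py").write_text(
    "def explainer(campaign):\n"
    "    return f'explainer for {campaign}'\n"
)
# The subpackage __init__ re-exports the public names from its submodules.
(pkg / "__init__.py").write_text(
    "from demo_pkg.diagnostics.shap import explainer\n"
    "\n"
    "__all__ = ['explainer']\n"
)

# Make the temporary package importable for this demonstration.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_pkg.diagnostics import explainer  # short, user-friendly import path

result = explainer("my_campaign")
```

Because Python 3 uses absolute imports, a submodule named `shap.py` inside the package can still `import shap` and get the third-party package, as long as the file is imported as part of the package rather than run as a script.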

Alex6022 and others added 10 commits September 29, 2024 20:26

Optional import of shap package.

0c8e945

1st implementation of SHAP utilities in experimental space and with procedural approach.

cbe2e82

Implementation option to perform SHAP either in computational or experimental searchspace representation.

2597fd4

SHAP package implementation in diagnostics utility, complete tests and plotting methods.

bc1203e

Tests for explainer utilities and generalization for all explainers in SHAP package.

b348b46

Implemented plotting with non-shap attributions.

ae20322

Refactored diagnostics test and optimized handling of maple explainers.

de9d1e9

Shortened plotting method names.

e183957

Merge branch 'emdgroup:main' into feature/shap-utils

85fb9ba

Cleanup for PR

c389ac1

Alex6022 requested review from Scienfitz, AdrianSosic and AVHopp as code owners October 4, 2024 12:36

This was referenced Oct 4, 2024

Upcoming Diagnostics Package #357

Open

Shapley values #335

Closed

AdrianSosic mentioned this pull request Oct 7, 2024

Expose surrogate #355

Merged

Alex6022 and others added 11 commits October 23, 2024 22:57

Renamed diangostics package, enabled optional shap import

55e723c

Merge branch 'emdgroup:main' into feature/shap-utils

1922467

Refactoring of test_diagnostics.py

ee57008

Merge branch 'emdgroup:main' into feature/shap-utils

11b61d1

Merge branch 'feature/shap-utils' of https://github.com/Alex6022/baybe into feature/shap-utils

08a4c1e

Merge branch 'feature/shap-utils' of https://github.com/Alex6022/baybe into feature/shap-utils

50846f7

Fixed changelog merging error

103a5f7

Update pyproject.toml

ffda991

Rework import flag

eaa5c38

Update mypy.ini

4ca9ffd

Rework tests

9fddbdd

Scienfitz requested changes Nov 1, 2024
