API for Shapley value estimation #249

Open
popovstefan opened this issue Oct 7, 2022 · 3 comments

@popovstefan

I have a project where I would like to use a LightGBM model trained in Python to predict feature contributions (Shapley values), in the same manner as answered in this StackOverflow question:

Is this possible in the current version of this library?
I have gone through the documentation and various JPMML tutorials, but I couldn't figure out how to do that. I have successfully trained, converted, and deployed a model in a Java app, but with it I can only predict probabilities (simple model inference).

@vruusmann
Member

> Is this possible in the current version of this library?

Shapley values are a model evaluation-time phenomenon, not a model training-time or conversion-time phenomenon.

Therefore, the JPMML-LightGBM library needs no changes in this area.

Moving this issue to a more appropriate location.

@vruusmann vruusmann transferred this issue from jpmml/jpmml-lightgbm Oct 8, 2022
@vruusmann vruusmann changed the title Support for Shapley value prediction in LightGBM API for Shapley value estimation Oct 8, 2022
@vruusmann
Member

There is a related project, which performs simple feature impact analysis with various tree ensemble methods (boosting, bagging):
https://github.com/vruusmann/rf_feature_impact

What's the canonical algorithm for estimating Shapley values?
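For reference, the textbook definition of Shapley values can be computed exactly by enumerating every feature subset. The sketch below is only an illustration of that definition, not something JPMML provides: the class name `ExactShapley`, the toy model, and the convention of replacing "absent" features with a baseline value are all assumptions. It is exponential in the number of features, which is why tree-specific estimators (such as the one behind shap's TreeExplainer) exist.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

/** Exact Shapley values by brute-force subset enumeration (hypothetical sketch). */
final class ExactShapley {

    /**
     * @param model    any scoring function over a feature vector (assumption: numeric features only)
     * @param x        the instance to explain
     * @param baseline values substituted for features that are "absent" from a coalition
     */
    static double[] shapleyValues(ToDoubleFunction<double[]> model, double[] x, double[] baseline) {
        int n = x.length;
        int subsets = 1 << n;

        // Evaluate the model once per coalition; a coalition is encoded as a bitmask.
        double[] value = new double[subsets];
        for (int mask = 0; mask < subsets; mask++) {
            double[] input = baseline.clone();
            for (int j = 0; j < n; j++) {
                if ((mask & (1 << j)) != 0) {
                    input[j] = x[j];
                }
            }
            value[mask] = model.applyAsDouble(input);
        }

        // Factorials for the Shapley weight |S|! (n - |S| - 1)! / n!
        double[] fact = new double[n + 1];
        fact[0] = 1.0;
        for (int k = 1; k <= n; k++) {
            fact[k] = fact[k - 1] * k;
        }

        double[] phi = new double[n];
        for (int i = 0; i < n; i++) {
            int bit = 1 << i;
            for (int mask = 0; mask < subsets; mask++) {
                if ((mask & bit) != 0) {
                    continue; // only coalitions that do not yet contain feature i
                }
                int size = Integer.bitCount(mask);
                double weight = fact[size] * fact[n - size - 1] / fact[n];
                phi[i] += weight * (value[mask | bit] - value[mask]);
            }
        }
        return phi;
    }

    public static void main(String[] args) {
        // Hypothetical toy model: f(a, b, c) = 2a + b*c
        ToDoubleFunction<double[]> model = v -> 2 * v[0] + v[1] * v[2];
        double[] phi = shapleyValues(model, new double[] {1, 2, 3}, new double[] {0, 0, 0});
        System.out.println(Arrays.toString(phi));
    }
}
```

For the toy model the contributions come out as 2, 3 and 3 (up to floating-point rounding): the additive term goes entirely to the first feature, the interaction term is split evenly between the other two, and the total equals f(x) minus f(baseline).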

Ideally, the predicted value of the target field could implement some marker interface(s), which would trigger the computation of Shapley values in situ. The Pythonic approach, where every prediction aspect (e.g. predict, predict_proba, shap) involves running the whole prediction again from scratch, seems kind of wasteful.
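A rough sketch of what such a marker interface might look like, purely as an illustration of the idea. Everything here is hypothetical: `HasShapleyValues`, `ExplainedPrediction`, and the placeholder numbers are made up for this example and are not part of any JPMML API. The point is that the caller checks the returned target value with `instanceof`, and the contributions are computed lazily, at most once, without a second prediction pass.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch of a target value that can explain itself on demand. */
final class MarkerInterfaceSketch {

    /** Hypothetical marker interface; not an existing library type. */
    interface HasShapleyValues {
        Map<String, Double> getShapleyValues();
    }

    /** Hypothetical prediction result that defers the contribution computation. */
    static final class ExplainedPrediction implements HasShapleyValues {
        private final double value;
        private final Supplier<Map<String, Double>> supplier;
        private Map<String, Double> contributions; // filled in on first access

        ExplainedPrediction(double value, Supplier<Map<String, Double>> supplier) {
            this.value = value;
            this.supplier = supplier;
        }

        double getValue() {
            return value;
        }

        @Override
        public Map<String, Double> getShapleyValues() {
            if (contributions == null) {
                // The supplier would reuse state captured during evaluation
                // (e.g. the tree paths already taken), so nothing is re-predicted.
                contributions = supplier.get();
            }
            return contributions;
        }
    }

    public static void main(String[] args) {
        // Placeholder numbers, for illustration only.
        Object targetValue = new ExplainedPrediction(0.8, () -> {
            Map<String, Double> result = new LinkedHashMap<>();
            result.put("age", 0.25);
            result.put("income", 0.05);
            return result;
        });

        // Callers that do not care about explanations simply ignore the interface.
        if (targetValue instanceof HasShapleyValues) {
            HasShapleyValues explained = (HasShapleyValues) targetValue;
            System.out.println(explained.getShapleyValues());
        }
    }
}
```

Callers that only need the predicted value pay nothing extra, since the supplier is never invoked unless `getShapleyValues()` is called.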

@04pallav

04pallav commented Sep 6, 2024

@vruusmann If there is a PMML file (.xml) containing a preprocessor plus a model, is there a way to use it to produce only the preprocessed data and not the final prediction? (That is, only apply the transforms, similar to an sklearn pipeline's transform().)

More context (not necessary for you to read): I am trying to use PMML and the shap library together. TreeExplainer in the shap library needs the actual sklearn tree classes. If I can get the preprocessed data from the PMML, I can pass it to the model object in the shap library. I was hoping there would be some way to convert PMML back to an sklearn Pipeline, but that's probably not possible.
