Added binary classification support to MAPIE using the mondrian conformal predictor #230

Open
wants to merge 10 commits into
base: master

Conversation

adamzenith

@adamzenith adamzenith commented Nov 2, 2022

Description

MAPIE was unable to perform confidence estimation on binary classification problems. To address this, I have implemented the Mondrian conformal predictor as a method of MapieClassifier. This method is described in detail on page 5 of this paper; in essence, it uses a separate quantile for each class to determine inclusion in the prediction set, rather than a single quantile computed over all classes together. The method is not restricted to binary classification and should work for imbalanced multiclass problems as well.
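
For readers unfamiliar with the idea, here is a minimal NumPy sketch of per-class thresholds, not the code in this PR. It assumes the conformity score is one minus the predicted probability of a class and uses the usual finite-sample rank; the function name `mondrian_prediction_sets` and its arguments are made up for illustration.

```python
import numpy as np


def mondrian_prediction_sets(cal_proba, cal_labels, test_proba, alpha):
    """Illustrative class-conditional prediction sets (not MAPIE's implementation)."""
    cal_proba = np.asarray(cal_proba, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_proba = np.asarray(test_proba, dtype=float)
    n_classes = cal_proba.shape[1]

    # Assumed conformity score: 1 - probability assigned to the true class.
    cal_scores = 1.0 - cal_proba[np.arange(len(cal_labels)), cal_labels]

    thresholds = np.empty(n_classes)
    for k in range(n_classes):
        scores_k = np.sort(cal_scores[cal_labels == k])
        n_k = scores_k.size
        # Finite-sample rank; with too few samples the class is always included.
        rank = int(np.ceil((n_k + 1) * (1 - alpha)))
        thresholds[k] = scores_k[rank - 1] if rank <= n_k else np.inf

    # Class k enters the set when its score is at most that class's threshold.
    return (1.0 - test_proba) <= thresholds


cal_proba = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
cal_labels = [0, 0, 0, 1, 1, 1]
sets = mondrian_prediction_sets(cal_proba, cal_labels, [[0.85, 0.15], [0.3, 0.7]], alpha=0.5)
```

Each row of `sets` is a boolean mask over classes. A single shared quantile would replace the `thresholds` array with one number computed from all calibration scores.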

Closes #216

Type of change

Please remove options that are irrelevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Ran the make test command, and all tests passed

I also tested the changes in a workflow that uses MAPIE, and switching the method to mondrian gave similar results.

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully : make lint
  • Typing passes successfully : make type-check
  • Unit tests pass successfully : make tests
  • Coverage is 100% : make coverage
  • Documentation builds successfully : make doc

Unfortunately, I do not have much experience writing tests and do not know the best way to do this, so I would be grateful if anyone could help or advise on that front.

@adamzenith adamzenith marked this pull request as ready for review November 23, 2022 08:17
@adamzenith
Author

Hey, I have tried to read the error logs of the failed tests, but they seem to be failing at the level of installing numpy, and I cannot see a reason as to why this happens. If anyone has some advice so that I can fix it that would be much appreciated.

@CihanDogan94

Can we get this merged? It would be super useful.

@thibaultcordier
Collaborator

Hello @adamzenith,

Thank you for submitting your pull request to propose the Mondrian Conformal Predictor. I have read with interest your modifications and proposals to implement this method. I hope I understood correctly and that the correction elements I bring you will be relevant. Don't hesitate to share your feedback with me!

1. Your PR in a nutshell

You have proposed an implementation of the Mondrian Conformal Predictor as a method of the MapieClassifier.

  • The goal of this method is to ensure a class-conditional coverage of $1-\alpha$: the $1-\alpha$ quantile of the conformal scores is computed separately for each class and used to decide whether that class is included in the prediction set.
  • As you stated, this method is not limited to binary classification and should also work for unbalanced multiclass problems.
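
Written out, under the usual exchangeability assumption, the per-class threshold and the resulting guarantee can be stated as follows. The notation is introduced here for illustration only: $s_i$ is the conformity score of calibration sample $i$, $n_k$ is the number of calibration samples of class $k$, and $\hat{q}_k$ is taken as $+\infty$ when the required rank exceeds $n_k$.

$$
\hat{q}_k = \text{the } \lceil (n_k + 1)(1-\alpha) \rceil\text{-th smallest value of } \{ s_i : y_i = k \},
\qquad
\hat{C}(x) = \{ k : s(x, k) \le \hat{q}_k \}
$$

$$
\mathbb{P}\big( Y_{\text{test}} \in \hat{C}(X_{\text{test}}) \mid Y_{\text{test}} = k \big) \ge 1-\alpha \quad \text{for every class } k
$$

A single shared quantile only gives this bound on average over classes, which is why a minority class can end up under-covered.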

2. Our feedback on the PR

We believe that the Mondrian Conformal Predictor could be a good enhancement to MAPIE, as it has been mentioned and popularized in related work on drug discovery. However, at this time we lack a comparison with the existing methods in MAPIE that would show the value of this method in specific use cases. We need concrete examples (in Jupyter notebooks, for example) demonstrating that the Mondrian Conformal Predictor is better than other methods in MAPIE for solving binary or unbalanced multiclass problems. This will be a demonstration not only for us but for all MAPIE users. We invite you to consult the existing notebooks to help you.
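
One metric such a notebook could report is coverage broken down by class. The sketch below is only a suggestion; `coverage_by_class` is a hypothetical helper, not an existing MAPIE function, and the toy labels and prediction sets are invented to show the effect.

```python
import numpy as np


def coverage_by_class(y_true, pred_sets):
    """Fraction of samples of each class whose true label lies in its prediction set."""
    y_true = np.asarray(y_true)
    pred_sets = np.asarray(pred_sets, dtype=bool)
    # covered[i] is True when the true label of sample i is in its prediction set.
    covered = pred_sets[np.arange(len(y_true)), y_true]
    return {int(k): float(covered[y_true == k].mean()) for k in np.unique(y_true)}


# Toy imbalanced example: four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
pred_sets = [
    [True, False],
    [True, False],
    [True, False],
    [True, True],
    [True, False],   # true label 1 is missed
    [False, True],
]
per_class = coverage_by_class(y_true, pred_sets)
marginal = float(np.mean([row[label] for row, label in zip(pred_sets, y_true)]))
```

Here the marginal coverage is 5/6 while class 1 is covered only half the time, which is the kind of gap a comparison between methods would need to make visible.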

3. Additional comments to improve your code

My suggestions are about modifications to make your code as generic as possible.

  • I noticed that you have added elements that work specifically on your development settings and are therefore not intended for generic use in MAPIE (as in .gitignore and Makefile). This is not a problem in itself for code execution, but we prefer to keep the code as generic as possible.

  • In the same vein, you have adapted the compute_quantiles function with a new parameter named mondrian (in the utils.py file). Even if exceptions exist, we prefer to continue to implement generic functions that do not depend on external method attributes, especially when these choices impact the size and shape of the output (since as many quantiles as classes are computed when mondrian=True, whereas only one quantile is computed with mondrian=False).

4. Actions to be taken

I propose a list of actions to help you improve your proposal and help us integrate it into MAPIE:

  • Propose a notebook that demonstrates the value of using the Mondrian Conformal Predictor in place of the other multi-class conformal predictors proposed in MapieClassifier.
  • Delete the non-generic settings in the .gitignore and Makefile files (related to your virtual environment mapieenv).
  • Implement a new function named compute_class_quantiles which performs the same function as compute_quantiles with the parameter mondrian=True in the utils.py file.
  • Correct typing errors (such as line breaks in mapie/classification.py).
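
To make the third action concrete, here is a rough sketch of how the per-class logic could live in its own function, so that no flag changes the output shape. Only the name `compute_class_quantiles` comes from the list above; the signature, the helper `conformal_quantile`, and the sort-based rank rule are assumptions for illustration and do not reflect the actual contents of utils.py.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Hypothetical helper: one quantile (a float) for one set of scores.

    Assumes 0 < alpha < 1. Returns the ceil((n + 1) * (1 - alpha))-th smallest
    score, or +inf when that rank exceeds the number of scores.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    return float(scores[rank - 1]) if rank <= n else float("inf")


def compute_class_quantiles(scores, labels, n_classes, alpha):
    """Sketch: one quantile per class, always an array of shape (n_classes,)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    return np.array(
        [conformal_quantile(scores[labels == k], alpha) for k in range(n_classes)]
    )


scores = [0.1, 0.2, 0.4, 0.7, 0.5, 0.2]
labels = [0, 0, 0, 1, 1, 1]
class_q = compute_class_quantiles(scores, labels, n_classes=3, alpha=0.5)
shared_q = conformal_quantile(scores, alpha=0.5)
```

With this split, callers always know what they get back: a float from the shared version and an array with one entry per class from the class-wise version, including classes that have no calibration samples (class 2 above).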

I remain available if you have any questions and thank you in advance for your feedback.

@GabeNicholson

I tested this out myself and it worked well. Nice job.

@CoteDave

CoteDave commented Oct 9, 2023

Would be very helpful!!

Development

Successfully merging this pull request may close these issues.

Support Binary classification
5 participants