Heavy CQR object #57

Open
diegoglozano opened this issue Sep 26, 2024 · 1 comment
@diegoglozano

Hi! First of all, thank you for your work!
We are currently using your package, specifically the CQR class (with two XGBoost models inside a DualPredictor).

When I serialised the object using joblib, I realised that the resulting file was about 500 MB.

Checking the code, I discovered this line, where the fit and calibration data are saved as an attribute of the IdSplitter.

self._split = [(X_fit, y_fit, X_calib, y_calib)]

Is this expected behavior? I fixed it by setting:

my_object.conformal_predictor.splitter._split = None

before saving the artifact.
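
To illustrate the effect, here is a minimal sketch that relies only on the standard library. The objects are hypothetical stand-ins built with `SimpleNamespace`, not the package's classes: they only mimic a predictor whose splitter keeps the data in `_split`. `pickle` is used instead of joblib to keep the example dependency-free, and the array sizes are arbitrary.

```python
import pickle
from types import SimpleNamespace


def pickled_size(obj):
    """Return the number of bytes the pickled object occupies."""
    return len(pickle.dumps(obj))


def make_column(n):
    """Build a toy data column of n floats."""
    return [float(i) for i in range(n)]


# Toy data standing in for the fit and calibration sets.
X_fit, y_fit = make_column(100_000), make_column(100_000)
X_calib, y_calib = make_column(100_000), make_column(100_000)

# Stand-in splitter that caches the split, mirroring the pattern in the issue.
splitter = SimpleNamespace(_split=[(X_fit, y_fit, X_calib, y_calib)])
# Stand-in predictor that holds the splitter (assumed attribute name).
predictor = SimpleNamespace(splitter=splitter)

size_with_data = pickled_size(predictor)

# Workaround from the issue: drop the cached split before serialising.
predictor.splitter._split = None
size_without_data = pickled_size(predictor)

print(f"with cached split:    {size_with_data} bytes")
print(f"without cached split: {size_without_data} bytes")
```

Almost all of the serialised size in this toy case comes from the cached data, which is consistent with what I observed on the real object.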

@M-Mouhcine
Collaborator

M-Mouhcine commented Sep 30, 2024

Hi @diegoglozano,

Thank you for your feedback.

We currently provide the save method for serializing a ConformalPredictor. However, I understand that it may not address your concern, since it still serializes the data. To resolve this, I’ll introduce a flag argument that lets you specify whether the splitter should be saved.

In the meantime, your solution works well and can be used with no negative impact.
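
As a rough sketch of how such a flag could behave, and not the actual implementation: the helper name `save_predictor` and the `save_data` parameter below are hypothetical, and the predictor is again a `SimpleNamespace` stand-in rather than a real ConformalPredictor. The idea is to detach the cached split only for the duration of the dump, so the in-memory object is left untouched.

```python
import io
import pickle
from types import SimpleNamespace


def save_predictor(predictor, file, save_data=False):
    """Pickle `predictor` to `file`, optionally without the cached split.

    Hypothetical helper: the name and the `save_data` flag are assumptions
    made for illustration, not the package's API.
    """
    splitter = predictor.splitter
    cached = splitter._split
    if not save_data:
        splitter._split = None  # detach the data only while dumping
    try:
        pickle.dump(predictor, file)
    finally:
        splitter._split = cached  # always restore the in-memory object


# Toy stand-in objects (not the package's classes).
data = list(range(50_000))
predictor = SimpleNamespace(
    splitter=SimpleNamespace(_split=[(data, data, data, data)])
)

light, heavy = io.BytesIO(), io.BytesIO()
save_predictor(predictor, light)                  # default: data left out
save_predictor(predictor, heavy, save_data=True)  # data included

light_size = light.getbuffer().nbytes
heavy_size = heavy.getbuffer().nbytes

# The reloaded light artifact has no cached split.
restored = pickle.loads(light.getvalue())
```

Compared with setting `_split = None` by hand, restoring the attribute in a `finally` block means the predictor can still be used after saving if any later code relies on the cached split.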

@M-Mouhcine M-Mouhcine self-assigned this Sep 30, 2024