BERT memory leak #125

Open
lpfgarcia opened this issue May 16, 2024 · 6 comments

@lpfgarcia

Hello,

First, I would like to thank the developers for HiClass. The library is very well developed, and the documentation is comprehensive. I have two comments: one is a suggestion, and the other is a possible bug.

The suggestion is simple: could you include the FlatClassifier as a method? I saw it in some example notebooks, and it would help users compare the different strategies.

The second comment relates to BERT. Unfortunately, bert_sklearn keeps the models in memory (GPU or CPU). This makes it impractical to use for hierarchies of almost any size. Could you consider saving the models during the HiClass fit stage and loading them during the predict stage?

Thank you!

@mirand863
Collaborator

mirand863 commented Jun 3, 2024

Hi @lpfgarcia,

Thank you for your interest in HiClass.

> The suggestion is simple: could you include the FlatClassifier as a method? I saw it in some example notebooks, and it would help users compare the different strategies.

I believe this would be out of scope for the library, since its purpose is to implement local hierarchical classifiers. Besides, wouldn't this be just an import of scikit-learn models?

> The second comment relates to BERT. Unfortunately, bert_sklearn keeps the models in memory (GPU or CPU). This makes it impractical to use for hierarchies of almost any size. Could you consider saving the models during the HiClass fit stage and loading them during the predict stage?

There is actually some code to store trained models on disk and reload them in case training is restarted, but I did not implement it with the goal of saving memory in mind. The relevant code is the method `_save_tmp(self, name, classifier)` and the corresponding loading logic. Would you be able to modify it to save memory with BERT and other models?
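
For illustration, the general pattern would look something like the sketch below (hypothetical helper names, assuming the local classifier is picklable; this is not HiClass's actual implementation):

import gc
import pickle
from pathlib import Path

def offload_to_disk(classifier, name, tmp_dir="tmp"):
    # Persist a fitted local classifier so it can be evicted from memory.
    path = Path(tmp_dir) / f"{name}.pkl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(classifier, f)
    return path

def reload_from_disk(path):
    # Bring a classifier back into memory only when it is needed.
    with open(path, "rb") as f:
        return pickle.load(f)

# After fitting each node:
#     path = offload_to_disk(local_classifier, "node_0")
#     del local_classifier   # drop the in-memory reference
#     gc.collect()
# At predict time:
#     local_classifier = reload_from_disk(path)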

@lpfgarcia
Author

Thanks @mirand863

If you're interested, I implemented the flat approach this way:

from sklearn.base import BaseEstimator

SEPARATOR = "::HiClass::Separator::"

class FlatClassifier(BaseEstimator):

    def __init__(self, local_classifier):
        self.local_classifier = local_classifier

    def fit(self, X, y):
        # Join each hierarchical path into a single flat label.
        y = [SEPARATOR.join(i) for i in y]
        self.local_classifier.fit(X, y)
        return self

    def predict(self, X):
        # Split each flat prediction back into a hierarchical path.
        return [i.split(SEPARATOR) for i in self.local_classifier.predict(X)]
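
For completeness, a quick usage sketch (assuming a scikit-learn estimator such as LogisticRegression and toy three-level labels):

from sklearn.linear_model import LogisticRegression

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [
    ["Animal", "Mammal", "Cat"],
    ["Animal", "Mammal", "Dog"],
    ["Animal", "Bird", "Parrot"],
]

flat = FlatClassifier(local_classifier=LogisticRegression())
flat.fit(X, y)
print(flat.predict(X))  # e.g. [['Animal', 'Mammal', 'Cat'], ...]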

@mirand863
Collaborator

Hi @lpfgarcia,

I understand now what you mean. Sorry for the misunderstanding, and thank you for clarifying. I will add this to the code base with a comment acknowledging your contribution, but if you would like to be properly credited and listed on GitHub, you can open a pull request and I will review it.

Best regards,
Fabio

@lpfgarcia
Author

Hi @mirand863

Very good! Feel free to add the code to the library.

Kind regards,
Luis

@mirand863
Collaborator

Hi @lpfgarcia,

A quick update on this: I just added the flat classifier in #128.

I will try to tackle the memory leak with BERT in the next few days. My initial plan is to use the tmp_dir parameter, or something similar, to store the models directly on disk and free them from memory once fit succeeds; a sketch of the intended usage is below. Please let me know if you have any thoughts or different ideas.
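
From the user's side, the usage would stay something like this (a sketch only; whether tmp_dir itself or a new parameter controls the memory eviction is not decided):

from hiclass import LocalClassifierPerParentNode
from sklearn.linear_model import LogisticRegression

# tmp_dir already checkpoints trained local classifiers to disk; the idea is
# to also free them from memory after fit and reload them at predict time.
classifier = LocalClassifierPerParentNode(
    local_classifier=LogisticRegression(),
    tmp_dir="checkpoints",
)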

Best regards,
Fabio

@lpfgarcia
Author

Hi @mirand863,

Happy to hear about the addition of FlatClassifier.

Regarding the use of BERT with HiClass, I made a fork of bert-sklearn with a workaround in the code. Basically, after the fit finishes, I move the model to the CPU instead of keeping it on the GPU. This solved my problem.

Here is the link with the change: https://github.com/lpfgarcia/bert-sklearn/blob/b7cb9abcb123bdda743b2abc1ba70d7681276420/bert_sklearn/sklearn.py#L375
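
The gist of the change, as a sketch (assuming the underlying PyTorch module is reachable; the fork above has the exact code):

import torch

def release_gpu(model):
    # Move the fitted model to the CPU so it no longer occupies GPU memory
    # between fit and predict.
    model.to("cpu")
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached blocks back to the driver
    return model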

Kind regards,
Luis
