
Free memory allocated for the LM after evaluation finishes #2277

Closed · wants to merge 2 commits

Conversation

ahmedamrelhefnawy commented Sep 3, 2024

Deletes the lm object after the simple_evaluate function finishes.

Then, memory can easily be freed as follows (see the sketch after this list):

  1. Free the framework's cache (depending on the model):
  • tf.keras.backend.clear_session() for TensorFlow
  • torch.cuda.empty_cache() for PyTorch
  2. Run the garbage collector:
  • import gc
  • gc.collect()

This is especially important when multiple models are evaluated sequentially.
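A minimal sketch of this sequence, assuming a run where the harness builds the model itself from a name string (the model and task names are illustrative):

import gc

import torch
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["lambada_openai"],
)

# with this PR, the internal lm object is dropped once simple_evaluate
# returns, so its memory can then be reclaimed:
gc.collect()
torch.cuda.empty_cache()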

CLAassistant commented Sep 3, 2024

CLA assistant check
All committers have signed the CLA.

ahmedamrelhefnawy marked this pull request as draft September 4, 2024 07:41
ahmedamrelhefnawy marked this pull request as ready for review September 4, 2024 07:43
haileyschoelkopf (Collaborator) left a comment

Hi! Thanks for this PR! We're aiming to put out v0.4.4 on PyPI this week but will try to review this ASAP.

The sole blocker that comes to mind is that this may disrupt a usage pattern in which one starts with an already-initialized model, wraps that existing model in an HFLM class, and passes it to simple_evaluate. We'd want to make sure this does not delete the original initialized model, for those running lm_eval within their training loops. The code would look something like this:

import lm_eval
from lm_eval.models.huggingface import HFLM
from transformers import AutoModelForCausalLM

my_hf_model = AutoModelForCausalLM.from_pretrained(...)

lm_obj = HFLM(pretrained=my_hf_model)

results = lm_eval.simple_evaluate(
    model=lm_obj,
    tasks=["taskname1", "taskname2"],
    ...
)

# we should now still be able to use my_hf_model and not free it
my_hf_model.generate(...)

(see https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/interface.md#external-library-usage)

If you have the bandwidth to test this case (and perhaps add a test, though that wouldn't be required), that would make it a lot easier to approve and merge this right away! Otherwise, we'll look at testing that setting ourselves.

ahmedamrelhefnawy (Author) commented Sep 4, 2024

> If you have the bandwidth to test this case (and perhaps add a test, though that wouldn't be required), that would make it a lot easier to approve and merge this right away! Otherwise, we'll look at testing that setting ourselves.

Thanks for your attention. I tested it and it worked.
However, I can put del lm inside an if condition that checks whether the model argument was an instance of lm_eval.api.model.LM; if it was, the lm object won't be deleted. A sketch of that guard follows.
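A minimal sketch of that guard, assuming it runs inside simple_evaluate after evaluation completes (free_lm_if_owned is a hypothetical helper, not part of the codebase; model and lm follow the names used in the snippet above):

import gc
from typing import Union

from lm_eval.api.model import LM


def free_lm_if_owned(model: Union[str, LM], lm: LM) -> None:
    """Hypothetical helper: drop the LM wrapper only when the harness
    built it itself from a model name string."""
    if not isinstance(model, LM):
        # deleting this binding drops a reference, so the wrapper can be
        # collected once no other references remain; a pre-built LM passed
        # in by the caller (e.g. an HFLM wrapping their model) is left alone
        del lm
        gc.collect()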


ahmedamrelhefnawy (Author) commented Sep 4, 2024

Note: there's a problem with deleting already-initialized models after evaluation.
Despite deleting my_hf_model, my_hf_tokenizer, and lm_obj, and clearing the cache multiple times, the memory remains occupied.
I also tested this with the main repository (yours) to check if the issue existed there as well, and found the same behavior.
It appears to be a longstanding problem, so I'm documenting it here in case it can be addressed or resolved in future updates.

This applies at fb23b3f, where del lm ran without any if conditions. A reproduction sketch follows.
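A minimal sketch of the reproduction described above, assuming a CUDA device (the model and task names are illustrative, and the memory_allocated check is one way to observe the behavior, not part of the original report):

import gc

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

import lm_eval
from lm_eval.models.huggingface import HFLM

my_hf_model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m").cuda()
my_hf_tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")

lm_obj = HFLM(pretrained=my_hf_model, tokenizer=my_hf_tokenizer)
results = lm_eval.simple_evaluate(model=lm_obj, tasks=["lambada_openai"])

del my_hf_model, my_hf_tokenizer, lm_obj
gc.collect()
torch.cuda.empty_cache()

# expected: allocated CUDA memory drops back toward zero;
# observed (per the note above): it remains occupied
print(torch.cuda.memory_allocated())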
