OOM when using encode for token_embeddings #1813

Open
bodin-e opened this issue Jan 19, 2023 · 0 comments

bodin-e commented Jan 19, 2023

I am running into an out-of-memory (OOM) error when using `encode` to compute token embeddings on a large dataset.

Currently, the fix for OOM in the `encode` method (see #522 and #487) only applies to sentence embeddings, not token embeddings.

I have resolved the issue by generalizing the previous solution so that it also applies to token embeddings, via an added `move_to_cpu` flag. Is there an alternative approach that I have missed? If not, and you agree with the changes, feel free to merge #1812.
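In the meantime, a workaround sketch of what I mean: encode the dataset in chunks and move each chunk's token embeddings off the GPU before accumulating them, so only the active chunk occupies GPU memory. The model name and chunk size below are illustrative, not from this issue, and this is a caller-side approximation of what the `move_to_cpu` flag would do inside `encode`:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
sentences = [f"example sentence {i}" for i in range(10_000)]

token_embeddings = []
chunk_size = 256  # illustrative; tune to available GPU memory
for start in range(0, len(sentences), chunk_size):
    chunk = sentences[start:start + chunk_size]
    # With output_value='token_embeddings', encode returns one tensor per
    # sentence; those tensors stay on the encoding device, which is what
    # accumulates GPU memory across a large dataset.
    embs = model.encode(
        chunk,
        output_value="token_embeddings",
        convert_to_numpy=False,
    )
    # Move each tensor to the CPU before accumulating, freeing GPU memory
    # between chunks.
    token_embeddings.extend(e.detach().cpu() for e in embs)
```

This mirrors the sentence-embedding fix from #522/#487 (detach and move to CPU per batch), just applied from outside `encode` instead of inside it.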
