Thanks for the note and good point, I didn't know about this.
One challenge I see with configuring it in the config file is that the setting is applied at model creation, but one can later optionally run with --quantize bnb.nf4 (or without it). So, ideally, the swap should only take place when the inference/training functions are called, leaving the original model as is.
According to bnb documentation here:
https://huggingface.co/docs/bitsandbytes/main/optimizers
https://huggingface.co/docs/bitsandbytes/main/explanations/optimizers#stable-embedding-layer
This line could toggle between bnb.nn.StableEmbedding and torch.nn.Embedding, or the choice could be made configurable in the config file:
litgpt/model.py, line 28 (commit a8aa4ba)
There are also other places in the code where torch.nn.Embedding is used.
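As a minimal sketch of the idea, the embedding class could be selected by a flag at model-creation time. The `use_stable_embedding` flag and `make_embedding` helper below are hypothetical names, not part of litgpt; the import of bitsandbytes is deferred so the dependency stays optional:

```python
import torch.nn as nn


def make_embedding(vocab_size: int, embed_dim: int,
                   use_stable_embedding: bool = False) -> nn.Module:
    """Return an embedding layer, optionally bnb's StableEmbedding.

    `use_stable_embedding` is a hypothetical config option. Per the bnb
    docs, StableEmbedding adds layer norm and keeps 32-bit optimizer
    states for the embedding, which is recommended with 8-bit optimizers.
    """
    if use_stable_embedding:
        # Deferred import: bitsandbytes remains an optional dependency.
        from bitsandbytes.nn import StableEmbedding
        return StableEmbedding(vocab_size, embed_dim)
    return nn.Embedding(vocab_size, embed_dim)


emb = make_embedding(vocab_size=50304, embed_dim=512)
print(type(emb).__name__)
```

The same helper could then replace each of the other torch.nn.Embedding call sites, so the choice lives in one place.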