Add MambaSSM as a library #802

Merged
osanseviero merged 5 commits into main from add-mamba-ssm-library on Jul 16, 2024

Wauplin (Contributor) commented Jul 16, 2024

This PR adds https://github.com/state-spaces/mamba as an official library on the Hub.
Let's wait until the Mamba integration is merged (coming soon) and a few first models have been uploaded to the Hub.

cc @osanseviero @NielsRogge @tridao

Related PR on Mamba side: state-spaces/mamba#471

osanseviero (Contributor) left a comment

LGTM if naming is agreed

packages/tasks/src/model-libraries.ts (outdated)
@@ -298,6 +298,13 @@ export const MODEL_LIBRARIES_UI_ELEMENTS = {
repoName: "mindspore",
repoUrl: "https://github.com/mindspore-ai/mindspore",
},
mamba_ssm: {
Member

just mamba would be cleaner, branding-wise?

Suggested change
- mamba_ssm: {
+ mamba: {

Member

And it's pip install mamba-ssm on PyPI, so maybe that's a better option too.

I think I'd go with mamba personally.

Contributor Author

We discussed this with @osanseviero. The problem with plain mamba is that it's also a transformers architecture, so it might create conflicts. Also, having mamba_ssm as library_name and mamba as a tag helps with code snippets (to import the Mamba class).
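
To make the naming argument concrete, here is a minimal sketch of how a key collision is avoided. The interfaces, the `resolveLibrary` helper, the example repo ids, and the `prettyName`/`repoName` values for mamba_ssm are hypothetical and only illustrate the idea; the real types in packages/tasks may differ.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface LibraryUiElement {
  prettyName: string;
  repoName: string;
  repoUrl: string;
}

interface ModelInfo {
  id: string;
  library_name: string; // selects the library entry (and its code snippet)
  tags: string[]; // free-form tags such as the architecture name
}

const LIBRARIES: Record<string, LibraryUiElement> = {
  transformers: {
    prettyName: "Transformers",
    repoName: "transformers",
    repoUrl: "https://github.com/huggingface/transformers",
  },
  mamba_ssm: {
    prettyName: "MambaSSM", // assumed display name
    repoName: "MambaSSM", // assumed repo label
    repoUrl: "https://github.com/state-spaces/mamba",
  },
};

// Look up the library entry from library_name only, never from tags.
function resolveLibrary(model: ModelInfo): LibraryUiElement | undefined {
  return LIBRARIES[model.library_name];
}

// Two hypothetical repos that share the "mamba" tag but load differently.
const viaTransformers: ModelInfo = {
  id: "example-org/mamba-hf",
  library_name: "transformers",
  tags: ["mamba"],
};
const viaMambaSsm: ModelInfo = {
  id: "example-org/mamba-native",
  library_name: "mamba_ssm",
  tags: ["mamba"],
};
```

Because the lookup keys on library_name, both repos can carry the mamba tag without the transformers architecture and the standalone library being confused.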

Contributor Author

We could have Mamba as prettyName but it might be misleading

julien-c (Member) commented Jul 16, 2024

OK, but then a dash rather than an underscore, no?

Member

nvm let's not complicate stuff, LGTM as is :)

Contributor Author

good point, addressed in e3e7e62 and state-spaces/mamba@961eccb

Wauplin (Contributor, Author) commented Jul 16, 2024

@osanseviero @julien-c I've updated the code snippet. In the end only MambaLMHeadModel is loadable from the Hub since it's a model. Mamba, Mamba2 and Mamba2Simple are simply layers. I've also updated state-spaces/mamba#471 accordingly.
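
For readers following along, a snippet generator for this library could look roughly like the sketch below. The `ModelData` shape and the function name `mambaSsmSnippet` are hypothetical, and the Python import path inside the generated string is an assumption rather than something confirmed in this thread; the only point taken from the discussion is that the snippet loads MambaLMHeadModel, not the Mamba, Mamba2 or Mamba2Simple layers.

```typescript
// Hypothetical minimal model shape; only the repo id is needed here.
interface ModelData {
  id: string;
}

// Returns the Python snippet(s) to display on a model page.
// Assumption: MambaLMHeadModel is importable from the top-level
// mamba_ssm package and exposes from_pretrained.
function mambaSsmSnippet(model: ModelData): string[] {
  return [
    [
      "from mamba_ssm import MambaLMHeadModel",
      "",
      `model = MambaLMHeadModel.from_pretrained("${model.id}")`,
    ].join("\n"),
  ];
}

// Example with a placeholder repo id.
console.log(mambaSsmSnippet({ id: "example-org/example-mamba" })[0]);
```

Returning an array of strings leaves room for more than one snippet per library, although a single loading example is probably enough here.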

osanseviero (Contributor) left a comment

Nice 🔥

@osanseviero osanseviero merged commit c73abf9 into main Jul 16, 2024
5 checks passed
@osanseviero osanseviero deleted the add-mamba-ssm-library branch July 16, 2024 17:12
Wauplin (Contributor, Author) commented Jul 17, 2024

(I would have waited for state-spaces/mamba#471 to be merged but no big deal)


3 participants