Update conversational widget to use text-generation (+ remove conversational task) #457

Merged: 43 commits into main, Feb 20, 2024

Conversation

@Wauplin (Contributor) commented Jan 25, 2024

Done as part of https://github.com/huggingface/moon-landing/issues/8578. Should be merged before (or at the same time) as https://github.com/huggingface/moon-landing/pull/8723. This is only a first draft to check if we have everything we need.

From https://github.com/huggingface/moon-landing/issues/8578:

In huggingface.js and api-inference

  • Models that have the secondary tag conversational will get the ConversationalWidget
  • The ConversationalWidget will call the text-generation API under the hood. The widget needs to take care of all prompt formatting (using the recent jinja work in huggingface.js)
  • Should we just kill the conversational API in the inference API with the APIs unification?

This would break use cases such as `pipeline("microsoft/DialoGPT-medium")` in transformers

Result:

  • All models with conversational capabilities will have a nice widget
  • We eliminate the fragmentation of tasks (conversational vs text generation)
  • We remove the confusing conversational pipeline

Currently in this PR:

  • ✔️ Models that have the secondary tag conversational will get the ConversationalWidget
  • ✔️ The ConversationalWidget will call the text-generation API under the hood. (automatic in the inference API if pipeline_tag gets updated by https://github.com/huggingface/moon-landing/pull/8723)
  • ✔️ The widget needs to take care of all prompt formatting (not complete; a rough sketch of the idea follows this list)
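
For context, a minimal sketch of what that prompt formatting could look like with @huggingface/jinja. The `tokenizerConfig` values below are illustrative placeholders, not taken from a real model:

```ts
import { Template } from "@huggingface/jinja";

// Illustrative placeholder values; real ones come from the model's tokenizer_config.json.
const tokenizerConfig = {
	chat_template:
		"{{ bos_token }}{% for message in messages %}{{ message.role }}: {{ message.content }}\n{% endfor %}assistant:",
	bos_token: "<s>",
	eos_token: "</s>",
};

const messages = [{ role: "user", content: "Hello!" }];

// Compile the Jinja template, then render the conversation into a single prompt string.
const template = new Template(tokenizerConfig.chat_template);
const prompt = template.render({
	messages,
	bos_token: tokenizerConfig.bos_token,
	eos_token: tokenizerConfig.eos_token,
});
// `prompt` can then be sent as plain text to the text-generation API.
```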

cc @xenova @osanseviero @SBrandeis @coyotte508


Still unsure how to proceed:

  • how to handle the transition period? => EDIT: no transition period
  • what to do if we don't have a chat_template? => EDIT: raise error
  • what if we have a chat_template but no eos_token / bos_token? => EDIT: should be ok
  • should we keep the Conversation structure in the widget (with generated_responses / past_user_inputs / generated_text)? If not, we'd need more Svelte expertise 😄 => EDIT: ok

@xenova (Contributor) commented Jan 25, 2024

(My first commits just fixed some of the dependencies on @huggingface/jinja.)

Let me try to answer (or at least provide discussion for) your questions:

> how to handle the transition period?

We'll get this PR ready before merging https://github.com/huggingface/moon-landing/issues/8578, and then hopefully there should be no "transition period" (each text-generation model will be assigned a conversational or text-generation widget). Sure, there might be issues with compatibility, but we'll add some checks to default back to previous functionality if something is missing (like bos_token or something).

> what to do if we don't have a chat_template?

The way transformers.js handles this (like the transformers library) is to define default chat templates for each model type (e.g., here for llama). If no chat_template is found and the user tries to run apply_chat_template, we use the default one. However, for widgets, I think it's best that we only set it to conversational if chat_template is defined in the tokenizer_config.json.
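
A minimal sketch of that check, assuming the standard `resolve/main` raw-file URL (this is not the actual Hub implementation):

```ts
// Treat a model as conversational only if its tokenizer_config.json defines a chat_template.
async function hasChatTemplate(modelId: string): Promise<boolean> {
	const url = `https://huggingface.co/${modelId}/resolve/main/tokenizer_config.json`;
	const response = await fetch(url);
	if (!response.ok) return false; // no tokenizer_config.json at all
	const config = await response.json();
	return typeof config.chat_template === "string";
}
```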

> what if we have a chat_template but no eos_token / bos_token?

This is most likely due to an outdated tokenizer_config.json (like this), but it is also possible that the model doesn't have a tokenizer_config.json at all. Fortunately, newer tokenizers save the required special tokens. However, in cases where something is not defined, we can either (1) use pre-defined defaults, (2) disable the widget, or (3) default to text-generation.
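
Option (1) could look roughly like this; the default token values are illustrative assumptions, since real defaults differ per model family:

```ts
interface TokenizerConfig {
	chat_template?: string;
	bos_token?: string;
	eos_token?: string;
}

// Fall back to pre-defined defaults when special tokens are missing.
function getSpecialTokens(config: TokenizerConfig) {
	return {
		bos_token: config.bos_token ?? "<s>", // illustrative default only
		eos_token: config.eos_token ?? "</s>", // illustrative default only
	};
}
```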

> should we keep the Conversation structure in the widget (with generated_responses / past_user_inputs / generated_text)? If not, would need more Svelte expertise 😄

I think we will need to convert it to the ChatML format (JSON array of messages, each with at least role and content values), since this is how the templates expect the conversation to be provided.
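
For illustration, the ChatML-style structure would look like this (the role union matches what was later suggested in review):

```ts
type ChatMessage = {
	role: "user" | "assistant" | "system";
	content: string;
};

// The whole conversation is a flat array of turns:
const conversation: ChatMessage[] = [
	{ role: "system", content: "You are a helpful assistant." },
	{ role: "user", content: "Hi there!" },
	{ role: "assistant", content: "Hello! How can I help?" },
];
```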

@xenova (Contributor) commented Jan 25, 2024

We also need to decide how users should define system prompts/messages. This might be something we need to pass in from the YAML in the model card (i.e., in the same way as examples are specified).

SBrandeis added a commit that referenced this pull request on Feb 12, 2024:

# TL;DR

- Update `text-generation` spec to match TGI API
- ~~Add `conversational` spec, heavily inspired by TGI messages API (cc @Wauplin @osanseviero @Narsil)~~
- ~~Relevant related work: #457 & huggingface-internal/moon-landing#8723~~
- Regenerate TypeScript code for those tasks
@Wauplin (Contributor, Author) commented Feb 12, 2024

I've updated this PR with the feedback and improved the logic to handle the chat format.
@xenova is finishing the templating part (discussed in DM) and after that we should be good to try/review this PR.

cc @julien-c @osanseviero: I have also merged work from #479, which removes the conversational task from hf.co/models and defines the WidgetType.

@Wauplin changed the title from "[WIP] Prepare conversational widget to use text-generation" to "[WIP] Switch conversational widget to use text-generation" on Feb 12, 2024
@Wauplin changed the title from "[WIP] Switch conversational widget to use text-generation" to "[WIP] Update conversational widget to use text-generation" on Feb 12, 2024
Inline review comment on the widget-selection logic:

```ts
model.pipeline_tag && model.pipeline_tag in WIDGET_COMPONENTS
	? WIDGET_COMPONENTS[model.pipeline_tag as keyof typeof WIDGET_COMPONENTS]
	: undefined;
model.pipeline_tag === "text-generation" && model.tags?.includes("conversational")
```

Member: maybe also `model.pipeline_tag === "text2text-generation"` btw @osanseviero, no?

Member: (also moot if we merge text2text-generation into text-generation like I'd like to 😁)

@osanseviero (Contributor) left a comment:

Looking nice!


Inline review comment on documentation content:

```md
## Useful Resources

- Learn how ChatGPT and InstructGPT work in this blog: [Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf)
```

Contributor: FYI @merveenoyan, as we'll want to move some of this to the text-generation task page.


@SBrandeis (Contributor) left a comment:

Cool stuff 🔥

@xenova (Contributor) commented Feb 16, 2024

I also just added some error checking to the compilation and render steps of the chat template:

**Invalid chat templates**

The error message is supplied by @huggingface/jinja. For example:

  • `{{ test` [screenshot]

  • `{% for %}` [screenshot]

**Valid template, but error during runtime**

(To test, I just deleted bos_token.) [screenshot]

Finally, I updated it so that we only add the message on the first call (not also when the model-loading response is received). Otherwise we get: [screenshot]
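
A minimal sketch of where those two error classes surface, assuming the same @huggingface/jinja Template usage as above (the inputs are hypothetical placeholders):

```ts
import { Template } from "@huggingface/jinja";

// Hypothetical inputs for illustration:
const chatTemplate = "{{ bos_token }}{% for message in messages %}{{ message.content }}{% endfor %}";
const messages = [{ role: "user", content: "Hello!" }];
const specialTokens = { bos_token: "<s>", eos_token: "</s>" };

let prompt: string | undefined;
try {
	// Compilation errors (e.g. `{{ test` or `{% for %}`) throw here...
	const template = new Template(chatTemplate);
	// ...and runtime errors (e.g. a template referencing a deleted bos_token) throw here.
	prompt = template.render({ messages, ...specialTokens });
} catch (error) {
	// Surface @huggingface/jinja's error message in the widget instead of failing silently.
	console.error((error as Error).message);
}
```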

@xenova (Contributor) left a comment:

Looks good to go!

[screen recording of the chat widget]

Small nit: maybe we should clear the text input once we hit enter? (Maybe only necessary once we add streaming.)

@Wauplin (Contributor, Author) left a comment:

Can't approve my own PR, but I reviewed @xenova's last changes. Thanks for going the extra mile ❤️ Everything looks good for a merge now :)

Will merge it tomorrow if no one complains before then (not sure anymore who should/wants to review it one more time 😄).

@osanseviero (Contributor): 🚀 🚀

@julien-c (Member) left a comment:

On my side I'll check in prod 🙂

@Wauplin merged commit 802e164 into main on Feb 20, 2024 (2 checks passed)
@Wauplin deleted the 8578-switch-conversational-to-text-generation branch on February 20, 2024 at 09:31
@Wauplin (Contributor, Author) commented Feb 20, 2024

And... finally merged! 😄 🚀

Inline review comment on the widget's message typing:

```ts
input: string;
response: string;
export let messages: Array<{
	role: string;
```

Collaborator: could we make the typing even stronger here?

`role: "user" | "assistant" | "system"`?
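
With that suggestion applied, the prop would look roughly like this (the `content` field is assumed from the ChatML shape discussed above):

```ts
// Svelte component prop with the stronger role union; `content` is an assumption.
export let messages: Array<{
	role: "user" | "assistant" | "system";
	content: string;
}>;
```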
