Update conversational widget to use text-generation (+ remove `conversational` task) (#457)

Done as part of huggingface-internal/moon-landing#8578.
Should be merged before (or at the same time as)
huggingface-internal/moon-landing#8723. This is only a
first draft to check that we have everything we need.

From huggingface-internal/moon-landing#8578:

> In huggingface.js and api-inference
> - [ ] Models that are secondary tagged as `conversational` will get
the `ConversationalWidget`
> - [ ] The `ConversationalWidget` will call the `text-generation` API
under the hood. The widget needs to take care of all prompt formatting
(using the recent jinja work in `huggingface.js`)
> - [ ] Should we just kill the conversational API in the inference API
with the APIs unification?

> This would break use cases such as
`pipeline("microsoft/DialoGPT-medium")` in `transformers`
>
> Result:
> * All models with conversational capabilities will have a nice widget
> * We eliminate the fragmentation of tasks (conversational vs text
generation)
> * We remove the confusing conv pipeline

Currently in this PR:
- ✔️ _Models that are secondary tagged as conversational
will get the ConversationalWidget_
- ✔️ _The `ConversationalWidget` will call the
`text-generation` API under the hood._ (automatic in inference API if
`pipeline_tag` gets updated by
huggingface-internal/moon-landing#8723)
- ✔️ _The widget needs to take care of all prompt
formatting_ (not yet complete; see the sketch after this list)
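
For reference, a minimal sketch of what that prompt formatting could look like, using the `Template` class from `@huggingface/jinja` and `textGeneration` from `@huggingface/inference`. The `ChatMessage` shape and the `queryModel` helper are assumptions made for this example, not the widget's final code:

```ts
import { Template } from "@huggingface/jinja";
import { HfInference } from "@huggingface/inference";

interface ChatMessage {
	role: "user" | "assistant";
	content: string;
}

// Illustrative helper: render the model's chat_template into a plain prompt,
// then send that prompt to the text-generation endpoint.
async function queryModel(
	model: string,
	chatTemplate: string,
	specialTokens: { bos_token?: string; eos_token?: string },
	messages: ChatMessage[],
	accessToken?: string
): Promise<string> {
	const template = new Template(chatTemplate);
	const inputs = template.render({
		messages,
		bos_token: specialTokens.bos_token ?? "",
		eos_token: specialTokens.eos_token ?? "",
		add_generation_prompt: true,
	});
	const hf = new HfInference(accessToken);
	const { generated_text } = await hf.textGeneration({
		model,
		inputs,
		parameters: { return_full_text: false },
	});
	return generated_text;
}
```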

cc @xenova @osanseviero @SBrandeis @coyotte508 

---

Still unsure how to proceed (each point is resolved inline in an **EDIT** note):
- how to handle the transition period? => **EDIT:** no transition period
- what to do if we don't have a `chat_template`? => **EDIT:** raise an
error
- what if we have a `chat_template` but no `eos_token` / `bos_token`? =>
**EDIT:** should be ok
- should we keep the `Conversation` structure in the widget (with
`generated_responses` / `past_user_inputs` / `generated_text`)? If not,
it would need more Svelte expertise 😄 => **EDIT:** ok (see the mapping sketch after this list)
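
For context on that last point, here is a rough sketch of how the existing `Conversation` state could be mapped onto the chronological `messages` array that chat templates expect; the `conversationToMessages` helper is purely illustrative:

```ts
interface Conversation {
	past_user_inputs: string[];
	generated_responses: string[];
}

type ChatMessage = { role: "user" | "assistant"; content: string };

// Illustrative only: interleave past user inputs and generated responses,
// then append the new user input as the last message.
function conversationToMessages(conversation: Conversation, newUserInput: string): ChatMessage[] {
	const messages: ChatMessage[] = [];
	for (let i = 0; i < conversation.past_user_inputs.length; i++) {
		messages.push({ role: "user", content: conversation.past_user_inputs[i] });
		if (conversation.generated_responses[i] !== undefined) {
			messages.push({ role: "assistant", content: conversation.generated_responses[i] });
		}
	}
	messages.push({ role: "user", content: newUserInput });
	return messages;
}
```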

---------

Co-authored-by: Joshua Lochner <[email protected]>
Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Simon Brandeis <[email protected]>
4 people authored Feb 20, 2024
1 parent 705588e commit 802e164
Showing 24 changed files with 176 additions and 323 deletions.
1 change: 0 additions & 1 deletion packages/inference/src/tasks/index.ts
@@ -18,7 +18,6 @@ export * from "./cv/imageToImage";
 export * from "./cv/zeroShotImageClassification";
 
 // Natural Language Processing tasks
-export * from "./nlp/conversational";
 export * from "./nlp/featureExtraction";
 export * from "./nlp/fillMask";
 export * from "./nlp/questionAnswering";

81 changes: 0 additions & 81 deletions packages/inference/src/tasks/nlp/conversational.ts

This file was deleted.

19 changes: 0 additions & 19 deletions packages/inference/test/HfInference.spec.ts
@@ -333,25 +333,6 @@ describe.concurrent(
 			])
 		);
 	});
-	it("conversational", async () => {
-		expect(
-			await hf.conversational({
-				model: "microsoft/DialoGPT-large",
-				inputs: {
-					past_user_inputs: ["Which movie is the best ?"],
-					generated_responses: ["It is Die Hard for sure."],
-					text: "Can you explain why ?",
-				},
-			})
-		).toMatchObject({
-			generated_text: "It's the best movie ever.",
-			conversation: {
-				past_user_inputs: ["Which movie is the best ?", "Can you explain why ?"],
-				generated_responses: ["It is Die Hard for sure.", "It's the best movie ever."],
-			},
-			warnings: ["Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation."],
-		});
-	});
 	it("SentenceSimilarity", async () => {
 		expect(
 			await hf.sentenceSimilarity({

4 changes: 2 additions & 2 deletions packages/tasks/src/default-widget-inputs.ts
@@ -1,9 +1,9 @@
 import type { WidgetExample } from "./widget-example";
-import type { PipelineType } from "./pipelines";
+import type { WidgetType } from "./pipelines";
 
 type LanguageCode = string;
 
-type PerLanguageMapping = Map<PipelineType, string[] | WidgetExample[]>;
+type PerLanguageMapping = Map<WidgetType, string[] | WidgetExample[]>;
 
 /// NOTE TO CONTRIBUTORS:
 ///

3 changes: 3 additions & 0 deletions packages/tasks/src/index.ts
@@ -5,6 +5,7 @@ export * from "./tasks";
 export {
 	PIPELINE_DATA,
 	PIPELINE_TYPES,
+	type WidgetType,
 	type PipelineType,
 	type PipelineData,
 	type Modality,
@@ -16,6 +17,7 @@ export {
 export { ALL_DISPLAY_MODEL_LIBRARY_KEYS, ALL_MODEL_LIBRARY_KEYS, MODEL_LIBRARIES_UI_ELEMENTS } from "./model-libraries";
 export type { LibraryUiElement, ModelLibraryKey } from "./model-libraries";
 export type { ModelData, TransformersInfo } from "./model-data";
+export type { SpecialTokensMap, TokenizerConfig } from "./tokenizer-data";
 export type {
 	WidgetExample,
 	WidgetExampleAttribute,
@@ -37,6 +39,7 @@ export type {
 	WidgetExampleOutputText,
 } from "./widget-example";
 export { InferenceDisplayability } from "./model-data";
+export { SPECIAL_TOKENS_ATTRIBUTES } from "./tokenizer-data";
 
 import * as snippets from "./snippets";
 export { snippets };

2 changes: 1 addition & 1 deletion packages/tasks/src/library-to-tasks.ts
@@ -27,7 +27,7 @@ export const LIBRARY_TASK_MAPPING_EXCLUDING_TRANSFORMERS: Partial<Record<ModelLi
 	keras: ["image-classification"],
 	nemo: ["automatic-speech-recognition"],
 	open_clip: ["zero-shot-classification", "zero-shot-image-classification"],
-	paddlenlp: ["conversational", "fill-mask", "summarization", "zero-shot-classification"],
+	paddlenlp: ["fill-mask", "summarization", "zero-shot-classification"],
 	peft: ["text-generation"],
 	"pyannote-audio": ["automatic-speech-recognition"],
 	"sentence-transformers": ["feature-extraction", "sentence-similarity"],

2 changes: 2 additions & 0 deletions packages/tasks/src/model-data.ts
@@ -1,5 +1,6 @@
 import type { PipelineType } from "./pipelines";
 import type { WidgetExample } from "./widget-example";
+import type { TokenizerConfig } from "./tokenizer-data";
 
 export enum InferenceDisplayability {
 	/**
@@ -53,6 +54,7 @@ export interface ModelData {
 			base_model_name?: string;
 			task_type?: string;
 		};
+		tokenizer?: TokenizerConfig;
 	};
 	/**
 	 * all the model tags

21 changes: 10 additions & 11 deletions packages/tasks/src/pipelines.ts
@@ -225,17 +225,6 @@ export const PIPELINE_DATA = {
 		modality: "nlp",
 		color: "indigo",
 	},
-	conversational: {
-		name: "Conversational",
-		subtasks: [
-			{
-				type: "dialogue-generation",
-				name: "Dialogue Generation",
-			},
-		],
-		modality: "nlp",
-		color: "green",
-	},
 	"feature-extraction": {
 		name: "Feature Extraction",
 		modality: "nlp",
@@ -248,6 +237,14 @@
 				type: "dialogue-modeling",
 				name: "Dialogue Modeling",
 			},
+			{
+				type: "dialogue-generation",
+				name: "Dialogue Generation",
+			},
+			{
+				type: "conversational",
+				name: "Conversational",
+			},
 			{
 				type: "language-modeling",
 				name: "Language Modeling",
@@ -667,6 +664,8 @@
 
 export type PipelineType = keyof typeof PIPELINE_DATA;
 
+export type WidgetType = PipelineType | "conversational";
+
 export const PIPELINE_TYPES = Object.keys(PIPELINE_DATA) as PipelineType[];
 
 export const SUBTASK_TYPES = Object.values(PIPELINE_DATA)

1 change: 0 additions & 1 deletion packages/tasks/src/snippets/curl.ts
@@ -34,7 +34,6 @@ export const curlSnippets: Partial<Record<PipelineType, (model: ModelData, acces
 	"zero-shot-classification": snippetZeroShotClassification,
 	translation: snippetBasic,
 	summarization: snippetBasic,
-	conversational: snippetBasic,
 	"feature-extraction": snippetBasic,
 	"text-generation": snippetBasic,
 	"text2text-generation": snippetBasic,

8 changes: 0 additions & 8 deletions packages/tasks/src/snippets/inputs.ts
@@ -9,13 +9,6 @@ const inputsTranslation = () => `"Меня зовут Вольфганг и я
 const inputsSummarization = () =>
 	`"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."`;
 
-const inputsConversational = () =>
-	`{
-		"past_user_inputs": ["Which movie is the best ?"],
-		"generated_responses": ["It is Die Hard for sure."],
-		"text": "Can you explain why ?"
-	}`;
-
 const inputsTableQuestionAnswering = () =>
 	`{
 		"query": "How many stars does the transformers repository have?",
@@ -96,7 +89,6 @@ const modelInputSnippets: {
 	"audio-to-audio": inputsAudioToAudio,
 	"audio-classification": inputsAudioClassification,
 	"automatic-speech-recognition": inputsAutomaticSpeechRecognition,
-	conversational: inputsConversational,
 	"document-question-answering": inputsVisualQuestionAnswering,
 	"feature-extraction": inputsFeatureExtraction,
 	"fill-mask": inputsFillMask,

1 change: 0 additions & 1 deletion packages/tasks/src/snippets/js.ts
@@ -121,7 +121,6 @@ export const jsSnippets: Partial<Record<PipelineType, (model: ModelData, accessT
 	"zero-shot-classification": snippetZeroShotClassification,
 	translation: snippetBasic,
 	summarization: snippetBasic,
-	conversational: snippetBasic,
 	"feature-extraction": snippetBasic,
 	"text-generation": snippetBasic,
 	"text2text-generation": snippetBasic,

1 change: 0 additions & 1 deletion packages/tasks/src/snippets/python.ts
@@ -116,7 +116,6 @@ export const pythonSnippets: Partial<Record<PipelineType, (model: ModelData) =>
 	"zero-shot-classification": snippetZeroShotClassification,
 	translation: snippetBasic,
 	summarization: snippetBasic,
-	conversational: snippetBasic,
 	"feature-extraction": snippetBasic,
 	"text-generation": snippetBasic,
 	"text2text-generation": snippetBasic,

50 changes: 0 additions & 50 deletions packages/tasks/src/tasks/conversational/about.md

This file was deleted.

66 changes: 0 additions & 66 deletions packages/tasks/src/tasks/conversational/data.ts

This file was deleted.
