✨ [Tasks] JSON Schema spec for Inference types + TS type generation #449
Conversation
Ping @coyotte508 for visibility
looks 😍😍😍
Nice!
```ts
/**
 * Inputs for Audio Classification inference
 */
export interface AudioClassificationInput {
```
Audio data is usually passed through as raw data, see
https://huggingface.co/docs/api-inference/detailed_parameters#audio-classification-task
Re-flagging this comment in case it was lost
Same for images. The JSON schema cannot specify this, since sending as raw data and sending as JSON are two different things. So for now it's kind of a blind spot. If we provide an OpenAPI schema for our APIs in the future, then it will be possible to document it. OpenAPI integrates easily with JSON Schema, so having these schemas is already a good first step.
(The difference between a JSON schema as in this PR and an OpenAPI description is that this PR describes objects with their attributes, while the OpenAPI description will include things like server routes, accepted headers, etc.)
(^ only my understanding of the specs, anyone feel free to correct me 😄)
Yes - sorry for the delay in answering.
Leaving the image/audio data as unknown was intentional, to give more flexibility to the libraries. Image & audio data can be passed in several different forms (raw binary data, path to a local or remote file, base64-encoded data...) and I did not want to constrain downstream users of those types into one single representation.

> (The difference between a JSON schema as in this PR and an OpenAPI description is that this PR describes objects with their attributes, while the OpenAPI description will include things like server routes, accepted headers, etc.)

Yes, that is correct; there will be some additional work necessary to generate an OpenAPI spec for an inference API (including actually specifying how we expect the binary data to be represented).
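To illustrate the flexibility argument, here is a hypothetical sketch (the type and helper names below are mine, not part of the generated types) of how a downstream library might narrow the `unknown` payload into the representations it actually supports:

```typescript
// Hypothetical helper: narrow an `unknown` audio/image payload into the
// representations this particular library supports. The generated types
// deliberately leave the payload as `unknown` so each library can make
// its own choice here.
type SupportedPayload = Uint8Array | string; // raw bytes, or a path/URL/base64 string

function narrowPayload(inputs: unknown): SupportedPayload {
  if (inputs instanceof Uint8Array) return inputs; // raw binary data
  if (typeof inputs === "string") return inputs; // file path, remote URL, or base64
  throw new TypeError("Unsupported payload representation");
}
```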
```diff
@@ -216,6 +216,7 @@ export interface TaskData {
 	datasets: ExampleRepo[];
 	demo: TaskDemo;
 	id: PipelineType;
+	canonicalId?: PipelineType;
```
Added this property to express one task being a "subtask" of another (e.g. summarization being a subtask of text2text-generation).
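As a sketch of how a consumer could use this field (only `id`/`canonicalId` come from the diff above; the helper and sample values are hypothetical):

```typescript
// Hypothetical helper: resolve the canonical task for a (sub)task.
// Only the `id` / `canonicalId` fields come from the PR diff; the
// helper and sample data are illustrative.
interface TaskDataLike {
  id: string;
  canonicalId?: string;
}

function resolveCanonicalTask(task: TaskDataLike): string {
  // A subtask points at its canonical parent; otherwise the task is canonical itself.
  return task.canonicalId ?? task.id;
}
```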
(force-pushed from 1b9c6e2 to 6b10c4d)
I added a "post-process" script using the TypeScript API to generate the appropriate array type while glideapps/quicktype#2481 is being handled.
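A minimal sketch of what such a post-processing step could do (this is not the actual script; quicktype's real output and the script's logic will differ):

```typescript
// Hypothetical post-processing sketch: append an exported array alias
// for a generated element interface, as a stand-in for the array types
// quicktype cannot yet emit directly (see glideapps/quicktype#2481).
function appendArrayType(generated: string, elementName: string, arrayName: string): string {
  return `${generated}\nexport type ${arrayName} = ${elementName}[];\n`;
}
```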
```ts
/**
 * The function to apply to the model outputs in order to retrieve the scores.
 */
functionToApply?: AudioClassificationOutputTransform;
```
this is not supported by any library afaik
packages/tasks/src/tasks/automatic-speech-recognition/inference.ts (outdated thread, resolved)
```json
"items": {
  "description": "The output depth labels"
}
```
Iirc, the output is a dictionary with two entries: one is depth, which is a depth-estimation image, and the other is predicted_depth, which is the tensor. See https://huggingface.co/docs/transformers/main/tasks/monocular_depth_estimation
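Sketched as a type (field names follow the linked transformers docs; the value types here are illustrative placeholders, not the generated spec):

```typescript
// Illustrative shape of the transformers depth-estimation output
// described above. Field names follow the linked docs; the value
// types are placeholders.
interface DepthEstimationResult {
  depth: unknown; // the depth map rendered as an image (a PIL.Image in Python)
  predicted_depth: number[][]; // the raw depth tensor
}

const example: DepthEstimationResult = {
  depth: null,
  predicted_depth: [
    [0.5, 0.7],
    [0.6, 0.8],
  ],
};
```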
```ts
/**
 * The answer to the question.
 */
answer: string;
end: number;
/**
 * The probability associated to the answer.
 */
score: number;
start: number;
/**
 * The index of each word/box pair that is in the answer
 */
```
I guess the alphabetical order is a bit weird with the docstrings: we have "The answer to the question.", then answer, then end, and much later start.

```ts
/**
 * The answer to the question.
 */
answer: string;
end: number;
/**
 * The probability associated to the answer.
 */
score: number;
start: number;
/**
```
```ts
parameters?: { [key: string]: unknown };
[property: string]: unknown;
```
None of this would work out of the box in the sentence_transformers API, but I guess we can add it later on if needed.
```json
"$id": "/inference/schemas/feature-extraction/output.json",
"$schema": "http://json-schema.org/draft-06/schema#",
"description": "The embedding for the input text, as a nested list (tensor) of floats",
"type": "array",
```
Note: iirc, it's an array in sentence-transformers (one embedding per input), a list within a list in transformers (one embedding per token), and a list within a list within a list in the Inference API (for batching).
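The three nesting levels can be sketched as follows (the type names are mine, not from the schema):

```typescript
// Illustrative types for the three nesting levels described above.
type SentenceEmbedding = number[]; // sentence-transformers: one vector per input
type TokenEmbeddings = number[][]; // transformers: one vector per token
type BatchedTokenEmbeddings = number[][][]; // Inference API: batched per-token vectors

const tokens: TokenEmbeddings = [
  [0.1, 0.2],
  [0.3, 0.4],
];
const batch: BatchedTokenEmbeddings = [tokens]; // a batch of one input
```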
```diff
@@ -0,0 +1,12 @@
+{
```
Note that this one is not exported
(force-pushed from 399f484 to 49a8151)
```ts
/**
 * Parametrization of the text generation process
 */
generate?: GenerationParameters;
```
We should also support forward params so we can pass things such as speaker_embeddings in SpeechT5: https://huggingface.co/microsoft/speecht5_tts
Let's do it as a follow-up
```ts
/**
 * I can be the papa you'd be the mama
 */
temperature?: number;
```
Should we add the others?
I have added a bunch in 826181a - there are still a lot of other parameters to add
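For reference, a hypothetical sketch of what a few of the usual generation parameters look like (the actual GenerationParameters interface in the PR may name or type them differently):

```typescript
// Hypothetical subset of common text-generation parameters; the real
// GenerationParameters interface generated from the spec may differ.
interface GenerationParametersSketch {
  temperature?: number; // sampling temperature; higher values increase randomness
  topK?: number; // keep only the k most likely next tokens
  topP?: number; // nucleus sampling: smallest token set with cumulative prob >= topP
  maxNewTokens?: number; // cap on the number of generated tokens
}

const params: GenerationParametersSketch = { temperature: 0.7, topP: 0.9 };
```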
```diff
@@ -0,0 +1,53 @@
+/**
+ * Inference code generated from the JSON schema spec in ./spec
```
Should we use this opportunity to unify text-generation and text2text-generation?
Yes, probably
```ts
/**
 * The strategy used to fuse tokens based on model predictions
 */
aggregationStrategy?: TokenClassificationAggregationStrategy;
```
A bit strange as this is actually a load parameter, not an inference parameter - see https://huggingface.co/docs/transformers/main/en/main_classes/pipelines#transformers.TextToAudioPipeline
Ah, you're right - but shouldn't it be supported by the call method too?
I would expect yes, but maybe it changes how the model is loaded?
whoop whoop 🚀
Follow-up to #449. Review with whitespace off.

TL;DR
- quicktype-core
- quicktype-core as a dev dependency (from our fork of quicktype: https://github.com/huggingface/quicktype/releases/tag/pack-18.0.15)

TODO
- text2text-generation task to serve as a "canonical reference" for summarization & translation
- text-to-audio task to serve as a "canonical reference" for text-to-speech
- any types to unknown
- sentence-similarity / feature-extraction -> Let's do that later?