Adding ignore_eos_token support in Chat Completions API Schema #3387

jiahong-liu · 2024-08-06T00:00:30Z

Description

ignore_eos_token is commonly used additional parameter to help standardize LLM benchmarks by forcing the requests to generate a consistent output seq len.

-Will this change the current api? How?

It will be adding the ignore_eos_token as additional optional field in the request body.

-Who will benefit from this enhancement?

Anyone who is trying to do benchmark or gain a better understanding of the performance

References

https://docs.djl.ai/master/docs/serving/serving/docs/lmi/user_guides/lmi_input_output_schema.html.
The same feature is already supported in the "Additional LMI Dist Generation parameters" and "Additional vLLM Generation Parameters". "Additional TensorRT-LLM Generation Parameters" also has flag of 'min_length', achieving similar behavior.

lanking520 · 2024-08-06T00:18:07Z

@sindhuvahinis

jiahong-liu added the enhancement New feature or request label Aug 6, 2024

lanking520 assigned sindhuvahinis Aug 6, 2024

siddvenk mentioned this issue Aug 6, 2024

add ignore_eos support in chat completions schema deepjavalibrary/djl-serving#2281

Merged

sindhuvahinis assigned siddvenk and unassigned sindhuvahinis Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding ignore_eos_token support in Chat Completions API Schema #3387

Adding ignore_eos_token support in Chat Completions API Schema #3387

jiahong-liu commented Aug 6, 2024 •

edited by frankfliu

Loading

lanking520 commented Aug 6, 2024

Adding ignore_eos_token support in Chat Completions API Schema #3387

Adding ignore_eos_token support in Chat Completions API Schema #3387

Comments

jiahong-liu commented Aug 6, 2024 • edited by frankfliu Loading

Description

References

lanking520 commented Aug 6, 2024

jiahong-liu commented Aug 6, 2024 •

edited by frankfliu

Loading