Description
ignore_eos_token is a commonly used additional parameter that helps standardize LLM benchmarks by forcing every request to generate a consistent output sequence length.
-Will this change the current api? How?
It will add ignore_eos_token as an additional optional field in the request body.
-Who will benefit from this enhancement?
Anyone who is trying to run benchmarks or gain a better understanding of performance.
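
To illustrate the request-body change described above, here is a minimal sketch of what a benchmark request might look like. The `inputs`/`parameters` layout, the `max_new_tokens` field, and the helper name `build_benchmark_payload` are assumptions made for illustration; only `ignore_eos_token` comes from this request.

```python
import json


def build_benchmark_payload(prompt: str, output_len: int) -> dict:
    """Build a request body that asks for a fixed number of new tokens.

    Hypothetical helper: the surrounding schema is assumed, not confirmed.
    """
    return {
        "inputs": prompt,
        "parameters": {
            # Assumed field name for the upper bound on generated tokens.
            "max_new_tokens": output_len,
            # Proposed optional field: keep generating past the EOS token
            # so every request produces the same output length.
            "ignore_eos_token": True,
        },
    }


payload = build_benchmark_payload("Summarize the history of Rome.", 256)
print(json.dumps(payload, indent=2))
```

Because the field is optional, requests that omit it would keep their current behavior, stopping at the first EOS token.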
References
The same feature is already supported in the "Additional LMI Dist Generation parameters" and "Additional vLLM Generation Parameters". "Additional TensorRT-LLM Generation Parameters" also has a 'min_length' flag, which achieves similar behavior.