From 603febbb210ec4c47d9c0b37eae01f063dbe0da8 Mon Sep 17 00:00:00 2001
From: Alan Guo
Date: Fri, 26 Jan 2024 10:50:40 -0800
Subject: [PATCH] Add more details about prompt format in the docs (#1292)

Trying to make it easier for users to self-service add custom models to use with ray-llm.

Cherry-pick of ray-project/ray-llm#126

---------

Signed-off-by: Alan Guo
---
 models/README.md | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/models/README.md b/models/README.md
index 11ad02d8..5999c3f6 100644
--- a/models/README.md
+++ b/models/README.md
@@ -74,6 +74,40 @@ A prompt format is used to convert a chat completions API input into a prompt to
 The string template should include the `{instruction}` keyword, which will be replaced with message content from the ChatCompletions API.

+For example, if a user sends the following message for llama2-7b-chat-hf ([prompt format](continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml#L27-L33)):
+```json
+{
+  "messages": [
+    {
+      "role": "system",
+      "content": "You are a helpful assistant."
+    },
+    {
+      "role": "user",
+      "content": "What is the capital of France?"
+    },
+    {
+      "role": "assistant",
+      "content": "The capital of France is Paris."
+    },
+    {
+      "role": "user",
+      "content": "What about Germany?"
+    }
+  ]
+}
+```
+The generated prompt that is sent to the LLM engine will be:
+```
+[INST] <<SYS>>
+You are a helpful assistant.
+<</SYS>>
+
+What is the capital of France? [/INST] The capital of France is Paris. [INST] What about Germany? [/INST]
+```
+
+##### Schema
+
 The following keys are supported:
 * `system` - The system message. This is a message inserted at the beginning of the prompt to provide instructions for the LLM.
 * `assistant` - The assistant message. These messages are from the past turns of the assistant as defined in the list of messages provided in the ChatCompletions API.
@@ -87,7 +121,7 @@ In addition, there some configurations to control the prompt formatting behavior
 * `strip_whitespace` - Whether to automatically strip whitespace from left and right of the content for the messages provided in the ChatCompletions API.

-You can see an example in the [Adding a new model](#adding-a-new-model) section below.
+You can see config in the [Adding a new model](#adding-a-new-model) section below.

 ### Scaling config
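
The message-to-prompt conversion described in the README hunk above can be sketched in Python. This is a minimal illustration, not ray-llm's implementation: `render_prompt`, `LLAMA2_FORMAT`, and the template strings are hypothetical, chosen so that the output reproduces the generated prompt shown in the README example. It assumes the standard Llama 2 chat markers `<<SYS>>` and `<</SYS>>`, and that the system message is folded into the first user turn through a `{system}` placeholder. The referenced YAML config may define its templates differently.

```python
# Hypothetical templates (assumptions, not copied from the referenced YAML).
LLAMA2_FORMAT = {
    "system": "<<SYS>>\n{instruction}\n<</SYS>>\n\n",
    "user": "[INST] {system}{instruction} [/INST]",
    "assistant": " {instruction} ",
}

MESSAGES = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What about Germany?"},
]


def render_prompt(messages, fmt, strip_whitespace=True):
    """Render ChatCompletions-style messages into one prompt string."""
    parts = []
    pending_system = ""  # rendered system text waiting for the next user turn
    for message in messages:
        role = message["role"]
        content = message["content"]
        if strip_whitespace:
            content = content.strip()
        if role == "system":
            pending_system = fmt["system"].format(instruction=content)
        elif role == "user":
            parts.append(
                fmt["user"].format(system=pending_system, instruction=content)
            )
            pending_system = ""  # the system text is emitted only once
        elif role == "assistant":
            parts.append(fmt["assistant"].format(instruction=content))
        else:
            raise ValueError(f"unsupported role: {role!r}")
    return "".join(parts)


if __name__ == "__main__":
    print(render_prompt(MESSAGES, LLAMA2_FORMAT))
```

Each template only needs the `{instruction}` keyword; the `{system}` placeholder in the user template is an assumption of this sketch for models that expect the system message inside the first user turn.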