This issue tracks various action items we would like to complete for the function calling and embeddings features.
Function calling (beta)
We are calling it beta because function calling may need multiple iterations: it may be hard to conform different open-source models' function calling formats to the OpenAI API. We will try to make each iteration non-breaking.
The official OpenAI API also has new fields (such as `tools` and `tool_calls`), which we should support as well if possible.
This may limit flexibility for the user. For instance, while Llama 3.1 offers roughly three formats for function calling, using `tools` will force us to use only one of them.
We want to allow the model to make tool calls or respond in natural language at its own discretion.
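For reference, a request exposing tools in the OpenAI-style `tools` format might look like the sketch below. The `get_weather` function and its schema are hypothetical examples; only the field shapes follow the official OpenAI API:

```typescript
// Hypothetical tool declaration in the OpenAI `tools` format.
// Given these, the model may answer in natural language or emit a
// `tool_calls` entry referencing one of the declared functions.
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather", // hypothetical example function
      description: "Get the current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
        },
        required: ["city"],
      },
    },
  },
];

// Small helper: list the function names a request exposes to the model.
function toolNames(ts: typeof tools): string[] {
  return ts.map((t) => t.function.name);
}
```

In OpenAI-style usage, such an array would be passed alongside `messages` in a chat completion request; the single fixed schema is what constrains models like Llama 3.1 to one of their several native formats.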
F3: Use `BNFGrammar` to guarantee tool call generation correctness.
This requires the model to use a special token to signify the beginning of a function call (`<tool_call>` in the case of Hermes 2). Upon such a token being generated, we instantiate a `BNFGrammar` instance to constrain decoding. When the constrained generation ends, we force the model to emit `</tool_call>`. Before and after this tool call, the model can generate either natural language or other tool calls.
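The flow above can be sketched as a small pass over the model output that separates natural-language text from tool-call spans. This is a simplified illustration: in the real engine, the span between the delimiters would be produced under `BNFGrammar` constraints rather than merely collected after the fact:

```typescript
// Split a model output into natural-language text and tool-call
// payloads, using Hermes-2-style <tool_call>...</tool_call> delimiters.
type Segment =
  | { kind: "text"; content: string }
  | { kind: "tool_call"; content: string };

function splitToolCalls(output: string): Segment[] {
  const segments: Segment[] = [];
  const re = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let last = 0;
  for (const m of output.matchAll(re)) {
    if (m.index! > last) {
      segments.push({ kind: "text", content: output.slice(last, m.index) });
    }
    // In the engine, this span would be generated under grammar
    // constraints, guaranteeing a well-formed tool-call payload.
    segments.push({ kind: "tool_call", content: m[1].trim() });
    last = m.index! + m[0].length;
  }
  if (last < output.length) {
    segments.push({ kind: "text", content: output.slice(last) });
  }
  return segments;
}
```

Note how text, tool calls, and further text can interleave freely, matching the requirement that the model may speak or call tools at its own discretion.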
Embedding, Multi-model Engine, Concurrency
For applications like RAG, two models are needed: one embedding model and one LLM. We'd like to hold all models in a single `MLCEngine` instead of instantiating multiple engines. This makes `MLCEngine` behave like an endpoint and offers the possibility of intra-engine optimizations in the future.
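As a rough sketch of why a RAG application wants both models behind one endpoint: the embedding model scores documents against the query, and the LLM then answers over the top hits. The retrieval step alone, with pre-computed embedding vectors (the vectors below are toy values, not real model output), looks like:

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the indices of the k document embeddings closest to the query.
// In the envisioned single-engine setup, both the query and document
// vectors would come from the engine's embedding model, and the top
// documents would be stuffed into the LLM prompt as context.
function topK(query: number[], docs: number[][], k: number): number[] {
  return docs
    .map((d, i) => ({ i, score: cosine(query, d) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((e) => e.i);
}
```

Holding both models in one `MLCEngine` means this embed-retrieve-generate loop needs no coordination between separate engine instances.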
Support manual function calling through `system`, `user`, `assistant`, and `tool` messages, without using the `tools` and `tool_calls` fields of the OpenAI API.
Support embeddings via `engine.embeddings.create()` (completed via [Embeddings][OpenAI] Support embeddings via engine.embeddings.create() #538, supported in npm 0.2.58).
CharlieFRuan changed the title from "[Tracking][WebLLM] Function calling and Embeddings" to "[Tracking][WebLLM] Function calling (beta) and Embeddings" on Aug 4, 2024.