Ollama: adding support for `num_ctx` parameter based on `max_input_tokens` #806
Conversation
In this case, both `max_input_tokens` and `num_ctx` describe the same limit. Separate (but related) -- the server-side default applies regardless of what the user configures in aichat. I recommend that if `max_input_tokens` is set for an Ollama model, aichat forward it to the server as `num_ctx`. Alternately, we can use a dedicated configuration option for it. Let me know your thoughts and I'll do the bit fiddling. Thanks!
My apologies. Ollama is already aware of the `max_input_tokens` value (see lines 236 to 238 in 11022f8). The parameter just isn't forwarded to the server as `num_ctx`.
Usually, we don't need to pass `num_ctx`. The currently available solution is using patch:

```yaml
patch:
  chat_completions:
    'llama3.1':
      body:
        options:
          num_ctx: 131072
```
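To illustrate what that patch produces on the wire, here is a minimal sketch of the equivalent raw request against Ollama's native `/api/chat` endpoint (assuming a default local install on port 11434; the model name and context size are illustrative, not taken from the thread):

```python
import requests

# Minimal sketch: send a chat request to a local Ollama server and
# override the context window for this request only. Without
# "options.num_ctx", Ollama applies its own default (2048 tokens
# in v0.3.6, per the issue description below).
resp = requests.post(
    "http://localhost:11434/api/chat",  # default Ollama address
    json={
        "model": "llama3.1",             # illustrative model name
        "messages": [{"role": "user", "content": "Hello!"}],
        "options": {"num_ctx": 131072},  # per-request context length
        "stream": False,                 # return a single JSON object
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```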
Ok, cool. I can use `patch` in the meantime, while you're sorting out the details. There are quite a few other Ollama-specific parameters, and I'm happy to help if you decide to expand those capabilities. Cheers!
See the ollama author's comment at ollama/ollama#6504 (comment). Also, we've decided to deprecate the `max_input_tokens` setting, so we decided not to derive `num_ctx` from it. Suggest setting `num_ctx` via the patch configuration instead. Thank you for your contribution.
Oh snap, that's great. Thank you!
Would be nice if you let us have a config file somewhere for server settings, without fiddling with environment variables. Right now the server still overrides it to a 2k context window unless the modelfile is specified with the right parameters (even if you `ollama run MODEL` and then `/set parameter`). Especially painful when using a closed-source program which uses the Ollama localhost but has its own stupid max-token settings, etc.
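One workaround for that 2k server-side default is to bake the parameter into a derived model, so even clients that cannot send per-request options get the larger window. A minimal sketch, assuming the `ollama` CLI is on PATH (the model and tag names are illustrative):

```python
import os
import subprocess
import tempfile

# Modelfile deriving a new model from llama3.1 with a persistent
# num_ctx, so clients that can't pass per-request options still get
# the larger context window. Names and sizes here are illustrative.
MODELFILE = "FROM llama3.1\nPARAMETER num_ctx 8192\n"

with tempfile.NamedTemporaryFile("w", suffix=".Modelfile", delete=False) as f:
    f.write(MODELFILE)
    path = f.name

try:
    # Equivalent to running: ollama create llama3.1-8k -f <Modelfile>
    subprocess.run(["ollama", "create", "llama3.1-8k", "-f", path], check=True)
finally:
    os.unlink(path)
```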
It does what it says on the tin. Ollama uses the `num_ctx` option to specify the maximum context length for a request. If absent, Ollama (v0.3.6) defaults to 2048 tokens, regardless of what an aichat user specifies as `max_input_tokens` in their local configuration.
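For clarity, the request in this issue amounts to the mapping sketched below, with hypothetical names (aichat's real internals may differ): read the model's configured `max_input_tokens` and, when present, forward it to Ollama as `options.num_ctx`.

```python
from typing import Any, Optional

def build_ollama_options(max_input_tokens: Optional[int]) -> dict[str, Any]:
    """Hypothetical sketch of the proposed behavior: map aichat's
    max_input_tokens onto Ollama's num_ctx. When nothing is
    configured, send no options and let Ollama apply its default
    (2048 tokens in v0.3.6)."""
    if max_input_tokens is None:
        return {}
    return {"options": {"num_ctx": max_input_tokens}}

# A model configured with max_input_tokens: 131072 would yield
# {"options": {"num_ctx": 131072}} in the request body.
assert build_ollama_options(131072) == {"options": {"num_ctx": 131072}}
assert build_ollama_options(None) == {}
```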