ollama backend does not support api keys #106

Open
Krakonos opened this issue Sep 18, 2024 · 1 comment
@Krakonos
Hi!

I'm trying to set up llm-ls via the llm.nvim plugin and I'm hitting weird serde errors. I set the following config:

{
  backend = "openai",
  url = "http://192.168.0.61:8080/api",
  api_token = "sk-xxxx", -- replace after publishing outside local network

  lsp = {
    cmd_env = { LLM_LOG_LEVEL = "DEBUG" },
  },
}

The weird thing is, completion seems to work just fine. But after each completion I get a serde error, per the log below. Any idea what I'm doing wrong? I find it hard to find docs on using Open WebUI as a backend for llm-ls, and Open WebUI's own API docs are sparse too (I believe the API is supposed to be OpenAI-compatible, hence the backend setting above).

Relevant logs (API key stripped):

{"timestamp":"2024-09-18T20:44:02.195454Z","level":"INFO","message":"file:///home/krakonos/test.cpp changed","target":"llm_ls","line_number":682}
{"timestamp":"2024-09-18T20:44:02.195645Z","level":"DEBUG","message":"client asked to cancel request 4, but no such pending request exists, ignoring","target":"tower_lsp::service::pending","line_number":64}
{"timestamp":"2024-09-18T20:44:02.346897Z","level":"INFO","message":"received completion request","document_url":"file:///home/krakonos/test.cpp","cursor_line":"19","cursor_character":"49","language_id":"cpp","model":"deepseek-coder-v2:16b-lite-base-q4_0","backend":"OpenAi { url: \"http://192.168.0.61:8080/api\" }","ide":"neovim","request_body":"{\"temperature\":0.2,\"top_p\":0.95}","disable_url_path_completion":false,"target":"llm_ls","line_number":514,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347146Z","level":"INFO","message":"completion type: SingleLine","completion_type":"single_line","target":"llm_ls","line_number":537,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347201Z","level":"INFO","message":"built prompt in 0 ms","prompt":"<|fim▁begin|>#include <iostream>\n#include <memory>\n\n\n\nclass A {\n\n};\n\nclass B {\n\n};\n\nint foo() {\n\n}\n\nint main() {\n    std::shared_ptr<A> a = std::make_shared<A>();\n    std::shared_ptr<B> b = std::make_shared<B>();<|fim▁hole|>\n    return 0;\n}\n<|fim▁end|>","build_prompt_ms":"0","target":"llm_ls","line_number":204,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347232Z","level":"INFO","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":254,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347281Z","level":"DEBUG","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","body":"{\"model\": String(\"deepseek-coder-v2:16b-lite-base-q4_0\"), \"prompt\": String(\"<|fim▁begin|>#include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();<|fim▁hole|>\\n    return 0;\\n}\\n<|fim▁end|>\"), \"stream\": Bool(false), \"temperature\": Number(0.2), \"top_p\": Number(0.95)}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":255,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347408Z","level":"DEBUG","message":"starting new connection: http://192.168.0.61:8080/","log.target":"reqwest::connect","log.module_path":"reqwest::connect","log.file":"/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.11.20/src/connect.rs","log.line":429,"target":"reqwest::connect","line_number":429,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347446Z","level":"DEBUG","message":"connecting to 192.168.0.61:8080","target":"hyper::client::connect::http","line_number":537,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.363434Z","level":"DEBUG","message":"connected to 192.168.0.61:8080","target":"hyper::client::connect::http","line_number":540,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.363816Z","level":"DEBUG","message":"flushed 637 bytes","target":"hyper::proto::h1::io","line_number":342}
{"timestamp":"2024-09-18T20:44:02.368285Z","level":"DEBUG","message":"parsed 5 headers","target":"hyper::proto::h1::io","line_number":207}
{"timestamp":"2024-09-18T20:44:02.368420Z","level":"DEBUG","message":"incoming body is content-length (22 bytes)","target":"hyper::proto::h1::conn","line_number":222}
{"timestamp":"2024-09-18T20:44:02.368492Z","level":"DEBUG","message":"incoming body completed","target":"hyper::proto::h1::conn","line_number":298}
{"timestamp":"2024-09-18T20:44:02.368619Z","level":"DEBUG","message":"pooling idle connection for (\"http\", 192.168.0.61:8080)","target":"hyper::client::pool","line_number":376,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.368860Z","level":"ERROR","err_msg":"serde json error: data did not match any variant of untagged enum OpenAIAPIResponse","target":"llm_ls::error","line_number":8,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.380607Z","level":"INFO","message":"file:///home/krakonos/test.cpp changed","target":"llm_ls","line_number":682}
{"timestamp":"2024-09-18T20:44:03.381212Z","level":"DEBUG","message":"client asked to cancel request 5, but no such pending request exists, ignoring","target":"tower_lsp::service::pending","line_number":64}
{"timestamp":"2024-09-18T20:44:03.531812Z","level":"INFO","message":"received completion request","document_url":"file:///home/krakonos/test.cpp","cursor_line":"20","cursor_character":"4","language_id":"cpp","model":"deepseek-coder-v2:16b-lite-base-q4_0","backend":"OpenAi { url: \"http://192.168.0.61:8080/api\" }","ide":"neovim","request_body":"{\"temperature\":0.2,\"top_p\":0.95}","disable_url_path_completion":false,"target":"llm_ls","line_number":514,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.531954Z","level":"INFO","message":"completion type: SingleLine","completion_type":"single_line","target":"llm_ls","line_number":537,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532024Z","level":"INFO","message":"built prompt in 0 ms","prompt":"<|fim▁begin|>#include <iostream>\n#include <memory>\n\n\n\nclass A {\n\n};\n\nclass B {\n\n};\n\nint foo() {\n\n}\n\nint main() {\n    std::shared_ptr<A> a = std::make_shared<A>();\n    std::shared_ptr<B> b = std::make_shared<B>();\n    <|fim▁hole|>\n    return 0;\n}\n<|fim▁end|>","build_prompt_ms":"0","target":"llm_ls","line_number":204,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532053Z","level":"INFO","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":254,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532073Z","level":"DEBUG","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","body":"{\"model\": String(\"deepseek-coder-v2:16b-lite-base-q4_0\"), \"prompt\": String(\"<|fim▁begin|>#include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();\\n    <|fim▁hole|>\\n    return 0;\\n}\\n<|fim▁end|>\"), \"stream\": Bool(false), \"temperature\": Number(0.2), \"top_p\": Number(0.95)}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":255,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532185Z","level":"DEBUG","message":"reuse idle connection for (\"http\", 192.168.0.61:8080)","target":"hyper::client::pool","line_number":250,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532446Z","level":"DEBUG","message":"flushed 643 bytes","target":"hyper::proto::h1::io","line_number":342}
{"timestamp":"2024-09-18T20:44:03.550477Z","level":"DEBUG","message":"parsed 5 headers","target":"hyper::proto::h1::io","line_number":207}
{"timestamp":"2024-09-18T20:44:03.550580Z","level":"DEBUG","message":"incoming body is content-length (22 bytes)","target":"hyper::proto::h1::conn","line_number":222}
{"timestamp":"2024-09-18T20:44:03.550620Z","level":"DEBUG","message":"incoming body completed","target":"hyper::proto::h1::conn","line_number":298}
{"timestamp":"2024-09-18T20:44:03.550722Z","level":"DEBUG","message":"pooling idle connection for (\"http\", 192.168.0.61:8080)","target":"hyper::client::pool","line_number":376,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.550849Z","level":"ERROR","err_msg":"serde json error: data did not match any variant of untagged enum OpenAIAPIResponse","target":"llm_ls::error","line_number":8,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}

... If you've got suggestions for a better alternative to Open WebUI, I might consider it. But I want to open the server up to a few friends and don't want the hassle of manually managing API keys for a raw ollama instance (which works fine, btw).

Any help would be appreciated!

@Krakonos
Author

OK, so I've made a bit of progress. I found out Open WebUI is not actually OpenAI-compatible as I'd thought; not sure where I got that impression. I can use its /ollama endpoint to authenticate with a token and pass my query through to ollama. Right now, it boils down to:

url = "http://192.168.0.61:8080/ollama",        --doesn't work                                                                                                                           
url = "http://192.168.0.61:11434",   -- works

But hitting both endpoints from a script returns exactly the same response:

#!/usr/bin/bash

curl 'http://192.168.0.61:8080/ollama/api/generate' \
  -H 'accept: */*' \
  -H 'authorization: Bearer sk-182affc34e0e43218580222cdd88ca06' \
  -H 'content-type: application/json' \
  --data-raw "{\"model\": \"codellama:7b-code\", \"options\": {\"temperature\":0.2, \"top_p\": 0.95}, \"prompt\":\"<PRE> #include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();\\n    std::cout << \\\"Hello, World!\\\" << std::endl;\\n    std::cout << a << std::endl;\\n    std::cout << b << std::endl;\\n     <SUF>\\n\\n\\n\\n\\n\\n\\n    return 0;\\n\\n}\\n <MID>\", \"stream\": false}"


echo ""
echo "===="
echo ""

curl 'http://192.168.0.61:11434/api/generate' \
  -H 'accept: */*' \
  -H 'authorization: Bearer sk-182affc34e0e43218580222cdd88ca06' \
  -H 'content-type: application/json' \
  --data-raw "{\"model\": \"codellama:7b-code\", \"options\": {\"temperature\":0.2, \"top_p\": 0.95}, \"prompt\":\"<PRE> #include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();\\n    std::cout << \\\"Hello, World!\\\" << std::endl;\\n    std::cout << a << std::endl;\\n    std::cout << b << std::endl;\\n     <SUF>\\n\\n\\n\\n\\n\\n\\n    return 0;\\n\\n}\\n <MID>\", \"stream\": false}"
  #--data-raw '{"stream":false,"model":"codellama:7b-code","messages":[{"role":"user","content":"who are you?"}]}'

(I copied the API endpoint URL and query body from the llm-ls log.) For some reason, when going through the webui, llm-ls gets the unexpected-response error. Looking at the source, it appears the ollama backend does not pass down the API token, even if one is provided. This limits the use of Open WebUI, which does a great job of securing the ollama instance and lets me use my self-hosted LLM server when traveling.
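
To illustrate, here's a hedged sketch of the kind of change I'd expect (hypothetical function and signature, not llm-ls's actual backend code): the ollama backend should forward the configured token as a bearer header, the same way the logs above show the openai backend already doing.

```rust
// Hypothetical sketch, not llm-ls's real code: build the ollama request
// and attach the Authorization header when a token is configured,
// instead of silently dropping it.
use reqwest::Client;

async fn ollama_generate(
    client: &Client,
    url: &str,               // e.g. http://192.168.0.61:8080/ollama/api/generate
    api_token: Option<&str>, // currently ignored by the ollama backend
    body: &serde_json::Value,
) -> Result<String, reqwest::Error> {
    let mut req = client.post(url).json(body);
    if let Some(token) = api_token {
        // The missing piece: send "authorization: Bearer <token>"
        req = req.bearer_auth(token);
    }
    req.send().await?.text().await
}
```

With something like that in place, the /ollama proxy URL above should behave the same as hitting ollama directly.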

@Krakonos changed the title from "serdes errors against open webui" to "ollama backend does not support api keys" on Sep 26, 2024