ollama backend does not support api keys #106

Open
Krakonos opened this issue Sep 18, 2024 · 1 comment
@Krakonos
Hi!

I'm trying to set up llm-ls via the llm.nvim plugin and I'm hitting weird serde errors. I set the following config:

{
  backend = "openai",
  url = "http://192.168.0.61:8080/api",
  api_token = "sk-xxxx", -- replace after publishing outside local network

  lsp = {
    cmd_env = { LLM_LOG_LEVEL = "DEBUG" },
  },
}

The weird thing is, completion seems to work just fine. But after each completion I get a serde error, per the log below. Any idea what I'm doing wrong? I find it hard to find docs on using Open WebUI as a backend for llm-ls, and Open WebUI's own API docs are sparse too (I believe the API is supposed to be OpenAI-compatible, hence the backend setting above).

Relevant logs (API key stripped):

{"timestamp":"2024-09-18T20:44:02.195454Z","level":"INFO","message":"file:///home/krakonos/test.cpp changed","target":"llm_ls","line_number":682}
{"timestamp":"2024-09-18T20:44:02.195645Z","level":"DEBUG","message":"client asked to cancel request 4, but no such pending request exists, ignoring","target":"tower_lsp::service::pending","line_number":64}
{"timestamp":"2024-09-18T20:44:02.346897Z","level":"INFO","message":"received completion request","document_url":"file:///home/krakonos/test.cpp","cursor_line":"19","cursor_character":"49","language_id":"cpp","model":"deepseek-coder-v2:16b-lite-base-q4_0","backend":"OpenAi { url: \"http://192.168.0.61:8080/api\" }","ide":"neovim","request_body":"{\"temperature\":0.2,\"top_p\":0.95}","disable_url_path_completion":false,"target":"llm_ls","line_number":514,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347146Z","level":"INFO","message":"completion type: SingleLine","completion_type":"single_line","target":"llm_ls","line_number":537,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347201Z","level":"INFO","message":"built prompt in 0 ms","prompt":"<|fim▁begin|>#include <iostream>\n#include <memory>\n\n\n\nclass A {\n\n};\n\nclass B {\n\n};\n\nint foo() {\n\n}\n\nint main() {\n    std::shared_ptr<A> a = std::make_shared<A>();\n    std::shared_ptr<B> b = std::make_shared<B>();<|fim▁hole|>\n    return 0;\n}\n<|fim▁end|>","build_prompt_ms":"0","target":"llm_ls","line_number":204,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347232Z","level":"INFO","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":254,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347281Z","level":"DEBUG","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","body":"{\"model\": String(\"deepseek-coder-v2:16b-lite-base-q4_0\"), \"prompt\": String(\"<|fim▁begin|>#include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();<|fim▁hole|>\\n    return 0;\\n}\\n<|fim▁end|>\"), \"stream\": Bool(false), \"temperature\": Number(0.2), \"top_p\": Number(0.95)}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":255,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347408Z","level":"DEBUG","message":"starting new connection: http://192.168.0.61:8080/","log.target":"reqwest::connect","log.module_path":"reqwest::connect","log.file":"/home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.11.20/src/connect.rs","log.line":429,"target":"reqwest::connect","line_number":429,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.347446Z","level":"DEBUG","message":"connecting to 192.168.0.61:8080","target":"hyper::client::connect::http","line_number":537,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.363434Z","level":"DEBUG","message":"connected to 192.168.0.61:8080","target":"hyper::client::connect::http","line_number":540,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.363816Z","level":"DEBUG","message":"flushed 637 bytes","target":"hyper::proto::h1::io","line_number":342}
{"timestamp":"2024-09-18T20:44:02.368285Z","level":"DEBUG","message":"parsed 5 headers","target":"hyper::proto::h1::io","line_number":207}
{"timestamp":"2024-09-18T20:44:02.368420Z","level":"DEBUG","message":"incoming body is content-length (22 bytes)","target":"hyper::proto::h1::conn","line_number":222}
{"timestamp":"2024-09-18T20:44:02.368492Z","level":"DEBUG","message":"incoming body completed","target":"hyper::proto::h1::conn","line_number":298}
{"timestamp":"2024-09-18T20:44:02.368619Z","level":"DEBUG","message":"pooling idle connection for (\"http\", 192.168.0.61:8080)","target":"hyper::client::pool","line_number":376,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:02.368860Z","level":"ERROR","err_msg":"serde json error: data did not match any variant of untagged enum OpenAIAPIResponse","target":"llm_ls::error","line_number":8,"spans":[{"request_id":"3d2389a7-a53a-4e09-a658-4095254c72d4","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.380607Z","level":"INFO","message":"file:///home/krakonos/test.cpp changed","target":"llm_ls","line_number":682}
{"timestamp":"2024-09-18T20:44:03.381212Z","level":"DEBUG","message":"client asked to cancel request 5, but no such pending request exists, ignoring","target":"tower_lsp::service::pending","line_number":64}
{"timestamp":"2024-09-18T20:44:03.531812Z","level":"INFO","message":"received completion request","document_url":"file:///home/krakonos/test.cpp","cursor_line":"20","cursor_character":"4","language_id":"cpp","model":"deepseek-coder-v2:16b-lite-base-q4_0","backend":"OpenAi { url: \"http://192.168.0.61:8080/api\" }","ide":"neovim","request_body":"{\"temperature\":0.2,\"top_p\":0.95}","disable_url_path_completion":false,"target":"llm_ls","line_number":514,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.531954Z","level":"INFO","message":"completion type: SingleLine","completion_type":"single_line","target":"llm_ls","line_number":537,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532024Z","level":"INFO","message":"built prompt in 0 ms","prompt":"<|fim▁begin|>#include <iostream>\n#include <memory>\n\n\n\nclass A {\n\n};\n\nclass B {\n\n};\n\nint foo() {\n\n}\n\nint main() {\n    std::shared_ptr<A> a = std::make_shared<A>();\n    std::shared_ptr<B> b = std::make_shared<B>();\n    <|fim▁hole|>\n    return 0;\n}\n<|fim▁end|>","build_prompt_ms":"0","target":"llm_ls","line_number":204,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532053Z","level":"INFO","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":254,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532073Z","level":"DEBUG","message":"sending request to backend","headers":"{\"user-agent\": \"llm-ls/0.5.2; rust/unknown; ide/Neovim\", \"authorization\": \"Bearer sk-xxxx\"}","body":"{\"model\": String(\"deepseek-coder-v2:16b-lite-base-q4_0\"), \"prompt\": String(\"<|fim▁begin|>#include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();\\n    <|fim▁hole|>\\n    return 0;\\n}\\n<|fim▁end|>\"), \"stream\": Bool(false), \"temperature\": Number(0.2), \"top_p\": Number(0.95)}","url":"http://192.168.0.61:8080/api/v1/completions","target":"llm_ls","line_number":255,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532185Z","level":"DEBUG","message":"reuse idle connection for (\"http\", 192.168.0.61:8080)","target":"hyper::client::pool","line_number":250,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.532446Z","level":"DEBUG","message":"flushed 643 bytes","target":"hyper::proto::h1::io","line_number":342}
{"timestamp":"2024-09-18T20:44:03.550477Z","level":"DEBUG","message":"parsed 5 headers","target":"hyper::proto::h1::io","line_number":207}
{"timestamp":"2024-09-18T20:44:03.550580Z","level":"DEBUG","message":"incoming body is content-length (22 bytes)","target":"hyper::proto::h1::conn","line_number":222}
{"timestamp":"2024-09-18T20:44:03.550620Z","level":"DEBUG","message":"incoming body completed","target":"hyper::proto::h1::conn","line_number":298}
{"timestamp":"2024-09-18T20:44:03.550722Z","level":"DEBUG","message":"pooling idle connection for (\"http\", 192.168.0.61:8080)","target":"hyper::client::pool","line_number":376,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}
{"timestamp":"2024-09-18T20:44:03.550849Z","level":"ERROR","err_msg":"serde json error: data did not match any variant of untagged enum OpenAIAPIResponse","target":"llm_ls::error","line_number":8,"spans":[{"request_id":"e26d9290-1619-4990-9df8-b8b665f7621d","name":"completion_request"}]}

... If you've got suggestions for a better alternative to Open WebUI, I might consider it. But I want to open the server up to a few friends and don't want the hassle of manually managing API keys for a raw ollama instance (which works fine, btw).

Any help would be appreciated!

@Krakonos
Author

OK, so I've made a bit of progress. I found out Open WebUI is not actually OpenAI-compatible as I'd thought; not sure where I got that impression. I can use its /ollama endpoint to authenticate with a token and pass my query through to ollama. Right now, it boils down to:

url = "http://192.168.0.61:8080/ollama",        --doesn't work                                                                                                                           
url = "http://192.168.0.61:11434",   -- works

But hitting both endpoints from a script returns exactly the same response:

#!/usr/bin/bash

curl 'http://192.168.0.61:8080/ollama/api/generate' \
  -H 'accept: */*' \
  -H 'authorization: Bearer sk-182affc34e0e43218580222cdd88ca06' \
  -H 'content-type: application/json' \
  --data-raw "{\"model\": \"codellama:7b-code\", \"options\": {\"temperature\":0.2, \"top_p\": 0.95}, \"prompt\":\"<PRE> #include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();\\n    std::cout << \\\"Hello, World!\\\" << std::endl;\\n    std::cout << a << std::endl;\\n    std::cout << b << std::endl;\\n     <SUF>\\n\\n\\n\\n\\n\\n\\n    return 0;\\n\\n}\\n <MID>\", \"stream\": false}"


echo ""
echo "===="
echo ""

curl 'http://192.168.0.61:11434/api/generate' \
  -H 'accept: */*' \
  -H 'authorization: Bearer sk-182affc34e0e43218580222cdd88ca06' \
  -H 'content-type: application/json' \
  --data-raw "{\"model\": \"codellama:7b-code\", \"options\": {\"temperature\":0.2, \"top_p\": 0.95}, \"prompt\":\"<PRE> #include <iostream>\\n#include <memory>\\n\\n\\n\\nclass A {\\n\\n};\\n\\nclass B {\\n\\n};\\n\\nint foo() {\\n\\n}\\n\\nint main() {\\n    std::shared_ptr<A> a = std::make_shared<A>();\\n    std::shared_ptr<B> b = std::make_shared<B>();\\n    std::cout << \\\"Hello, World!\\\" << std::endl;\\n    std::cout << a << std::endl;\\n    std::cout << b << std::endl;\\n     <SUF>\\n\\n\\n\\n\\n\\n\\n    return 0;\\n\\n}\\n <MID>\", \"stream\": false}"
  #--data-raw '{"stream":false,"model":"codellama:7b-code","messages":[{"role":"user","content":"who are you?"}]}'

(I copied the API endpoint URL and query body from the llm-ls log.) For some reason, when going through the webui, llm-ls gets the unexpected-response error. Looking at the source, it appears the ollama backend does not pass down the API token, even if one is provided. This limits the use of Open WebUI, which does a great job of securing the ollama instance and lets me use my self-hosted LLM server when traveling.
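
To illustrate, here's a hedged sketch of the kind of change I'd expect (hypothetical function and signature, not llm-ls's actual backend code): the ollama backend should forward the configured token as a bearer header, the same way the logs above show the openai backend already doing.

```rust
// Hypothetical sketch, not llm-ls's real code: build the ollama request
// and attach the Authorization header when a token is configured,
// instead of silently dropping it.
use reqwest::Client;

async fn ollama_generate(
    client: &Client,
    url: &str,               // e.g. http://192.168.0.61:8080/ollama/api/generate
    api_token: Option<&str>, // currently ignored by the ollama backend
    body: &serde_json::Value,
) -> Result<String, reqwest::Error> {
    let mut req = client.post(url).json(body);
    if let Some(token) = api_token {
        // The missing piece: send "authorization: Bearer <token>"
        req = req.bearer_auth(token);
    }
    req.send().await?.text().await
}
```

With something like that in place, the /ollama proxy URL above should behave the same as hitting ollama directly.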

@Krakonos changed the title from "serdes errors against open webui" to "ollama backend does not support api keys" on Sep 26, 2024