
Ollama with llama3.1 not working #13

Open
gavinblair opened this issue Sep 17, 2024 · 8 comments
Labels
question Further information is requested

Comments


gavinblair commented Sep 17, 2024

Here is the output I get, running with Ollama locally (just the example from the README)

Starting orchestrator
Browser started and ready
Executing command play shape of you on youtube
==================================================
Current State: agentq_base
Agent: sentient
Current Thought:
Plan: none
Completed Tasks: none
==================================================
Error executing the command play shape of you on youtube: RetryError[<Future at 0x10fd8d090 state=finished raised ValidationError>]
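[Editor's note: for context on what this error shape usually means, a RetryError[... raised ValidationError] is what tenacity produces when it retries a function that keeps raising a pydantic ValidationError, i.e. the model's output repeatedly failed schema validation. A minimal sketch of that failure mode, assuming the project validates LLM output with pydantic and retries with tenacity; the class and function names below are hypothetical, not the repo's:]

```python
# Minimal sketch (hypothetical names): tenacity retries a parser whose input
# keeps failing pydantic validation, then surfaces a RetryError.
from pydantic import BaseModel
from tenacity import retry, stop_after_attempt


class AgentOutput(BaseModel):  # stand-in for the schema the agent expects
    thought: str
    plan: str


@retry(stop=stop_after_attempt(3))
def parse_model_output(raw: str) -> AgentOutput:
    # A weak or heavily quantised model often emits free text instead of
    # schema-conforming JSON; pydantic raises ValidationError, tenacity retries.
    return AgentOutput.model_validate_json(raw)


try:
    parse_model_output("Sure! Here's a plan...")  # not valid JSON
except Exception as e:
    print(e)  # RetryError[<Future at 0x... state=finished raised ValidationError>]
```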
nischalj10 (Member) commented Sep 17, 2024

hey @gavinblair - this primarily stems from the model failing to generate valid structured output. Can you tell me which quantised version of Llama 3.1 you are using?

gavinblair (Author) commented Sep 17, 2024

8B. I'm using Q4_0. I'll try with Q5_K_M once I figure out how to use a different base URL.
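[Editor's note: on the base URL part, the provider config quoted later in this thread points an OpenAI-compatible client at Ollama's /v1/ route, so a standalone check could look like the sketch below. The model tag is an assumption; use whatever `ollama pull` fetched locally:]

```python
# A minimal sketch (not project code) of talking to a local Ollama server
# through its OpenAI-compatible endpoint with a custom base URL.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1/",  # Ollama's OpenAI-compatible route
    api_key="ollama",  # Ollama ignores the key, but the SDK requires a value
)
resp = client.chat.completions.create(
    model="llama3.1:8b-instruct-q5_K_M",  # assumed tag; match your local pull
    messages=[{"role": "user", "content": "Reply with valid JSON only."}],
)
print(resp.choices[0].message.content)
```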

nischalj10 (Member) commented

maybe try 8b-instruct-q4_0; folks in the community have been able to make it work with Llama 3.1 8B models
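[Editor's note: to confirm which quantisation a local tag actually resolves to, Ollama's REST API exposes a show endpoint; a quick check might look like the sketch below. The tag is an assumption, and the endpoint is per Ollama's documented API rather than anything in this repo:]

```python
# Quick check (assumes a local Ollama server is running): ask /api/show which
# quantisation level a tag resolves to.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:11434/api/show",
    data=json.dumps({"name": "llama3.1:8b-instruct-q4_0"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    details = json.load(resp).get("details", {})
print(details.get("quantization_level"))  # e.g. "Q4_0"
```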

nischalj10 added the question label on Sep 18, 2024
s-github-2 commented

I had filed the same issue ("Get RetryError[<Future at 0x182e2357a60 state=finished raised ValidationError>] with ollama"). The model I was using was llama3:8b. Copied below is the partial output from the ollama serve command run in a terminal:

llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.name str = Meta-Llama-3-8B-Instruct
llama_model_loader: - kv 2: llama.block_count u32 = 32
llama_model_loader: - kv 3: llama.context_length u32 = 8192
llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 10: general.file_type u32 = 2
llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", """, "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 17: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 = 128000
llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 = 128009
llama_model_loader: - kv 20: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv 21: general.quantization_version u32 = 2
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type q4_0: 225 tensors
llama_model_loader: - type q6_K: 1 tensors


s-github-2 commented Sep 18, 2024

I tried llama3.1:8b-instruct-q4_0 and it gave me the same error:

Starting orchestrator
Browser started and ready
Executing command play shape of you on youtube
==================================================
Current State: agentq_base
Agent: sentient
Current Thought:
Plan: none
Completed Tasks: none
==================================================
Error executing the command play shape of you on youtube: RetryError[<Future at 0x21faa7ade40 state=finished raised ValidationError>]
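[Editor's note: since the RetryError hides the underlying failure, it may help to unwrap it and look at the actual ValidationError, which includes the raw output that failed validation. A debugging sketch; `run_command` is a hypothetical placeholder for the repo's real entry point:]

```python
# Debugging sketch: unwrap tenacity's RetryError to inspect the underlying
# pydantic ValidationError instead of the opaque wrapper.
from tenacity import RetryError

try:
    run_command("play shape of you on youtube")  # hypothetical entry point
except RetryError as e:
    inner = e.last_attempt.exception()  # the ValidationError tenacity swallowed
    print(type(inner).__name__, inner)
```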


x676f64 commented Sep 18, 2024

> I'm using Q4_0. I'll try with Q5_K_M once I figure out how to use a different base URL.

I tried with q5_k_m and got the same result; I got the same result with q4 as well.

TofailHiary commented

I'm encountering the same issue on Windows 10 with llama3.1:latest, and I've tried other models but faced the same problem. I believe the issue might be related to this code snippet:
```python
class OllamaProvider(LLMProvider):
    def get_client_config(self) -> Dict[str, str]:
        return {
            "api_key": "ollama",
            "base_url": "http://localhost:11434/v1/",
        }

    def get_provider_name(self) -> str:
        return "ollama"
```

As far as I understand, Ollama doesn’t require an API key, and the base URL when installed locally should be http://localhost:11434.

Additionally, I encountered an authentication error with the Groq API, which I resolved by modifying the provider.py file as follows:

```python
class GroqProvider(LLMProvider):
    def get_client_config(self) -> Dict[str, str]:
        return {
            "api_key": os.environ.get("GROQ_API_KEY"),
            "base_url": "https://api.groq.com/openai/v1/",
        }

    def get_provider_name(self) -> str:
        return "groq"
```
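[Editor's note: for reference, a config like the one above would typically be handed to the OpenAI SDK client roughly as sketched below; this wiring is an assumption for illustration, not the repo's actual code:]

```python
# Illustrative wiring (assumed, not from provider.py): feed the provider's
# client config into the OpenAI SDK. Assumes provider.py's imports are in scope
# and GROQ_API_KEY is set in the environment.
from openai import OpenAI

provider = GroqProvider()  # the class defined above
config = provider.get_client_config()
client = OpenAI(api_key=config["api_key"], base_url=config["base_url"])
```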

I hope this gets resolved soon. If I find a solution, I’ll let you know.

TofailHiary commented

Any update on this? The issue is still not fixed.
