Skip to content

Commit

Permalink
tests still unreliable with Ollama version in GitHub CI
Browse files Browse the repository at this point in the history
These tests should work and do work locally. But they fail in GitHub CI – for an unknown reason that almost certainly is in Ollama, not in our code.
  • Loading branch information
ccreutzi committed Aug 20, 2024
1 parent 08160b7 commit b0023dc
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions tests/tollamaChat.m
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ function extremeTopK(testCase)
%% This should work, and it does on some computers. On others, Ollama
%% receives the parameter, but either Ollama or llama.cpp fails to
%% honor it correctly.
% testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");
testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");

% setting top-k to k=1 leaves no random choice,
% so we expect to get a fixed response.
Expand All @@ -65,7 +65,7 @@ function extremeMinP(testCase)
%% This should work, and it does on some computers. On others, Ollama
%% receives the parameter, but either Ollama or llama.cpp fails to
%% honor it correctly.
% testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");
testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");

% setting min-p to p=1 means only tokens with the same logit as
% the most likely one can be chosen, which will almost certainly
Expand All @@ -81,7 +81,7 @@ function extremeTfsZ(testCase)
%% This should work, and it does on some computers. On others, Ollama
%% receives the parameter, but either Ollama or llama.cpp fails to
%% honor it correctly.
% testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");
testCase.assumeTrue(false,"disabled due to Ollama/llama.cpp not honoring parameter reliably");

% setting tfs_z to z=0 leaves no random choice, but degrades to
% greedy sampling, so we expect to get a fixed response.
Expand Down

0 comments on commit b0023dc

Please sign in to comment.