FEATURE: improve tool support #904

SamSaffron · 2024-11-08T03:59:10Z

This re-implements tool support in DiscourseAi::Completions::Llm#generate.

Previously, tool calls were always returned as XML, and the caller was responsible for parsing that XML.

In the new implementation, the endpoints return ToolCall objects.

Additionally, this simplifies the Llm endpoint interface and makes it clearer. Llms must implement:

decode, decode_chunk (for streaming)

It is the implementer's responsibility to figure out how to decode chunks; the base class no longer implements this. To make this easy, we ship a flexible JSON decoder that is easy to wire up.
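As a rough illustration of the decode / decode_chunk split described above, here is a minimal sketch. Everything in it is an assumption made for the example: the `SketchEndpoint` class, the `ToolCall` struct fields, the payload shape, and the newline-delimited streaming format are hypothetical and are not taken from the plugin's code. It also uses a tiny hand-rolled line buffer rather than the JSON decoder this PR ships.

```ruby
require "json"

# Hypothetical stand-in for the ToolCall objects the PR describes.
ToolCall = Struct.new(:id, :name, :parameters, keyword_init: true)

# Hypothetical endpoint; the payload shape used here is invented for illustration.
class SketchEndpoint
  def initialize
    @buffer = +""
  end

  # Non-streaming: turn a complete response body into strings and ToolCall objects.
  def decode(response_raw)
    payload = JSON.parse(response_raw, symbolize_names: true)
    payload.fetch(:content, []).filter_map { |part| decode_part(part) }
  end

  # Streaming: buffer partial data and emit results only for complete lines.
  def decode_chunk(chunk)
    @buffer << chunk
    results = []
    while (newline = @buffer.index("\n"))
      line = @buffer.slice!(0..newline).strip
      next if line.empty?
      decoded = decode_part(JSON.parse(line, symbolize_names: true))
      results << decoded if decoded
    end
    results
  end

  private

  # Map one assumed payload part to either plain text or a ToolCall.
  def decode_part(part)
    case part[:type]
    when "text"
      part[:text]
    when "tool_use"
      ToolCall.new(id: part[:id], name: part[:name], parameters: part[:input] || {})
    end
  end
end
```

The point of the sketch is only that each endpoint owns its own chunk handling: a chunk that ends mid-object yields nothing until the rest arrives.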

Also new:

Better debugging for PMs: there are now next / previous buttons to step through all the Llm messages associated with a PM.
Token accounting is fixed for vllm (we were not counting tokens correctly).

This work-in-progress PR amends llm completion so it returns objects for tools instead of XML fragments. This will empower future features such as parameter streaming. XML was error prone; the object implementation is more robust. Still very much in progress, a lot of code needs to change. Partially implemented on Anthropic at the moment.

SamSaffron · 2024-11-11T06:26:47Z

Notable compromise:

.generate returns either an Array (for a tool call plus completion) or, when the result would be a single-element array, the bare element itself.

This does place some responsibility on the caller, who may get differently shaped data.

We could always return an array, but that would make the result more complex to consume in cases where you are not using tools.
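One way a caller might cope with the two shapes is to wrap a lone result before iterating. The sketch below is illustrative only: `normalize_generate_result`, `describe`, and the `ToolCall` struct are hypothetical names made up for this example, and the real ToolCall class may look different.

```ruby
# Hypothetical stand-in for the ToolCall objects returned by the endpoints.
ToolCall = Struct.new(:name, :parameters, keyword_init: true)

# Wrap a lone result so callers can always iterate.
# Kernel#Array is avoided on purpose: it calls #to_a, which would split
# a Struct-based object like this stand-in into its member values.
def normalize_generate_result(result)
  result.is_a?(Array) ? result : [result]
end

# Label each part of a result by its type.
def describe(result)
  normalize_generate_result(result).map do |part|
    case part
    when ToolCall then "tool:#{part.name}"
    when String then "text:#{part}"
    else "unknown"
    end
  end
end
```

Callers that never use tools can skip the wrapper and treat the result as a plain string, which is the convenience the compromise is meant to preserve.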

SamSaffron added 20 commits November 8, 2024 14:58

fix logging

adc6c6e

xml tools are back now

56f54b5

more code removal

fc8d128

remove unused files and move spec

6cac18a

Open AI starting to work.

9f8c15f

more tests passing

21ddf4e

more specs working more refactoring

8fa7fb3

move code to class

540b6a7

Gemini support for new interface

c5d1b7b

cohere implementation

0cc4144

lint

dfc083f

hugging face

6f66f4b

lint

4075260

sambanova and vllm

3f93c9a

account for tokens properly with non streaming calls

0ab073a

properly track vllm usage

b4fb085

Ollama working

e8c673f

fix specs

6843eff

Lint and fix all specs

23f1f25

SamSaffron marked this pull request as ready for review November 11, 2024 06:20

SamSaffron added 2 commits November 11, 2024 17:44

give ollama vision cause it is easy enough.

c727ae7

remove unused code

dbbbbfc

romanrizzi approved these changes Nov 11, 2024

SamSaffron merged commit e817b7d into main Nov 11, 2024
6 checks passed

SamSaffron deleted the tool-us-no-xml branch November 11, 2024 21:14
