Stream() does not obey the roles function #943

peterwilli · 2024-07-07T09:57:10Z

The bug

When role blocks such as "with assistant():" are combined with stream(), the roles are not obeyed and all output is squashed into the last role.

To Reproduce

from guidance import models, gen
from guidance import user, system, assistant

model = models.LlamaCpp(
    "./models/any_model.gguf",
    n_gpu_layers=-1,
    temperature=0,
    n_ctx=8192
)

lm = model.stream()
with user():
    lm += "Hey there!"

with assistant():
    lm += gen()
    for token in lm:
        print(token)

What I get:

<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going?<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going? What's on your mind? Do you have<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going? What's on your mind? Do you have any questions or topics you'd like to discuss<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going? What's on your mind? Do you have any questions or topics you'd like to discuss? I'm here to help and provide information<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Note that all of the output is under the assistant role, and that the generated sentence is a completion of the text that should have been under the user() role.

System info:

OS: macOS
Guidance Version (guidance.__version__): 0.1.15

(Temporary) workaround

I found a way around this after reading the source code and attempting to fix it. I couldn't find a fix that didn't break too many things, so I'll wait for a dev with more experience for a final fix. It's not the cleanest, but it'll do for now:

from guidance import models, gen
from guidance import user, assistant  # needed for the role blocks below

model = models.LlamaCpp(
    "./models/any_model.gguf",
    n_gpu_layers=-1,
    temperature=0,
    n_ctx=8192
)

def wrap_role_start(lm):
    for block in models.LlamaCpp.open_blocks.keys():
        lm += block.opener
    return lm

def wrap_role_end(lm):
    for block in models.LlamaCpp.open_blocks.keys():
        lm += block.closer
    return lm

lm = model.stream()
with user():
    lm = wrap_role_start(lm)
    lm += "Hey there!"
    lm = wrap_role_end(lm)

with assistant():
    lm = wrap_role_start(lm)
    lm += gen()
    lm = wrap_role_end(lm)
for token in lm:
    print(token)

Note that the token iteration is outside of any role block; this is important for the workaround to work.
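The two helpers in the workaround can be folded into one function that wraps a piece of content in the opener and closer text of every open block. The following is only a sketch of that idea, not guidance's API: `add_in_open_roles` is a hypothetical name, and the demo uses plain strings and `SimpleNamespace` stand-ins (objects with `.opener` and `.closer` attributes) so it runs without a model. Unlike the original helpers, it applies the closers in reverse order, which only matters if more than one block is open.

```python
from types import SimpleNamespace


def add_in_open_roles(lm, content, open_blocks):
    """Append `content` to `lm`, wrapped in the opener/closer of each open block.

    Hypothetical helper: `open_blocks` is any iterable of objects exposing
    `.opener` and `.closer`, and `lm` is anything that supports `+=` with them.
    """
    blocks = list(open_blocks)
    for block in blocks:
        lm += block.opener
    lm += content
    # Close in reverse order so nested blocks stay properly nested.
    for block in reversed(blocks):
        lm += block.closer
    return lm


# Stand-in demonstration with plain strings instead of a guidance model.
fake_user = SimpleNamespace(opener="<user>", closer="</user>")
demo = add_in_open_roles("", "Hey there!", [fake_user])
print(demo)  # <user>Hey there!</user>
```

With guidance, the presumed (untested) usage inside a role block would be `lm = add_in_open_roles(lm, "Hey there!", models.LlamaCpp.open_blocks.keys())`, mirroring what the workaround does in three separate statements.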

hudson-ai · 2024-07-09T16:17:13Z

@peterwilli thank you for identifying this issue and providing a workaround! It would be nice if we could do this a bit more "automatically" for our users -- I think that the context-manager implementation needs some attention... will continue thinking about this :)

@nking-1 tagging you for interest

peterwilli · 2024-07-31T08:18:56Z

Thank you! Yeah, that was exactly where my thoughts went, but I wasn't experienced enough with your source code to make such a change to the context manager. I decided to settle for this workaround for now, but of course automating this, so that the behavior is the same with and without streaming, is much better.

peterwilli mentioned this issue Jul 7, 2024

ModelStream should be a drop-in replacement for Model or their API differences should be documented #939

Open
