Stream() does not obey the roles function #943

Open
peterwilli opened this issue Jul 7, 2024 · 2 comments

peterwilli commented Jul 7, 2024

The bug

When role context managers such as "with assistant():" are combined with stream(), the roles are not obeyed and all output ends up under the last role.

To Reproduce

from guidance import models, gen
from guidance import user, system, assistant

model = models.LlamaCpp(
    "./models/any_model.gguf",
    n_gpu_layers=-1,
    temperature=0,
    n_ctx=8192
)

lm = model.stream()
with user():
    lm += "Hey there!"

with assistant():
    lm += gen()
    for token in lm:
        print(token)

What I get:

<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going?<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going? What's on your mind? Do you have<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going? What's on your mind? Do you have any questions or topics you'd like to discuss<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Hey there! How's it going? What's on your mind? Do you have any questions or topics you'd like to discuss? I'm here to help and provide information<|eot_id|>
<|start_header_id|>assistant<|end_header_id>

Note that all of the output is under the assistant role, and that the generated sentence continues "Hey there!", which should have been under the user() role.

System info:

  • OS: macOS
  • Guidance Version (guidance.__version__): 0.1.15

(Temporary) workaround

I found a way around this after reading the source code and attempting to fix it. I couldn't find a way to fix it properly without breaking too many things, so I'll wait for a dev with more experience for a final fix. The workaround is not the cleanest, but it'll do for now:

from guidance import models, gen
from guidance import user, assistant

model = models.LlamaCpp(
    "./models/any_model.gguf",
    n_gpu_layers=-1,
    temperature=0,
    n_ctx=8192
)

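def wrap_role_start(lm):
    # Manually append the opener of every role block that is currently open.
    for block in models.LlamaCpp.open_blocks.keys():
        lm += block.opener
    return lm

def wrap_role_end(lm):
    # Manually append the closer of every role block that is currently open.
    for block in models.LlamaCpp.open_blocks.keys():
        lm += block.closer
    return lm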

lm = model.stream()
with user():
    lm = wrap_role_start(lm)
    lm += "Hey there!"
    lm = wrap_role_end(lm)

with assistant():
    lm = wrap_role_start(lm)
    lm += gen()
    lm = wrap_role_end(lm)
for token in lm:
    print(token)

Note that the token iteration happens outside of any role block. This is important for the workaround to work.
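
To illustrate why the placement of the iteration might matter, here is a toy model in plain Python. It is not guidance's code, and I have not confirmed that guidance works this way internally; it only assumes that a streaming object queues appended text and looks up the open role blocks when it is iterated rather than when the text was appended. The names OPEN_ROLES, role, LazyStream and tag_eagerly are all made up for this sketch.

```python
from contextlib import contextmanager

# Hypothetical registry of role blocks that are open right now.
OPEN_ROLES = []


@contextmanager
def role(name):
    """Toy role block: registers itself on entry, unregisters on exit."""
    OPEN_ROLES.append(name)
    try:
        yield
    finally:
        OPEN_ROLES.remove(name)


def _wrap(text):
    """Wrap text in the tags of whatever roles are open at call time."""
    opener = "".join(f"<{r}>" for r in OPEN_ROLES)
    closer = "".join(f"</{r}>" for r in reversed(OPEN_ROLES))
    return f"{opener}{text}{closer}"


class LazyStream:
    """Toy streaming model: appends are queued and only run on iteration."""

    def __init__(self, pending=()):
        self.pending = tuple(pending)

    def __add__(self, text):
        return LazyStream(self.pending + (text,))

    def __iter__(self):
        # Assumed behavior: roles are resolved here, at iteration time.
        for text in self.pending:
            yield _wrap(text)


def tag_eagerly(text):
    """Analogue of wrap_role_start/wrap_role_end: capture the role now."""
    return _wrap(text)


# Bug analogue: iterating inside the assistant block tags everything as assistant.
lm = LazyStream()
with role("user"):
    lm += "Hey there!"
with role("assistant"):
    lm += "[generated reply]"
    buggy = list(lm)

# Workaround analogue: tag at append time, iterate outside of any role block.
lm = LazyStream()
with role("user"):
    lm += tag_eagerly("Hey there!")
with role("assistant"):
    lm += tag_eagerly("[generated reply]")
fixed = list(lm)

print(buggy)
print(fixed)
```

In this toy model the user text is only attributed to the user role when the role tags are captured at append time and the iteration runs with no role block open, which matches what I see with the workaround.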

@hudson-ai
Collaborator

@peterwilli thank you for identifying this issue and providing a workaround! It would be nice if we could do this a bit more "automatically" for our users -- I think that the context-manager implementation needs some attention... will continue thinking about this :)

@nking-1 tagging you for interest

@peterwilli
Author

peterwilli commented Jul 31, 2024

Thank you! Yeah, that was exactly where my thoughts went, but I wasn't experienced enough with your source code to make such a change to the context manager. I decided to settle for this workaround for now, but of course automating this, so that the behavior is the same with and without stream(), would be much better.
