AI assistants to determine if the context of the conversation #787

Open
wants to merge 7 commits into
base: main

Conversation

@Hugo-Calero Hugo-Calero commented May 3, 2024

If your PR is related to a contribution to the taxonomy, please, fill
out the following questionnaire. If not, replace this whole text and the
following questionnaire with whatever information is applicable to your PR.

Describe the contribution to the taxonomy

  • This contribution may help to identify if a user is changing topic in a multiturn conversation.
  • This is helpful in the context of building AI assistants, to decide when to keep context of the conversation or not.
  • ...

Input given at the prompt

   A list of previous user queries, and the new query to compare.

Response from the original model

  ...

Response from the fine-tuned model

  ...

Contribution checklist

  • The contribution was tested with ilab generate
  • No errors or warnings were produced by ilab generate
  • All commits are signed off (DCO)
  • The qna.yaml file contains at least 5 seed_examples
  • The qna.yaml file was linted and prettified (yaml-validator can do both)
  • An attribution.txt file in the same folder as the qna.yaml file
  • Content does not include PII or otherwise sensitive or confidential information
  • Content does not include anything documented in the project's Avoid these Topics guidelines

Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
@github-actions github-actions bot added triage-needed (Auto labeled) skill is ready to be triaged skill (Auto labeled) labels May 3, 2024

Beep, boop 🤖, Hi, I'm @instructlab-bot and I'm going to help you with your pull request. Thanks for your contribution! 🎉

I support the following commands:

  • @instructlab-bot precheck -- Check existing model behavior using the questions in this proposed change.
  • @instructlab-bot generate -- Generate a sample of synthetic data using the synthetic data generation backend infrastructure.
  • @instructlab-bot generate-local -- Generate a sample of synthetic data using a local model.
  • @instructlab-bot help -- Print this help message again.

Note

Results or Errors of these commands will be posted as a pull request check in the Checks section below

Note

Currently only maintainers belonging to the [[taxonomy-triagers taxonomy-approvers taxonomy-maintainers labrador-org-maintainers instruct-lab-bot-maintainers]] teams are allowed to run these commands.

Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
@mingxzhao
Member

You will also need to sign off on your commits as outlined here

@mingxzhao mingxzhao added triage-requested-changes skill has been reviewed; changes requested from contributor and removed triage-needed (Auto labeled) skill is ready to be triaged labels May 9, 2024

@mingxzhao
Member

@instructlab-bot precheck

Beep, boop 🤖, Generating test data for your PR with the job type: precheck. Your Job ID is 277. The results will be presented below in the pull request status box. This may take several minutes...

Results for job ID: 277 using the model merlinite-7b!

Results can be found here.

@Hugo-Calero
Author

Hello @mingxzhao, I can see the commits are already signed off, e.g. "Signed-off-by: Hugo Carlos Calero Díaz [email protected]". Is there anything else I need to do?

@mingxzhao
Member

Ah, apologies, I seem to have missed that. If you could just update the attribution file and fix the linting issues, I can approve! It seems there are some spacing issues in your file.

@Hugo-Calero
Author

Hello! I have checked the linting at https://www.yamllint.com and I see no errors. Is there any specific guideline I need to follow, or recommended software to check with?
Also, what is missing in the attribution file?
Thanks

Member

@jjasghar jjasghar left a comment

Please resolve the linting issues.

Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
@github-actions github-actions bot added the triage-needed (Auto labeled) skill is ready to be triaged label May 16, 2024

@Hugo-Calero
Author

Hi, I committed modifications to the YAML file; I hope there are no spacing issues now. Please review! Many thanks :)

Member

@jjasghar jjasghar left a comment

Please fix the linting issue.

@jjasghar jjasghar removed the triage-needed (Auto labeled) skill is ready to be triaged label May 17, 2024
Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
@github-actions github-actions bot added the triage-needed (Auto labeled) skill is ready to be triaged label May 18, 2024

@jjasghar jjasghar changed the title from "Add skill" to "AI assistants to determine if the context of the conversation" May 20, 2024
@jjasghar
Member

@instructlab-bot precheck

Beep, boop 🤖, Generating test data for your PR with the job type: precheck. Your Job ID is 324. The results will be presented below in the pull request status box. This may take several minutes...

@jjasghar jjasghar added the precheck-generate-ready PR is ready for precheck or generate step label May 20, 2024

Results for job ID: 324 using the model instructlab/granite-7b-lab!

Results can be found here.

@Hugo-Calero
Author

Hello, I see the merging is still blocked. Is there anything left to do on my side? I already fixed the linting issues I identified, I hope I didn't miss any.
Thanks

@jjasghar
Member

@instructlab-bot generate

Beep, boop 🤖, Generating test data for your PR with the job type: sdg-svc. Your Job ID is 346. The results will be presented below in the pull request status box. This may take several minutes...

@jjasghar
Member

Looking at the precheck, the model seems to be very far off the rails, which is good for this PR. Now the SDG generate will give us an understanding of the data that is generated and whether it will add value to the model.

Assuming it does, we will tag it as approved and it will be upstreamed. Then we will see from the engineering team whether the model improves; only then will we merge.

Results for job ID: 346 using the model sdg service backend!

Results can be found here.

Member

@jjasghar jjasghar left a comment

Please fix the version: 2; then we are ready for the next steps.

@jjasghar jjasghar added the triage-requested-changes skill has been reviewed; changes requested from contributor label May 28, 2024
Signed-off-by: Hugo Carlos Calero Díaz <[email protected]>
@github-actions github-actions bot added the triage-needed (Auto labeled) skill is ready to be triaged label Jun 2, 2024

@jjasghar jjasghar removed triage-needed (Auto labeled) skill is ready to be triaged triage-requested-changes skill has been reviewed; changes requested from contributor labels Jun 3, 2024
@jjasghar
Member

jjasghar commented Jun 3, 2024

@instructlab-bot precheck

Beep, boop 🤖, Generating test data for your PR with the job type: precheck. Your Job ID is 362. The results will be presented below in the pull request status box. This may take several minutes...

Results for job ID: 362 using the model instructlab/granite-7b-lab!

Results can be found here.

@jjasghar
Member

jjasghar commented Jun 3, 2024

From what I read in the precheck, the model already seems to do this quite well. @mingxzhao can you do a sanity check for me?
I think we should reject this as something the model already does.

@mingxzhao
Member

It does seem to get several of the answers wrong when compared to the user-provided answers. I think this could be a good skill, but the context part of the question may need to be placed in the "context" field of the YAML for good SDG. At the moment there do seem to be several wrong answers, though.
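
A minimal sketch of the split suggested here, assuming a seed example can be represented as a mapping with "context", "question", and "answer" keys. The helper name split_seed_example, the prompt wording, and the sample queries are hypothetical (the queries are only loosely modeled on the Madrid example from the precheck log); the actual qna.yaml from this PR is not shown in the thread.

```python
from typing import Dict, List


def split_seed_example(
    previous_queries: List[str], new_query: str, answer: str
) -> Dict[str, str]:
    """Put the conversation history in `context` and only the new query in `question`.

    Hypothetical helper: the key names are an assumption about the seed-example layout.
    """
    if not previous_queries:
        raise ValueError("at least one previous query is needed to compare against")
    # Number the earlier queries so the history reads as an ordered list.
    context = "\n".join(
        f"{i}. {query}" for i, query in enumerate(previous_queries, start=1)
    )
    question = (
        "Is the following query in the same context as the previous queries? "
        f"Query: {new_query}"
    )
    return {"context": context, "question": question, "answer": answer}


# Illustrative data only, not taken from the PR's qna.yaml.
example = split_seed_example(
    previous_queries=[
        "Where can I eat tapas in Madrid?",
        "Which restaurants in Madrid open late?",
    ],
    new_query="How do I get to the Retiro park?",
    answer="No, the new query changes topic.",
)
```

Keeping the history out of the question text means the question stays the same across examples and only the context varies, which is one possible reading of the suggestion above.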

@jjasghar
Member

jjasghar commented Jun 3, 2024

Wait, really? I thought the precheck answered what we wanted: that it knew the questions didn't match the context. For instance: https://instruct-lab-bot.s3.us-east-2.amazonaws.com/precheck-pr-787-1dee610dbcdeb1f9cc76357c360aa8e63ebd1e8c-job-362/chat_2024-06-03T21_21_14.log

While the questions you've provided are related to the topic of food and dining
in Madrid, the question about the Retiro park is unrelated to the previous
conversation.

@mingxzhao
Member

Not quite, the question is about whether the following provided question is in the same context as the previous provided questions. The model seems to have difficulty parsing this and even answers the question separately from the context entirely.

@Hugo-Calero
Author

Hi, is there anything I can do to advance the status of this contribution?

@jjasghar
Member

jjasghar commented Jun 6, 2024

We are in progress here. I believe we need this to be put in the next run, updates will be added to the PR as they arrive.

@jjasghar jjasghar added community-build-ready Triage Team has signed off for synthetic data generation and removed precheck-generate-ready PR is ready for precheck or generate step labels Jun 7, 2024
@jjasghar
Member

Hi! 👋
It’s been a while since you’ve seen any movement on this PR. We haven’t forgotten about you! We’ve run into some logistical issues, hence the delay. We absolutely want your PR, and being marked as e2e-ready is still the last stop before we get it into the upstream model.

We are thankful for your patience and ask that you please keep this PR open. As soon as we finish all our behind-the-scenes work, we’ll test the full model against your submissions and, ideally, accept your amazing contribution(s)! 

Your Community Maintainer Team.

P.S. If you have any specific questions or thoughts, don’t hesitate to comment on this pull request or email [email protected] and [email protected], and we’ll get back to you as soon as possible.

@jjasghar
Member

@mcorbin-ibm where should this one live?

@mcorbin-ibm
Contributor

I think that this applies more to the "AI" side of things than "linguistics". After doing some research on Wikipedia, in conjunction with our Dewey Decimal System reference document, I recommend placing this here in the taxonomy:

compositional_skills/technology/computer_science/ai/nlp/conversation_orchestration

Note: I did not see a context specified in the qna, so I did not put this under the compositional_skills/grounded directory, but if the qna requires a context, it should be moved there instead?
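
The placement rule in this note could be sketched roughly as follows, assuming seed examples are available as mappings with an optional "context" key. The function names are hypothetical, and because the thread does not give the exact sub-path under compositional_skills/grounded, the sketch returns only that root for the grounded case.

```python
from typing import Iterable, Mapping

# Path recommended in the comment above for a skill without context.
UNGROUNDED_PATH = (
    "compositional_skills/technology/computer_science/ai/nlp/conversation_orchestration"
)
# Root named in the note; the sub-path below it is not specified in the thread.
GROUNDED_ROOT = "compositional_skills/grounded"


def needs_grounded_directory(seed_examples: Iterable[Mapping[str, str]]) -> bool:
    """Return True if any seed example carries a non-empty context."""
    return any(str(ex.get("context") or "").strip() for ex in seed_examples)


def suggested_location(seed_examples: Iterable[Mapping[str, str]]) -> str:
    """Hypothetical helper: pick the taxonomy location based on context use."""
    if needs_grounded_directory(seed_examples):
        return GROUNDED_ROOT
    return UNGROUNDED_PATH
```

Under these assumptions, moving the previous queries into a context field would also move the skill out of the recommended path and under the grounded directory.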

Labels
community-build-ready Triage Team has signed off for synthetic data generation skill (Auto labeled)

4 participants