about <u_quote></u_quote> #2
We do normalise the text by stripping punctuation and lowercasing. This catches many of the cases where the LLM doesn't give the exact quote. However, you're right that sometimes small, meaningless differences will cause a quote to be marked as unverified, and this does affect the final judgement. We decided that we're OK with this, since as debaters get stronger (e.g. a better base model or higher BoN) they should be incentivised to use more accurate quotes. Since BoN uses the judge prompt, if a debater uses slightly wrong quotes then the judge won't be as persuaded by that argument, so BoN will incentivise verified quotes.
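For illustration, here is a minimal sketch of this kind of normalisation plus direct string matching. The function names and the exact normalisation steps are assumptions for illustration, not the repository's actual implementation:

```python
import string

def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace before comparison."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def quote_is_verified(quote: str, story: str) -> bool:
    """A quote passes verification if it appears verbatim in the normalised story."""
    return normalize(quote) in normalize(story)
```

Under this scheme a quote that only differs in casing or punctuation still verifies, but a paraphrase (even a meaning-preserving one) does not.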
Thanks for your kind explanation!
If you look at our "expert" baseline in Figure 1, that shows the accuracy when the judge is provided with the story. It is of course higher, but we're using this toy setting with information asymmetry to understand how a non-expert could supervise stronger models in the future, when we don't have ground-truth labels. In that setting you wouldn't be able to calculate an accuracy on your task, so the question becomes how we can use humans to supervise such a system so that it gets better. Debate is one way that could be possible.
We provide background information such as the fact that they are in a debate setting, that they will see arguments from two debaters, and that the debaters are arguing over a question with two answer choices. Providing context here does help judge accuracy, and explaining the quoting system helps too. We have prompt ablations in the Appendix if you're interested in the other prompt engineering we did. If by background information you mean the story, then see my answer to your first question.
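To make the kind of context given to the judge concrete, here is an illustrative sketch of how such a prompt could be assembled. The wording and structure below are assumptions for illustration only; the actual prompts and their ablations are described in the Appendix:

```python
def build_judge_prompt(question: str, answer_a: str, answer_b: str, transcript: str) -> str:
    """Prepend background context and an explanation of the quote tags to the transcript."""
    background = (
        "You will judge a debate between two debaters arguing over a question "
        "with two answer choices. Quotes found verbatim in the story are marked "
        "as verified; quotes in <u_quote> tags could not be verified."
    )
    return (
        f"{background}\n\n"
        f"Question: {question}\n"
        f"A: {answer_a}\n"
        f"B: {answer_b}\n\n"
        f"Debate transcript:\n{transcript}\n\n"
        "Which answer is more likely to be correct? Answer with A or B."
    )
```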
Thanks a lot. I am now considering applying your system to court debates, where the 'stories' would be various legal statutes. The prosecutor and the lawyer would debate, and logically the judge would also understand these 'legal statutes (stories)'. Therefore, I am considering that the judge should be able to see both the debate transcript and the 'stories'.
So, what you meant earlier is that the expert in Figure 1 is the judge seeing both the transcript and the stories? (Previously, I understood that the judge only saw the stories, without the context of the debate.)
That is cool! Keep us in the loop. The expert in Figure 1 is just the judge seeing the story; it doesn't see the debate. We don't run the baseline of story + debates, which is what you want to run. I would expect that this won't help accuracy, since the debates will confuse the LLM more than just working it out from the story (see Appendix C of our write-up on self-improvement with debates). Remember that our primary motivation for this work is scalable oversight, so in the future the judge won't get access to the information / capability of the expert models that are debating. But for your use case (if you don't care about scalable oversight) it seems interesting to try.
Of course, and thanks again!
Hi
I have a question about the <u_quote></u_quote> tag. You mentioned in your prompt that a quote's tag is changed to <u_quote></u_quote> when it doesn't pass verification through direct string matching.
Sometimes, LLMs (like GPT-4) select quotes from stories that do not exactly match the original strings, even though their meanings are consistent. In such cases, would tagging these strings as 'u_quote' affect the judge's final judgement?
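For context, here is an illustrative sketch of the substitution being described. The original quote tag name (<quote>) and the helper names are assumptions, since only <u_quote> is mentioned in this thread; this is not the repository's actual code:

```python
import re
import string

def _normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace before comparison."""
    return " ".join(text.lower().translate(str.maketrans("", "", string.punctuation)).split())

def retag_unverified(argument: str, story: str) -> str:
    """Rewrap any quote that fails direct string matching against the story in <u_quote> tags."""
    def check(match: re.Match) -> str:
        quote = match.group(1)
        if _normalize(quote) in _normalize(story):
            return match.group(0)  # verified: keep the debater's original tag
        return f"<u_quote>{quote}</u_quote>"  # unverified: the judge sees the <u_quote> tag

    # Assumes debaters wrap their quotes in a hypothetical <quote>...</quote> tag.
    return re.sub(r"<quote>(.*?)</quote>", check, argument, flags=re.DOTALL)
```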