[Examples] boiler plate code for multi-turn reward for RLHF #2467

Open · wants to merge 1 commit into base: main

Conversation

rghosh08 commented Oct 5, 2024

Description

This PR addresses: [Feature Request] multi-turn reward for RLHF #2271

This PR implements the reward system for multi-turn reinforcement learning from human feedback (RLHF), following the approach outlined in the paper Multi-turn Reinforcement Learning from Preference Human Feedback. The key change is a simulated multi-turn dialogue environment in which human feedback serves as the reward signal guiding policy learning; the policy is trained with policy gradient methods, updating on the feedback provided at each turn.

Changes include (see the sketch after this list):

  • A multi-turn dialogue environment that simulates five turns of conversation, with human feedback as rewards.
  • A policy network to generate responses in the dialogue based on the current conversation state, where input states are padded/truncated to ensure proper dimensionality.
  • A reward mechanism that simulates human feedback by randomly assigning a reward of +1 (positive feedback) or -1 (negative feedback) at each turn.
  • A training loop using policy gradients to train the policy network based on discounted cumulative rewards.
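
For reference, the sketch below is a minimal, self-contained approximation of these four pieces, not the code in the commit; all names (MultiTurnDialogueEnv, PolicyNetwork, STATE_DIM, NUM_ACTIONS) and hyperparameter values are illustrative assumptions.

```python
# Minimal sketch of the boilerplate described above (illustrative names, not the exact PR code).
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM = 16      # fixed size the conversation state is padded/truncated to
NUM_ACTIONS = 8     # size of the (toy) response vocabulary
NUM_TURNS = 5       # turns per simulated dialogue
GAMMA = 0.99        # discount factor for cumulative rewards


class MultiTurnDialogueEnv:
    """Simulates a 5-turn dialogue; 'human feedback' is a random +1/-1 reward per turn."""

    def reset(self):
        self.turn = 0
        self.state = torch.zeros(STATE_DIM)
        return self.state

    def step(self, action):
        # Append the chosen response id to the conversation state, then pad/truncate to STATE_DIM.
        history = torch.cat([self.state[self.state != 0], torch.tensor([float(action) + 1.0])])
        padded = torch.zeros(STATE_DIM)
        padded[: min(len(history), STATE_DIM)] = history[-STATE_DIM:]
        self.state = padded
        # Simulated human feedback: +1 (positive) or -1 (negative), chosen at random.
        reward = float(torch.randint(0, 2, (1,)).item() * 2 - 1)
        self.turn += 1
        done = self.turn >= NUM_TURNS
        return self.state, reward, done


class PolicyNetwork(nn.Module):
    """Maps the padded conversation state to a distribution over candidate responses."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, NUM_ACTIONS))

    def forward(self, state):
        return F.softmax(self.net(state), dim=-1)


def train(num_episodes=200):
    env, policy = MultiTurnDialogueEnv(), PolicyNetwork()
    optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(num_episodes):
        state, log_probs, rewards, done = env.reset(), [], [], False
        while not done:
            dist = torch.distributions.Categorical(policy(state))
            action = dist.sample()
            log_probs.append(dist.log_prob(action))
            state, reward, done = env.step(action.item())
            rewards.append(reward)
        # Discounted cumulative rewards (returns), computed backwards over the episode.
        returns, g = [], 0.0
        for r in reversed(rewards):
            g = r + GAMMA * g
            returns.insert(0, g)
        returns = torch.tensor(returns)
        # REINFORCE loss: weight log-probabilities by the returns and descend the negation.
        loss = -(torch.stack(log_probs) * returns).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    train()
```

The loop is plain REINFORCE: log-probabilities of the sampled responses are weighted by the discounted returns computed from the simulated feedback.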

Motivation and Context

This change is necessary to replicate the reward structure proposed in the referenced paper, implementing multi-turn RLHF in a way that closely follows the described methodology. It introduces the simulation of human preferences, which plays a key role in the learning process. This change also resolves issue #2271, which proposed adding this reward mechanism to the project.

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.

  • I have read the CONTRIBUTION guide (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

pytorch-bot commented Oct 5, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2467

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 5, 2024
@vmoens vmoens added the enhancement New feature or request label Oct 8, 2024
@vmoens vmoens changed the title boiler plate code for multi-turn reward for RLHF [Examples] boiler plate code for multi-turn reward for RLHF Oct 8, 2024
rghosh08 (Author) commented

@vmoens I was wondering when this PR could be merged. Please let me know if there are any gaps. Thanks

vmoens (Contributor) commented Oct 30, 2024

Hi @rghosh08
Thanks for working on this.
I appreciate the effort in doing this and it'd be awesome to have some version of this in the lib.

The current script doesn't integrate any component of the library and is therefore of limited value within torchrl.
The RLHF examples we provide are there to show how the library's components can be used to achieve a specific goal.
I feel the code as it stands would be an odd one out with limited benefit for the user base.
If you're interested, I'd invite you to work on editing this for better integration within the lib!

Thanks again for collaborating
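
As an illustration of the kind of integration suggested here (not part of the PR or of this comment), the dialogue environment could be expressed as a torchrl EnvBase so that the library's collectors and objectives can consume it. The sketch below is untested; DialogueEnv is a hypothetical name, and the spec classes (CompositeSpec, DiscreteTensorSpec, UnboundedContinuousTensorSpec) follow the torchrl API around late 2024, so exact names and required keys may differ across versions.

```python
# Rough, untested sketch: wrapping the simulated dialogue as a torchrl EnvBase.
import torch
from tensordict import TensorDict
from torchrl.envs import EnvBase
from torchrl.data import CompositeSpec, DiscreteTensorSpec, UnboundedContinuousTensorSpec


class DialogueEnv(EnvBase):
    def __init__(self, num_turns=5, state_dim=16, num_actions=8):
        super().__init__()
        self.num_turns, self.state_dim = num_turns, state_dim
        self.observation_spec = CompositeSpec(
            observation=UnboundedContinuousTensorSpec(shape=(state_dim,))
        )
        self.action_spec = DiscreteTensorSpec(num_actions)
        self.reward_spec = UnboundedContinuousTensorSpec(shape=(1,))

    def _reset(self, tensordict=None):
        self.turn = 0
        return TensorDict({"observation": torch.zeros(self.state_dim)}, batch_size=[])

    def _step(self, tensordict):
        action = tensordict["action"]  # chosen response id (unused in this placeholder)
        self.turn += 1
        # Simulated human feedback, as in the boilerplate: random +1 / -1 per turn.
        reward = torch.randint(0, 2, (1,)).float() * 2 - 1
        done = torch.tensor([self.turn >= self.num_turns])
        obs = torch.randn(self.state_dim)  # placeholder next conversation state
        return TensorDict(
            {"observation": obs, "reward": reward, "done": done, "terminated": done},
            batch_size=[],
        )

    def _set_seed(self, seed):
        torch.manual_seed(seed)
```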

rghosh08 (Author) commented Nov 6, 2024

Thanks @vmoens for your feedback. I will come up with an integration. Appreciate your guidance.
