PPOTrainer for models #398
H2O LLM Studio uses a modified version of the trl trainer in the background when the RLHF problem type is selected. See h2o-llmstudio/llm_studio/src/trl/trainer.py, lines 1 to 18 in bf667d0.
Does llmstudio train using the PPOTrainer from the trl package, or can that be done afterwards somehow?