PPOTrainer for models #398

JarcauCristian · 2023-08-30T08:06:55Z

Does llmstudio train using the PPOTrainer from the trl package, or can that be done afterwards somehow?

pascal-pfeiffer (Maintainer) · 2023-09-13T15:35:20Z

H2O LLM Studio uses a modified version of the trl trainer in the background when the RLHF problem type is selected:

h2o-llmstudio/llm_studio/src/trl/trainer.py

Lines 1 to 18 in bf667d0

    
    # This file borrows large pieces from the trl library, which is licensed under
    # the Apache 2.0 license.
    # https://github.com/lvwerra/trl/blob/main/trl/trainer/ppo_trainer.py
    # Copyright 2022 The HuggingFace Team. All rights reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
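
For readers unfamiliar with what a PPO trainer optimizes, here is a minimal, dependency-free sketch of the clipped surrogate policy loss that PPO is built around. It is not the code in llm_studio/src/trl/trainer.py and not trl's implementation; the function name `ppo_clipped_loss` and the `clip_range` default are hypothetical choices made for illustration only.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_range=0.2):
    """Mean clipped-surrogate policy loss over a batch of sampled actions.

    Hypothetical helper for illustration; not taken from trl or H2O LLM Studio.
    """
    losses = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the updated policy and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_range, 1 + clip_range].
        clipped = max(1.0 - clip_range, min(1.0 + clip_range, ratio))
        # PPO maximizes min(ratio * adv, clipped * adv); negate it to get a loss.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Identical policies give a ratio of 1, so the loss is the negated advantage.
    print(ppo_clipped_loss([0.0], [0.0], [1.0]))
```

The clipping removes the incentive to move the policy far from the one that generated the samples: once the ratio leaves the clip range in the direction the advantage favors, the loss stops improving.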

1 reply

JarcauCristian Sep 13, 2023
Author

Alright! Thank you very much!
