Replies: 8 comments 1 reply
-
Slow down, experts, let me get through the basics first 😭
-
I'll claim this paper: Aligning Language Models with Offline Reinforcement Learning from Human Feedback. I'll study it and share a PPT when it's ready.
-
I'll claim this paper: DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
-
I'll claim the OpenAI paper: "Training language models to follow instructions with human feedback". Let's do a PPT roundtable session together when the time comes.
-
I'll claim this paper: Improving alignment of dialogue agents via targeted human judgements
-
Would you consider supporting RLHF-V, which handles vision?
-
Discuss the Implementation of RLHF