Question: the final loss is not computed at reward_token_id during training, so why can inference rely on reward_token_id? #921

Open
woshixiaobai2019 opened this issue Sep 1, 2024 · 6 comments

@woshixiaobai2019

def compute_loss(self, data, labels=None):

@tcxia

tcxia commented Sep 2, 2024

Hello, may I ask what you are using to run inference?

@woshixiaobai2019
Author

Sorry, this is resolved. I read through the source code carefully and there is no problem.

@tcxia

tcxia commented Sep 2, 2024

@woshixiaobai2019 What I actually wanted to ask is how you run inference on your side. My inference keeps failing with errors.

@woshixiaobai2019
Author

> @woshixiaobai2019 What I actually wanted to ask is how you run inference on your side. My inference keeps failing with errors.

I followed the reward model inference code in modeling_internlm2.py.

@tcxia

tcxia commented Sep 2, 2024

@woshixiaobai2019 Could you share the full path for reference? Thanks a lot~

@woshixiaobai2019
Author

> @woshixiaobai2019 Could you share the full path for reference? Thanks a lot~

https://huggingface.co/internlm/internlm2-1_8b-reward/blob/main/modeling_internlm2.py

See the forward function of the reward model in that file.
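
For anyone landing on this thread, here is a rough sketch of the general idea of reading a reward score at the reward-token position. It is not the code from modeling_internlm2.py. It assumes the reward token is the last non-padding token of each sequence and that a value head has already produced one scalar per token. The function name `last_token_scores` and the toy numbers are hypothetical.

```python
from typing import List


def last_token_scores(
    per_token_scores: List[List[float]],
    attention_mask: List[List[int]],
) -> List[float]:
    """For each sequence, return the score at its last attended position.

    Assumption (not taken from the thread): the reward token sits at the
    last position whose attention-mask value is 1, so the score read here
    is the score at the reward token.
    """
    picked = []
    for scores, mask in zip(per_token_scores, attention_mask):
        if len(scores) != len(mask):
            raise ValueError("scores and mask must have the same length")
        # Indices of real (non-padding) tokens; works for left or right padding.
        attended = [i for i, m in enumerate(mask) if m == 1]
        if not attended:
            raise ValueError("sequence has no attended tokens")
        picked.append(scores[attended[-1]])
    return picked


# Toy batch: two right-padded sequences with hypothetical per-token scores.
toy_scores = [[0.1, 0.4, 0.9, 0.0], [0.2, -0.3, 0.0, 0.0]]
toy_mask = [[1, 1, 1, 0], [1, 1, 0, 0]]
print(last_token_scores(toy_scores, toy_mask))  # [0.9, -0.3]
```

If training computes its loss from scores selected at this same position, then inference reading that position is consistent with training. Whether that holds for a given codebase has to be checked against its own `compute_loss` and `forward`.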
