
[Fix] fix initialization of ref_llm for full param dpo training with zero-3 #778

Merged · 9 commits · Jul 19, 2024

Conversation

@xu-song (Contributor) commented on Jun 17, 2024

ref_llm cannot be instantiated during full-parameter DPO training with ZeRO-3.

This pull request enables instantiation of ref_llm with the following config:

# Excerpt from the XTuner DPO config; the imports (torch, AutoModelForCausalLM, DPO)
# and the dpo_loss_type / loss_beta / label_smoothing / use_varlen_attn /
# pretrained_model_name_or_path variables are defined earlier in the config file.
model = dict(
    type=DPO,
    loss_type=dpo_loss_type,
    use_varlen_attn=use_varlen_attn,
    beta=loss_beta,
    label_smoothing=label_smoothing,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16,
    ),
    ref_llm=dict(  # explicit initialization of ref_llm, enabled by this PR
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16,
    ),
)
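For context, here is a minimal sketch of what the ref_llm entry is meant to achieve: build a second copy of the base model, freeze it, and use it only to compute reference log-probabilities in the DPO loss. The helper name and the freezing details below are illustrative assumptions, not the exact code added in this PR.

import torch
from transformers import AutoModelForCausalLM

def build_ref_llm(pretrained_model_name_or_path: str) -> torch.nn.Module:
    # Hypothetical helper: instantiate the reference model the same way as
    # the policy model, then freeze it so it is never updated.
    ref_llm = AutoModelForCausalLM.from_pretrained(
        pretrained_model_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16,
    )
    ref_llm.requires_grad_(False)  # reference log-probs only, no gradients
    ref_llm.eval()
    return ref_llm

Under ZeRO-3 full-parameter training, instantiating this second model is the step that previously failed; the PR makes the ref_llm config above usable in that setting.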

@xu-song changed the title from "[Fix] fix initialization of ref_llm" to "[Fix] fix initialization of ref_llm for full param dpo training with zero-3" on Jun 18, 2024
@RangiLyu (Contributor) commented:
Thank you for your contribution! It appears that this fix might not fully address the issue when using var_len_attention. I noticed another solution in PR #781, which seems to be a more comprehensive approach.

@xu-song (Contributor, Author) commented on Jun 18, 2024

@RangiLyu I checked the PR you mentioned above. However, it's not a good idea to create another SupervisedFinetune instance there, since it pulls in some irrelevant operations.
Following your advice, var_len_attention is now supported in this PR.

@pppppM (Collaborator) commented on Jul 9, 2024

@xu-song Great PR! There are conflicts that need to be resolved.

@xu-song (Contributor, Author) commented on Jul 9, 2024

Conflicts have been resolved.

@HIT-cwh merged commit ba7afc7 into InternLM:main on Jul 19, 2024 (2 of 3 checks passed).