
Update ORT training to be compatible with transformers 4.31 #1227

Merged — 4 commits from update-ort-trainer-431 into main, Aug 1, 2023

Conversation

@JingyaHuang (Collaborator) commented Jul 25, 2023

What does this PR do?

As per title.

  • ORTTrainer
  • ORTTrainingArguments
  • ORTSeq2SeqTrainer
  • ORTSeq2SeqTrainingArguments

Possible follow-up PRs

These follow-ups concern the inference (evaluation / prediction) path of the trainer APIs. Since the Trainer APIs are primarily used for training, and ORT inference can always be done with the ORTModel classes in the Optimum library, there is no priority or ETA for them. If anyone is interested in contributing, please feel free to open a PR and tag me for review. I am willing to handle them, but I can't say when I will have the bandwidth...

  • Improve the export of models
  • Support merged decoder
  • Register more tasks supported by Optimum right now

@HuggingFaceDocBuilderDev commented Jul 26, 2023

The documentation is not available anymore as the PR was closed or merged.

@prathikr (Contributor)

Related Issue: #1133

@JingyaHuang (Collaborator, Author)

Gently pinging @pacman100 for the context.

The current issue I'm hitting with the optimizer when using DeepSpeed ZeRO stage 2:

----------------------------------------------------------------------
Traceback (most recent call last):
  File "/workspace/optimum/test_onnxruntime_train.py", line 135, in test_ort_trainer_encoder
    train_result = trainer.train()
  File "/workspace/optimum/optimum/onnxruntime/trainer.py", line 455, in train
    return inner_training_loop(
  File "/workspace/optimum/optimum/onnxruntime/trainer.py", line 815, in _inner_training_loop
    self.optimizer.step()
AttributeError: 'DummyOptim' object has no attribute 'step'

----------------------------------------------------------------------
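For context on the traceback: under DeepSpeed ZeRO, accelerate hands the trainer a placeholder optimizer (its `DummyOptim`) that only carries configuration for the DeepSpeed engine, which is supposed to perform the actual stepping. Calling `.step()` on the placeholder directly therefore raises exactly this `AttributeError`. Below is a minimal, self-contained sketch of the failure mode — the `DummyOptim` and `optimizer_step` here are hypothetical stand-ins for illustration, not accelerate's or the trainer's actual code:

```python
class DummyOptim:
    """Hypothetical stand-in for accelerate's DummyOptim placeholder.

    It only records optimizer config for the DeepSpeed engine and
    deliberately implements no step() / zero_grad().
    """

    def __init__(self, params, lr=1e-3):
        self.params = list(params)
        self.lr = lr


def optimizer_step(optimizer):
    """Illustrative guard around the trainer's self.optimizer.step() call."""
    if not hasattr(optimizer, "step"):
        # Placeholder optimizer: stepping must be deferred to the
        # DeepSpeed engine rather than invoked here.
        return "deferred to DeepSpeed engine"
    optimizer.step()
    return "stepped directly"


print(optimizer_step(DummyOptim(params=[])))  # → deferred to DeepSpeed engine
```

In the actual fix, no such manual guard is needed in user code: once accelerate prepares the model under DeepSpeed, the engine owns the optimizer step, so the trainer must avoid calling `.step()` on the placeholder itself.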

@prathikr (Contributor)

@JingyaHuang any updates?

@kshama-msft

@JingyaHuang @pacman100 we are waiting on this issue to unblock integrations within our team. It would be great if it could be fast-tracked. Thanks!

@JingyaHuang (Collaborator, Author)

Thanks @pacman100 for helping out!

So @prathikr, @kshama-msft, with the help of @pacman100 from the accelerate team, ORTTrainer is now compatible with transformers 4.31 and accelerate 0.10. Could you review and try out my branch to confirm the fix? Thanks!

@prathikr (Contributor)

prathikr commented Aug 1, 2023

Thank you @JingyaHuang this PR resolved my issue. Please merge ASAP.

@JingyaHuang JingyaHuang merged commit 5730bd2 into main Aug 1, 2023
62 of 66 checks passed
@JingyaHuang JingyaHuang deleted the update-ort-trainer-431 branch August 1, 2023 20:49
@kshama-msft

Thanks Jingya for the prompt fix!

4 participants