test_examples uses ExampleRunner #227

Merged
merged 14 commits into from
Sep 14, 2023

michaelbenayoun
Member

@michaelbenayoun michaelbenayoun commented Sep 12, 2023

  • Updates ExampleRunner so that it can be used both for filling the Neuron cache and for testing examples
  • Updates the tests/test_examples.py file
    • "Tiny" versions of the models can be run by setting the flag RUN_TINY=1. The example then runs with the same model, except that it contains far fewer blocks.
    • Each model, for a given task, is tested with different sharding strategies: no sharding, ZeRO-1, TP, TP + ZeRO-1 (when it makes sense)
    • Each model has a "coverage" level associated with it: low, middle, or high. Setting the COVERAGE flag makes it possible to test only a subset of the models. The coverage of a model is defined by its uniqueness. Ideally, running tests/test_examples.py with COVERAGE=high should cover most of the use cases.
  • Checks that the training loss is decreasing
  • The test example workflow is now triggered manually
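
For readers skimming the description, here is a rough sketch of how the two flags, the sharding strategies, and the loss check could fit together. It is not the code from this PR: `select_test_cases`, `loss_is_decreasing`, the shape of the `models` mapping, and the placeholder model names are all hypothetical. It also assumes that COVERAGE keeps only models whose level matches exactly, that an unset COVERAGE keeps every model, and that "when it makes sense" means the model supports TP.

```python
import os

# Only the RUN_TINY and COVERAGE environment variables and the list of
# strategies come from the PR description; every other name is hypothetical.
STRATEGIES = ("no sharding", "ZeRO-1", "TP", "TP + ZeRO-1")


def select_test_cases(models, env=None):
    """Build (model, strategy, tiny) test cases from the two flags.

    `models` maps a model name to a (coverage, supports_tp) pair.
    Assumption: COVERAGE keeps models whose level matches exactly,
    and leaving it unset keeps every model.
    """
    env = os.environ if env is None else env
    tiny = env.get("RUN_TINY", "0") == "1"
    wanted = env.get("COVERAGE")
    cases = []
    for name, (coverage, supports_tp) in models.items():
        if wanted is not None and coverage != wanted:
            continue
        for strategy in STRATEGIES:
            # Assumption: TP-based strategies are skipped for models
            # that do not support tensor parallelism.
            if "TP" in strategy and not supports_tp:
                continue
            cases.append((name, strategy, tiny))
    return cases


def loss_is_decreasing(losses, window=2):
    """Compare the mean of the first and last `window` logged losses.

    Assumption: this averaging rule is illustrative; the PR does not
    say how the decrease is measured.
    """
    if len(losses) < 2 * window:
        raise ValueError("not enough logged losses to compare")
    head = sum(losses[:window]) / window
    tail = sum(losses[-window:]) / window
    return tail < head


# Placeholder models: one "high" coverage model with TP support,
# one "low" coverage model without it.
example_models = {"model-a": ("high", True), "model-b": ("low", False)}
example_cases = select_test_cases(
    example_models, env={"RUN_TINY": "1", "COVERAGE": "high"}
)
```

Passing `env` explicitly keeps the sketch easy to test without touching the real environment. In an actual test module the cases would more likely feed a parametrized test.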

In following PRs:

  • Validation will be fixed and performed: each evaluation score will be compared to a reference value, either hardcoded manually or computed by running a similar training job with Transformers on GPU.
  • Enable ZeRO-1 tests once Parallel cross entropy #222 is merged.
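
One possible shape for the planned reference comparison is sketched below. The helper name `score_matches_reference` and the 5% relative tolerance are assumptions made for illustration. The description above does not specify how close a score must be to its reference value.

```python
import math


def score_matches_reference(score, reference, rel_tol=0.05):
    """Return True when the two values agree within `rel_tol`.

    math.isclose measures the gap relative to the larger of the two
    absolute values. The 5% default is an assumption, not a value
    taken from the PR.
    """
    return math.isclose(score, reference, rel_tol=rel_tol)
```

A relative tolerance leaves room for small numerical differences between a Neuron run and a GPU reference run while still catching clear regressions.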

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@michaelbenayoun michaelbenayoun changed the title test_examples use ExampleRunner test_examples uses ExampleRunner Sep 14, 2023
Collaborator

@JingyaHuang JingyaHuang left a comment

LGTM!! Would be great if we can have all green CIs from now!

Collaborator

@dacorvo dacorvo left a comment

LGTM !

@michaelbenayoun michaelbenayoun merged commit 0cb9880 into main Sep 14, 2023
6 of 13 checks passed
@michaelbenayoun michaelbenayoun deleted the update_test_examples branch September 14, 2023 14:21
4 participants