
Set up tgi environment values with the ones used to build the model #529

Merged · 11 commits · Apr 9, 2024

Conversation

@oOraph (Contributor) commented on Mar 26, 2024

Need this to work around the model's static params, so that the docker entrypoint can adapt the TGI environment according to the specified model. This will make the image easier to use: the default params (i.e. not specifying anything) should be enough for most models.
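To illustrate the idea, here is a minimal sketch of what an entrypoint helper in the spirit of tgi_env.py might do: read the static neuron params baked into the exported model's config and derive matching TGI env vars, deferring to anything the user set explicitly. The config keys and env var names below are assumptions for illustration, not necessarily the exact ones used by optimum-neuron.

```python
# Hypothetical sketch: derive TGI env vars from a model's neuron config.
# Key names and env var names are illustrative assumptions.
import json

NEURON_TO_TGI_ENV = {
    "batch_size": "MAX_BATCH_SIZE",
    "sequence_length": "MAX_TOTAL_TOKENS",
    "num_cores": "HF_NUM_CORES",
}

def derive_tgi_env(config_path: str, environ: dict) -> dict:
    """Return env vars to add, honoring any values already set by the user."""
    with open(config_path) as f:
        config = json.load(f)
    neuron = config.get("neuron", {})
    derived = {}
    for key, env_var in NEURON_TO_TGI_ENV.items():
        # Only fill in values the user did not specify themselves.
        if key in neuron and env_var not in environ:
            derived[env_var] = str(neuron[key])
    return derived
```

With such a helper, the entrypoint can export the derived variables before launching the TGI server, which is why defaults are enough for most models.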

@oOraph oOraph marked this pull request as draft March 26, 2024 17:15
@oOraph oOraph force-pushed the dev/tgi_env branch 2 times, most recently from e84cfcc to 07910bd Compare March 29, 2024 14:20
@oOraph oOraph marked this pull request as ready for review March 29, 2024 14:20
@oOraph (Contributor, Author) commented on Mar 29, 2024

Note: the associated generated image is available for testing here :)
docker.io/raphael31415/neuronx-tgi:0.0.21.dev0

@dacorvo (Collaborator) left a comment


This looks good to me, but I am a bit worried some configurations might not work.
Could you add integration tests under https://github.com/huggingface/optimum-neuron/tree/main/text-generation-inference/integration-tests ?
I also need to add a github workflow to build the image and run the integration tests (make tgi_docker_test).

4 review comments on text-generation-inference/tgi_env.py (outdated, resolved)
@oOraph (Contributor, Author) commented on Apr 4, 2024

> This looks good to me, but I am a bit worried some configurations might not work. Could you add integration tests under https://github.com/huggingface/optimum-neuron/tree/main/text-generation-inference/integration-tests ? I also need to add a github workflow to build the image and run the integration tests (make tgi_docker_test).

Done, both: tgi_implicit_env.py and the workflow.

@oOraph oOraph requested a review from dacorvo April 4, 2024 08:42
@oOraph oOraph marked this pull request as draft April 4, 2024 12:19
@oOraph (Contributor, Author) commented on Apr 4, 2024

Actually, I removed the workflow: the integration test test_gpt2.py cannot work for the local_neuron variant. Reason:

Some directory is filled with data here:

huggingface_hub.snapshot_download(NEURON_MODEL_ID, local_dir=local_path)

then this directory is expected to be shared with the docker container, here:

The problem is that this cannot work if the tests are run within a container in a docker-in-docker (dind) environment: the volume filled in the first container won't be available on the host, and thus won't end up in the second container (hence TGI will launch with an empty dir).

-> So either we remove the local_neuron variant, or we find a way to share the volume between the container running pytest and the one spawned by pytest.
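The pitfall can be modeled in a few lines. This toy sketch (all names illustrative, no real docker involved) captures why the bind mount resolves against the host filesystem under dind, so the spawned container sees an empty directory:

```python
# Toy model of the docker-in-docker bind-mount pitfall.
# Filesystems are modeled as dicts mapping paths to directory contents.

def resolve_bind_mount(source_path, host_fs, client_fs, use_dind):
    """Return the files the spawned container will see at the mount point.

    With docker-in-docker, the docker CLI talks to the host daemon, so the
    bind-mount source is looked up in the HOST filesystem, not in the
    filesystem of the container that issued `docker run`.
    """
    fs = host_fs if use_dind else client_fs
    return fs.get(source_path, {})  # missing path -> empty directory

# The pytest container downloaded the model into its own filesystem:
client_fs = {"/data/model": {"config.json": "...", "model.neuron": "..."}}
host_fs = {}  # the host never saw that download
```

Running the tests directly on the host (client filesystem == host filesystem) avoids the problem, which is why the failure only shows up in the containerized CI setup.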

@oOraph oOraph force-pushed the dev/tgi_env branch 4 times, most recently from 7aca4a7 to 5bf0f49 Compare April 8, 2024 10:47
@oOraph oOraph marked this pull request as ready for review April 8, 2024 10:47
@oOraph (Contributor, Author) commented on Apr 8, 2024

I had to deactivate/remove all tests related to aws-neuron/gpt2-neuronx-bs4-seqlen1024 because of the neuronx-cc upgrade to v2.13.xxx.

Need this to workaround the model static params, for the docker entrypoint to adapt tgi environment accordingly to the specified model
This will make usage of the image easier: default params (e.g not specifying anything) should be enough for most models

Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
…del within an image built on the flight

Signed-off-by: Raphael Glon <[email protected]>
@dacorvo (Collaborator) left a comment


LGTM, thanks. Before merging, could you:

  • bump dev version,
  • use a single workflow for TGI,
  • simplify a bit the implicit env test to just check you received a response to a single request.

bump dev version,
use a single workflow for TGI,
simplify a bit the implicit env test

Signed-off-by: Raphael Glon <[email protected]>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@dacorvo (Collaborator) left a comment


Thank you very much for the pull request!

Review comment on .github/workflows/test_inf2_tgi.yml (outdated, resolved)
@@ -77,6 +80,9 @@ RUN VERBOSE=1 BUILDDIR=/pyserver/build PROTODIR=/pyserver/proto VERSION=${VERSIO
# Neuron base image (used for deployment)
FROM base AS neuron

ARG VERSION
@dacorvo (Collaborator):

Do we really need to repeat VERSION multiple times?

@oOraph (Contributor, Author):

Yep, and I wasn't even able to fully explain it: without it, the docker build process on GitHub CI failed, as you can see here:
https://github.com/huggingface/optimum-neuron/actions/runs/8552046964/job/23432354170
My guess is that it's due to the buildkit version... not 100% sure though.
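For context, Docker's documented ARG scoping in multi-stage builds likely explains the need to repeat it: an ARG goes out of scope at each FROM, so every stage that uses it must re-declare it. A minimal sketch (stage names and commands are illustrative, not the actual Dockerfile):

```dockerfile
# ARG declared before the first FROM is only usable in FROM lines.
ARG VERSION

FROM ubuntu:22.04 AS base
ARG VERSION            # re-declare to use it inside this stage
RUN echo "building ${VERSION}"

FROM base AS neuron
ARG VERSION            # must repeat again: the previous scope ended at FROM
RUN echo "deploying ${VERSION}"
```

Whether a missing re-declaration fails the build or just silently expands to an empty string can depend on how the variable is used, which would be consistent with the behavior differing across buildkit versions.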

Signed-off-by: Raphael Glon <[email protected]>
@oOraph (Contributor, Author) commented on Apr 9, 2024

Side note: I bumped the version to 0.0.22.dev0. This will temporarily break the integration tests, as there are no compatible cached models for the CI yet (gpt2 compiled with neuronx-cc 2.13.66.0+6dfecc895 on 1 or 2 cores).

@dacorvo dacorvo mentioned this pull request Apr 9, 2024
3 tasks
@dacorvo dacorvo merged commit bb1cc96 into huggingface:main Apr 9, 2024
8 of 11 checks passed
Labels: none yet
Projects: none yet
3 participants