
Set up tgi environment values with the ones used to build the model #529

Merged · 11 commits · Apr 9, 2024

Conversation

@oOraph (Contributor) commented on Mar 26, 2024

Need this to work around the model's static params, so that the docker entrypoint can adapt the TGI environment according to the specified model. This will make the image easier to use: the default params (i.e. not specifying anything) should be enough for most models.
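To illustrate the idea, here is a minimal sketch of what an entrypoint helper in the spirit of tgi_env.py might do: read the static neuron params baked into the exported model's config and derive matching TGI env vars, deferring to anything the user set explicitly. The config keys and env var names below are assumptions for illustration, not necessarily the exact ones used by optimum-neuron.

```python
# Hypothetical sketch: derive TGI env vars from a model's neuron config.
# Key names and env var names are illustrative assumptions.
import json

NEURON_TO_TGI_ENV = {
    "batch_size": "MAX_BATCH_SIZE",
    "sequence_length": "MAX_TOTAL_TOKENS",
    "num_cores": "HF_NUM_CORES",
}

def derive_tgi_env(config_path: str, environ: dict) -> dict:
    """Return env vars to add, honoring any values already set by the user."""
    with open(config_path) as f:
        config = json.load(f)
    neuron = config.get("neuron", {})
    derived = {}
    for key, env_var in NEURON_TO_TGI_ENV.items():
        # Only fill in values the user did not specify themselves.
        if key in neuron and env_var not in environ:
            derived[env_var] = str(neuron[key])
    return derived
```

With such a helper, the entrypoint can export the derived variables before launching the TGI server, which is why defaults are enough for most models.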

@oOraph oOraph marked this pull request as draft March 26, 2024 17:15
@oOraph oOraph force-pushed the dev/tgi_env branch 2 times, most recently from e84cfcc to 07910bd Compare March 29, 2024 14:20
@oOraph oOraph marked this pull request as ready for review March 29, 2024 14:20
@oOraph (Contributor, Author) commented on Mar 29, 2024

Note: the associated generated image is available for testing here :)
docker.io/raphael31415/neuronx-tgi:0.0.21.dev0

@dacorvo (Collaborator) left a comment


This looks good to me, but I am a bit worried some configurations might not work.
Could you add integration tests under https://github.com/huggingface/optimum-neuron/tree/main/text-generation-inference/integration-tests ?
I also need to add a github workflow to build the image and run the integration tests (make tgi_docker_test).

4 review comments on text-generation-inference/tgi_env.py (outdated, resolved)
@oOraph (Contributor, Author) commented on Apr 4, 2024

> This looks good to me, but I am a bit worried some configurations might not work. Could you add integration tests under https://github.com/huggingface/optimum-neuron/tree/main/text-generation-inference/integration-tests ? I also need to add a github workflow to build the image and run the integration tests (make tgi_docker_test).

Done, both: tgi_implicit_env.py and the workflow.

@oOraph oOraph requested a review from dacorvo April 4, 2024 08:42
@oOraph oOraph marked this pull request as draft April 4, 2024 12:19
@oOraph (Contributor, Author) commented on Apr 4, 2024

Actually, I removed the workflow: the integration test test_gpt2.py cannot work for the local_neuron variant. Reason:

Some directory is filled with data here:

huggingface_hub.snapshot_download(NEURON_MODEL_ID, local_dir=local_path)

then this directory is expected to be shared with the docker container, here:

The problem is that this cannot work if the tests are run within a container in a docker-in-docker (dind) environment: the volume filled in the first container won't be available on the host, and thus won't end up in the second container (hence TGI will launch with an empty dir).

-> So either we remove the local_neuron variant, or we find a way to share the volume between the container running pytest and the one spawned by pytest.
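The pitfall can be modeled in a few lines. This toy sketch (all names illustrative, no real docker involved) captures why the bind mount resolves against the host filesystem under dind, so the spawned container sees an empty directory:

```python
# Toy model of the docker-in-docker bind-mount pitfall.
# Filesystems are modeled as dicts mapping paths to directory contents.

def resolve_bind_mount(source_path, host_fs, client_fs, use_dind):
    """Return the files the spawned container will see at the mount point.

    With docker-in-docker, the docker CLI talks to the host daemon, so the
    bind-mount source is looked up in the HOST filesystem, not in the
    filesystem of the container that issued `docker run`.
    """
    fs = host_fs if use_dind else client_fs
    return fs.get(source_path, {})  # missing path -> empty directory

# The pytest container downloaded the model into its own filesystem:
client_fs = {"/data/model": {"config.json": "...", "model.neuron": "..."}}
host_fs = {}  # the host never saw that download
```

Running the tests directly on the host (client filesystem == host filesystem) avoids the problem, which is why the failure only shows up in the containerized CI setup.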

@oOraph oOraph force-pushed the dev/tgi_env branch 4 times, most recently from 7aca4a7 to 5bf0f49 Compare April 8, 2024 10:47
@oOraph oOraph marked this pull request as ready for review April 8, 2024 10:47
@oOraph (Contributor, Author) commented on Apr 8, 2024

I had to deactivate/remove all tests related to aws-neuron/gpt2-neuronx-bs4-seqlen1024 because of the neuronx-cc upgrade to v2.13.xxx.

Need this to workaround the model static params, for the docker entrypoint to adapt tgi environment accordingly to the specified model
This will make usage of the image easier: default params (e.g not specifying anything) should be enough for most models

Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
Signed-off-by: Raphael Glon <[email protected]>
…del within an image built on the flight

Signed-off-by: Raphael Glon <[email protected]>
@dacorvo (Collaborator) left a comment


LGTM, thanks. Before merging, could you:

  • bump dev version,
  • use a single workflow for TGI,
  • simplify a bit the implicit env test to just check you received a response to a single request.

bump dev version,
use a single workflow for TGI,
simplify a bit the implicit env test

Signed-off-by: Raphael Glon <[email protected]>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@dacorvo (Collaborator) left a comment


Thank you very much for the pull request!

Review comment on .github/workflows/test_inf2_tgi.yml (outdated, resolved)
@@ -77,6 +80,9 @@ RUN VERBOSE=1 BUILDDIR=/pyserver/build PROTODIR=/pyserver/proto VERSION=${VERSIO
# Neuron base image (used for deployment)
FROM base AS neuron

ARG VERSION
@dacorvo (Collaborator):

Do we really need to repeat VERSION multiple times?

@oOraph (Contributor, Author):

Yep, and I wasn't even able to fully explain it: without it, the docker build process on GitHub CI failed, as you can see here:
https://github.com/huggingface/optimum-neuron/actions/runs/8552046964/job/23432354170
My guess is that it's due to the buildkit version... not 100% sure though.
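For context, Docker's documented ARG scoping in multi-stage builds likely explains the need to repeat it: an ARG goes out of scope at each FROM, so every stage that uses it must re-declare it. A minimal sketch (stage names and commands are illustrative, not the actual Dockerfile):

```dockerfile
# ARG declared before the first FROM is only usable in FROM lines.
ARG VERSION

FROM ubuntu:22.04 AS base
ARG VERSION            # re-declare to use it inside this stage
RUN echo "building ${VERSION}"

FROM base AS neuron
ARG VERSION            # must repeat again: the previous scope ended at FROM
RUN echo "deploying ${VERSION}"
```

Whether a missing re-declaration fails the build or just silently expands to an empty string can depend on how the variable is used, which would be consistent with the behavior differing across buildkit versions.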

Signed-off-by: Raphael Glon <[email protected]>
@oOraph (Contributor, Author) commented on Apr 9, 2024

Side note: I bumped the version to 0.0.22.dev0. This will temporarily break the integration tests, as there are no compatible cached models for the CI yet (gpt2 compiled with neuronx-cc 2.13.66.0+6dfecc895 on 1 or 2 cores).

@dacorvo dacorvo mentioned this pull request Apr 9, 2024
3 tasks
@dacorvo dacorvo merged commit bb1cc96 into huggingface:main Apr 9, 2024
8 of 11 checks passed
Labels: none yet
Projects: none yet
3 participants