Upgrade Neuron SDK to 2.18.0 and TGI to 1.4.5 (fix) #548

Closed
wants to merge 2 commits into from

davidshtian
Contributor

What does this PR do?

  • Upgrade Neuron SDK to 2.18.0
  • Upgrade TGI from 1.4.1 to 1.4.5; otherwise the container image build ("make neuronx-tgi") fails with the following errors:
 => ERROR [builder 10/10] RUN cargo build --release --workspace --exclude benchmark                                                                     12.6s
 => [neuron  1/12] RUN apt-get update -y  && apt-get install -y --no-install-recommends     gnupg2     wget     && rm -rf /var/lib/apt/lists/*     &&   12.4s
 => [pyserver 1/7] RUN apt-get update -y  && apt-get install -y --no-install-recommends     make     python3-venv     && rm -rf /var/lib/apt/lists/*    12.4s
 => [neuron  2/12] RUN echo "deb https://apt.repos.neuron.amazonaws.com jammy main" > /etc/apt/sources.list.d/neuron.list                                0.3s
 => [pyserver 2/7] RUN install -d /pyserver                                                                                                              0.4s
------
 > [builder 10/10] RUN cargo build --release --workspace --exclude benchmark:
0.372 info: syncing channel updates for '1.75.0-x86_64-unknown-linux-gnu'
0.479 info: latest update on 2023-12-28, rust version 1.75.0 (82e1608df 2023-12-21)
0.502 info: downloading component 'clippy'
0.519 info: downloading component 'rustfmt'
0.544 info: installing component 'clippy'
0.802 info: installing component 'rustfmt'
1.134 warning: excluded package(s) `benchmark` not found in workspace `/usr/src`
1.316    Compiling text-generation-client v1.4.1 (/usr/src/router/client)
1.317    Compiling grpc-metadata v0.1.0 (/usr/src/router/grpc-metadata)
1.321    Compiling text-generation-router v1.4.1 (/usr/src/router)
1.322    Compiling text-generation-launcher v1.4.1 (/usr/src/launcher)
2.635    Compiling text-generation-benchmark v1.4.1 (/usr/src/benchmark)
3.734 error[E0432]: unresolved import `nix::sys::signal::Signal`
3.734  --> launcher/src/main.rs:2:30
3.734   |
3.734 2 | use nix::sys::signal::{self, Signal};
3.734   |                              ^^^^^^ no `Signal` in `sys::signal`
3.734   |
3.734   = help: consider importing this type alias instead:
3.734           ctrlc::Signal
3.734
3.735 error[E0432]: unresolved import `nix::unistd::Pid`
3.735    --> launcher/src/main.rs:3:5
3.735     |
3.735 3   | use nix::unistd::Pid;
3.735     |     ^^^^^^^^^^^^^^^^ no `Pid` in `unistd`
3.735     |
3.735 note: found an item that was configured out
3.735    --> /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/nix-0.27.1/src/unistd.rs:183:12
3.735     |
3.735 183 | pub struct Pid(pid_t);
3.735     |            ^^^
3.735     = note: the item is gated behind the `process` feature
3.735
3.750 error[E0425]: cannot find function `kill` in module `signal`
3.750     --> launcher/src/main.rs:1172:13
3.750      |
3.750 1172 |     signal::kill(Pid::from_raw(process.id() as i32), Signal::SIGTERM).unwrap();
3.750      |             ^^^^ not found in `signal`
3.750      |
3.750 note: found an item that was configured out
3.750     --> /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/nix-0.27.1/src/sys/signal.rs:974:8
3.750      |
3.750 974  | pub fn kill<T: Into<Option<Signal>>>(pid: Pid, signal: T) -> Result<()> {
3.750      |        ^^^^
3.750      = note: the item is gated behind the `signal` feature
3.750
4.020 Some errors have detailed explanations: E0425, E0432.
4.020 For more information about an error, try `rustc --explain E0425`.
4.026 error: could not compile `text-generation-launcher` (bin "text-generation-launcher") due to 3 previous errors
4.026 warning: build failed, waiting for other jobs to finish...
------
Dockerfile:42
--------------------
  40 |     COPY --from=tgi /tgi/router router
  41 |     COPY --from=tgi /tgi/launcher launcher
  42 | >>> RUN cargo build --release --workspace --exclude benchmark
  43 |
  44 |     # Python base image
--------------------
ERROR: failed to solve: process "/bin/sh -c cargo build --release --workspace --exclude benchmark" did not complete successfully: exit code: 101
make: *** [Makefile:46: neuronx-tgi] Error 1

Fixes # (issue)
Upgrading TGI to 1.4.5 fixes the build failure shown above.
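
For context, the compiler notes in the log above report that `Pid` is gated behind the `process` feature of nix 0.27.1 and `kill` behind its `signal` feature. A minimal sketch of how those features could be enabled explicitly, assuming one wanted to patch the launcher's manifest instead of upgrading TGI. This PR does not do that, and the exact dependency line in the TGI launcher is an assumption here:

```toml
# Hypothetical excerpt of launcher/Cargo.toml; not part of this PR.
[dependencies]
# Enable the features that the rustc notes name as gating `Pid` and `kill`.
nix = { version = "0.27.1", features = ["process", "signal"] }
```

The version bump remains the approach taken in this PR; the fragment only illustrates what the "gated behind the ... feature" notes refer to.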

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@dacorvo
Collaborator

dacorvo commented Apr 2, 2024

Thank you for this pull-request, which is almost perfect except it does not update optimum-neuron itself.
Please see our own pull-request to bump AWS Neuron SDK version.

@dacorvo dacorvo closed this Apr 2, 2024
@dacorvo
Collaborator

dacorvo commented Apr 2, 2024

See #547
