
First load time in Nvidia Jetson AGX Xavier and Orin is more than 10 minutes #2402

Closed
deeprobo-dev opened this issue Sep 2, 2024 · 2 comments

Comments

deeprobo-dev commented Sep 2, 2024

Hi,
I am trying to use whisper.cpp inference for production purposes and have tested it on:

  1. NVIDIA Jetson AGX Xavier:
    GPU compute capability: 7.2
    Architecture: aarch64
    Shared GPU RAM: 32 GB
  2. NVIDIA Jetson AGX Orin:
    GPU compute capability: 8.7
    Architecture: aarch64
    Shared GPU RAM: 64 GB

On both devices the first load takes more than 10 minutes, which is becoming a bottleneck for using it in production even though the inference time is good enough. I am running the base model and compiled the whisper server with CUDA for the corresponding compute capabilities.

However, the same setup loads much faster on my laptop with the following configuration:
Processor: Intel i7, 11th gen
GPU: NVIDIA RTX 2060 (6 GB)
CPU RAM: 32 GB
Architecture: x86_64

Could you please help me figure out and resolve this issue? Thanks in advance.
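In case it helps narrow things down, a small standalone probe along the lines of the sketch below (not part of whisper.cpp; the file name and build command are assumptions) can show whether the time is spent creating the CUDA context rather than loading the model file, and prints the device's compute capability for comparison with the build target:

```cpp
// load_probe.cpp: rough diagnostic sketch, not whisper.cpp code.
// Build (assumed): nvcc -o load_probe load_probe.cpp
#include <cstdio>
#include <chrono>
#include <cuda_runtime.h>

int main() {
    using clock = std::chrono::steady_clock;

    // The first CUDA runtime call creates the CUDA context. If most of the
    // 10 minutes is spent here, the bottleneck is CUDA initialization, not
    // whisper.cpp's model loading.
    auto t0 = clock::now();
    cudaError_t err = cudaFree(0);  // forces context creation
    auto t1 = clock::now();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA init failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA context init: %.1f s\n",
           std::chrono::duration<double>(t1 - t0).count());

    // Report the device's compute capability so it can be compared with the
    // architecture the whisper server binary was compiled for.
    cudaDeviceProp prop{};
    cudaGetDeviceProperties(&prop, 0);
    printf("device: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);
    return 0;
}
```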

deeprobo-dev changed the title from "First load time in Nvidia Jetson and Orin is more than 10 minutes" to "First load time in Nvidia Jetson AGX Xavier and Orin is more than 10 minutes" on Sep 2, 2024
deeprobo-dev (Author) commented:

Found it to be a CUDA issue. With CUDA 11.4 the first load was slow irrespective of system architecture. After upgrading CUDA it worked fine both locally and inside a Docker container. Tested with CUDA 11.7 and 12.2.
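For anyone who runs into the same thing, a quick way to confirm which CUDA runtime and driver a build actually picks up (especially inside a Docker container) is a small check like the sketch below (the file name and build command are assumptions):

```cpp
// cuda_versions.cpp: prints the CUDA runtime and driver versions visible
// to the process, to verify that an upgrade really took effect.
// Build (assumed): nvcc -o cuda_versions cuda_versions.cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int runtime_ver = 0, driver_ver = 0;
    cudaRuntimeGetVersion(&runtime_ver);  // e.g. 11040 means CUDA 11.4
    cudaDriverGetVersion(&driver_ver);
    printf("runtime: %d.%d, driver: %d.%d\n",
           runtime_ver / 1000, (runtime_ver % 1000) / 10,
           driver_ver  / 1000, (driver_ver  % 1000) / 10);
    return 0;
}
```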


aleksas commented Sep 28, 2024

@deeprobo-dev It would be great if you could run the benchmark on both the Xavier and the Orin and post the results here. It would be helpful to get an idea of how the performance compares to other hardware.
