deeprobo-dev changed the title from "First load time in Nvidia Jetson and Orin is more than 10 minutes" to "First load time in Nvidia Jetson AGX Xavier and Orin is more than 10 minutes" on Sep 2, 2024
Found it to be a CUDA issue. With CUDA 11.4 the first load was slow irrespective of system architecture. After upgrading CUDA, loading worked fine both locally and inside a Docker container. Tested with CUDA 11.7 and 12.2.
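For anyone hitting the same symptom, here is a minimal sketch for checking which CUDA toolkit release is installed and comparing it against the versions mentioned above. It assumes `nvcc` is on the `PATH` and that the toolkit it reports is the one whisper.cpp was built against, which may not hold inside a container. The helper names (`parse_nvcc_release`, `classify`) are made up for illustration.

```python
import re
import subprocess

# Versions taken from the comment above: 11.4 loaded slowly,
# 11.7 and 12.2 loaded fine. Nothing else was reported as tested.
REPORTED_SLOW = (11, 4)
REPORTED_OK = [(11, 7), (12, 2)]


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def classify(version):
    """Map a CUDA version onto what this thread reports about it."""
    if version is None:
        return "unknown"
    if version == REPORTED_SLOW:
        return "reported-slow"
    if version in REPORTED_OK:
        return "reported-ok"
    return "untested"


def installed_cuda_version():
    """Ask nvcc for its release; return None if nvcc is unavailable."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    version = installed_cuda_version()
    print(version, classify(version))
```

A result of `untested` only means the version was not mentioned in this thread, not that it is known to be good or bad.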
@deeprobo-dev it would be great if you could run the benchmark on both Xavier and Orin and post the results here. It would be good to have an idea of how the performance compares to other hardware.
Hi,
I am trying to use whisper.cpp inference in production and tested it on:
NVIDIA Jetson AGX Xavier:
gpu compute capability: 7.2
architecture: aarch64
shared gpu ram: 32GB
NVIDIA Jetson Orin:
gpu compute capability: 8.7
architecture: aarch64
shared gpu ram: 64GB
On both devices the model takes more than 10 minutes to load, which is a bottleneck for using it in production even though the inference time is good enough. I am running the base model, with the whisper server compiled with CUDA for the corresponding compute capabilities.
The same load takes very little time on my laptop with this configuration:
processor: Intel i7, 11th gen
gpu: NVIDIA RTX 2060 (6GB)
cpu ram: 32GB
architecture: x86_64
Can you please help me figure out and solve the issue? Thanks in advance.
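To put a number on the load time when comparing devices or CUDA versions, a rough sketch like the one below can time how long a freshly started server takes to accept a TCP connection. It assumes the server only starts listening once the model has finished loading, which should be verified for the build in use. The function name `wait_for_port` and the host and port in the usage comment are illustrative, not part of whisper.cpp.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=1200.0, poll_s=0.5):
    """Return the seconds elapsed until host:port accepts a TCP connection.

    Raises TimeoutError if nothing is listening within timeout_s.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=poll_s):
                return time.monotonic() - start
        except OSError:
            # Not listening yet (refused or timed out); retry shortly.
            time.sleep(poll_s)
    raise TimeoutError(f"{host}:{port} not reachable after {timeout_s} s")


# Example usage (assumed address; start the server first, then run this
# immediately afterwards):
#   elapsed = wait_for_port("127.0.0.1", 8080)
#   print(f"server reachable after {elapsed:.1f} s")
```

The measurement includes process startup as well as model loading, so it is only useful for relative comparisons across setups.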