I'm running Visual Studio 2022 (latest update) and CUDA 12.6 on a Dell T5600 with a pair of GeForce 1050 Ti GPUs (which I realize are old Pascal chips) and Windows 10 (latest update). I compiled whisper.cpp, ggml, and SDL2 without issue (as static libs) and tested using the command.cpp demo console app.
Fun app. It worked fine, but performance was sluggish, so I defined GGML_USE_CUDA and recompiled with CUDA and cuBLAS in a bid to improve performance. After some trial and error getting everything to compile and link, I tested again with command.exe. Unfortunately, whisper.cpp now crashes at "model.ctx = ggml_init(params);" (around line 1620). Execution never reaches "if (!model.ctx)", so the "ggml_init() failed" error is never displayed.
It looks like a memory-allocation issue, specifically the value of "n_tensors * ggml_tensor_overhead()" passed in "params". But I'm unsure about the value of "n_tensors", because it is computed from hard-coded constants (i.e. 10 + 15 + 15 * n_audio_layer + 24 * n_text_layer). What do 10, 15, 15, and 24 represent? Allocating the right amount of memory for ggml_init() seems important, and this strikes me as an odd way to calculate it.
Or, am I chasing the wrong problem? Any suggestions would be most appreciated. Thanks!
UPDATE: I've been able to confirm that ggml is crashing whisper.cpp in ggml.c at this line:
"float f = ggml_table_f32_f16[i] = GGML_COMPUTE_FP16_TO_FP32(u.fp16);"
...which is inside ggml_init() (the function starts around line 3469; the offending line is around line 3500 in ggml.c). I'm not sure why this would be a problem when CUDA is enabled but not when it isn't.