
Slow inference - Run Whisper API without extra encoding/downloading file, and use bytes directly. #26

Open
VirajVaitha123 opened this issue Jan 28, 2023 · 0 comments


VirajVaitha123 commented Jan 28, 2023

Hi,

GPU performance is similar to my CPU for small-to-medium videos because of the extra I/O and encoding.

For my application, I use FastAPI for the majority of my core functionality. However, I need a GPU to transcribe video/audio files, and I decided to use BananaML because a serverless GPU looked much cheaper than Kubernetes.

How can I pass the UploadFile = File(...) object from FastAPI (a spooled temporary file) to my BananaML API, instead of reading an mp3/mp4 file saved locally and sending it as an encoded byte string?

Old way (faster on my CPU than on BananaML)

Upload video on web page -> write file into temporary file -> pass to Whisper
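
Roughly what that looked like (a simplified sketch, with a placeholder endpoint name and model size):

```python
import tempfile

import whisper
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = whisper.load_model("base")  # model size is just an example

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Write the spooled upload to a named temp file so Whisper can read it
    with tempfile.NamedTemporaryFile(suffix=".mp4") as tmp:
        tmp.write(await file.read())
        tmp.flush()
        result = model.transcribe(tmp.name)
    return {"text": result["text"]}
```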

New Way (GPU with BananaML)

Upload video on web page -> save file locally -> read bytes from file -> encode into a JSON payload -> pass to Whisper.
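
And roughly what the new flow looks like (again a sketch; the payload key and endpoint URL are placeholders, not the exact template code):

```python
import base64

import requests

# The upload handler has already written the file to disk at this point
with open("upload.mp4", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# The base64 string goes into a JSON payload and is POSTed to the GPU endpoint
payload = {"audio_bytes": encoded}  # key name is illustrative only
resp = requests.post("https://<my-banana-endpoint>/transcribe", json=payload)
transcription = resp.json()
```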

I get that there has to be an extra I/O operation to send the video data to the GPU, but the approach recommended in the template doesn't seem very efficient. I wish I could pass the file-like object directly, as I do with FastAPI.
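
Something closer to the sketch below is what I'm after: read the upload's bytes straight from the spooled temporary file and send them on, without ever writing a local copy (placeholder endpoint again, and ideally the base64 step could go too):

```python
import base64

import requests
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Read the spooled temporary file straight into memory, no local save/reload
    raw = await file.read()
    payload = {"audio_bytes": base64.b64encode(raw).decode("utf-8")}
    resp = requests.post("https://<my-banana-endpoint>/transcribe", json=payload)
    return resp.json()
```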

Thanks,

Viraj

VirajVaitha123 changed the title from "Best Practice: Run Whisper API without pointing to a file locally" to "Slow inference - Run Whisper API without extra encoding/downloading file, and use bytes directly." on Jan 28, 2023