
Slow inference - Run Whisper API without extra encoding/downloading file, and use bytes directly. #26

Open
VirajVaitha123 opened this issue Jan 28, 2023 · 0 comments


VirajVaitha123 commented Jan 28, 2023

Hi,

GPU performance is similar to my CPU for small-to-medium videos because of the extra I/O and encoding.

For my application, I use FastAPI for the majority of my core functionality. However, I need a GPU to transcribe video/audio files, and I decided to use BananaML because a serverless GPU looked much cheaper than Kubernetes.

How can I pass the UploadFile = File(...) object from FastAPI (a spooled temporary file) to my BananaML API, instead of reading an mp3/mp4 file saved locally and sending it as an encoded byte string?

Old way (faster on my CPU than on BananaML)

Upload video on web page -> write file into temporary file -> pass to Whisper
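
Roughly what that looked like (a simplified sketch, with a placeholder endpoint name and model size):

```python
import tempfile

import whisper
from fastapi import FastAPI, File, UploadFile

app = FastAPI()
model = whisper.load_model("base")  # model size is just an example

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Write the spooled upload to a named temp file so Whisper can read it
    with tempfile.NamedTemporaryFile(suffix=".mp4") as tmp:
        tmp.write(await file.read())
        tmp.flush()
        result = model.transcribe(tmp.name)
    return {"text": result["text"]}
```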

New Way (GPU with BananaML)

Upload video on web page -> save file locally -> read bytes from file -> encode into a JSON payload -> pass to Whisper.
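
And roughly what the new flow looks like (again a sketch; the payload key and endpoint URL are placeholders, not the exact template code):

```python
import base64

import requests

# The upload handler has already written the file to disk at this point
with open("upload.mp4", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# The base64 string goes into a JSON payload and is POSTed to the GPU endpoint
payload = {"audio_bytes": encoded}  # key name is illustrative only
resp = requests.post("https://<my-banana-endpoint>/transcribe", json=payload)
transcription = resp.json()
```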

I get that there has to be an extra I/O operation to send the video data to the GPU, but the approach recommended in the template doesn't seem very efficient. I wish I could pass the file-like object directly, as I do with FastAPI.
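
Something closer to the sketch below is what I'm after: read the upload's bytes straight from the spooled temporary file and send them on, without ever writing a local copy (placeholder endpoint again, and ideally the base64 step could go too):

```python
import base64

import requests
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/transcribe")
async def transcribe(file: UploadFile = File(...)):
    # Read the spooled temporary file straight into memory, no local save/reload
    raw = await file.read()
    payload = {"audio_bytes": base64.b64encode(raw).decode("utf-8")}
    resp = requests.post("https://<my-banana-endpoint>/transcribe", json=payload)
    return resp.json()
```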

Thanks,

Viraj

VirajVaitha123 changed the title from "Best Practice: Run Whisper API without pointing to a file locally" to "Slow inference - Run Whisper API without extra encoding/downloading file, and use bytes directly." on Jan 28, 2023