Minor tweaks for GPU batching support
kristiankielhofner committed Oct 24, 2023
1 parent 4c52baf commit 50fa0a7
Showing 2 changed files with 6 additions and 3 deletions.
nginx/nginx.conf (5 additions, 2 deletions)

@@ -36,6 +36,8 @@ http {
     # Websocket support
     proxy_set_header Upgrade $http_upgrade;
     proxy_set_header Connection "upgrade";
+    # Support very long sessions for GPU batching of large files
+    proxy_read_timeout 1800;

     # Use HTTP 1.1 keepalives to backend gunicorn
     upstream keepalive-wis {
@@ -44,8 +46,9 @@ http {
         keepalive_timeout 3600s;
     }

-    # Increase max client body size for ASR file uploads, etc. 100MB matches Cloudflare
-    client_max_body_size 100M;
+    # Increase max client body size for ASR file uploads, etc.
+    # Default to very large to support GPU batching of long audio files.
+    client_max_body_size 2G;

     server {
         listen 19001;
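To get a feel for why the body-size limit moved from 100M to 2G, here is an illustrative Python sketch (not part of WIS itself) that estimates the size of an uncompressed WAV upload and compares it to the old and new nginx limits. The 16 kHz mono 16-bit format is an assumption chosen for the example.

```python
# Estimate uncompressed WAV size and compare to nginx client_max_body_size.
# Assumed audio format: 16 kHz, mono, 16-bit samples (illustrative only).
def wav_size_bytes(seconds, sample_rate=16000, channels=1, bytes_per_sample=2):
    return seconds * sample_rate * channels * bytes_per_sample

OLD_MAX_BODY = 100 * 1024**2  # client_max_body_size 100M
NEW_MAX_BODY = 2 * 1024**3    # client_max_body_size 2G

three_hours = wav_size_bytes(3 * 3600)
print(three_hours)                 # 345600000 (~330 MB)
print(three_hours > OLD_MAX_BODY)  # True: would have been rejected before
print(three_hours < NEW_MAX_BODY)  # True: fits under the new 2G limit
```

Roughly three hours of mono 16 kHz audio already exceeds the old 100M cap but sits comfortably under 2G, which is why long files intended for GPU batching need the larger limit (and the matching 1800s proxy_read_timeout while they are processed).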
settings.py (1 addition, 1 deletion)

@@ -38,7 +38,7 @@ class APISettings(BaseSettings):
     chunking_memory_threshold: int = 3798205849

     # Maximum number of chunks that are loaded into the GPU at once
-    # This will need to be tweaked based on GPU ram
+    # This will need to be tweaked based on GPU ram and model used.
     # 8GB GPUs should support at least 2 chunks so starting with that
     concurrent_gpu_chunks: int = 2
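The `concurrent_gpu_chunks` setting caps how many chunks are resident on the GPU at once. A minimal sketch of that pattern (illustrative, not the actual WIS implementation; the `batched` helper and the integer stand-ins for chunks are assumptions):

```python
# Illustrative sketch: feed chunks to the GPU in groups of at most
# concurrent_gpu_chunks, mirroring the setting in settings.py.
concurrent_gpu_chunks = 2  # tweak based on GPU RAM and model used

def batched(chunks, batch_size):
    """Yield successive batches of at most batch_size chunks."""
    for i in range(0, len(chunks), batch_size):
        yield chunks[i:i + batch_size]

chunks = list(range(7))  # stand-in for audio chunks
batches = list(batched(chunks, concurrent_gpu_chunks))
print(batches)  # [[0, 1], [2, 3], [4, 5], [6]]
```

Each batch would be loaded, inferred, and freed before the next, so peak GPU memory scales with the batch size rather than the total file length.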