Issues: vllm-project/vllm

Issues list

[Bug]: Using FlashInfer with FP8 model with FP8 KV cache produces an error (bug)
#8641 opened Sep 19, 2024 by Syst3m1cAn0maly

[Performance]: The accept rate of typical acceptance sampling (performance)
#8639 opened Sep 19, 2024 by hustxiayang

[Usage]: Ray + vLLM OpenAI (offline) Batch Inference (usage)
#8636 opened Sep 19, 2024 by mbuet2ner

[Bug]: memory leak (bug)
#8629 opened Sep 19, 2024 by wciq1208

[Bug]: Speculative decoding interferes with CPU-only execution (bug)
#8628 opened Sep 19, 2024 by NickLucche

[Bug]: MistralTokenizer Detokenization Issue (bug)
#8627 opened Sep 19, 2024 by ywang96

[Usage]: doesn't work on pascal tesla P100 (usage)
#8626 opened Sep 19, 2024 by Stargate256

[Bug]: Wrong "completion_tokens" counts in streaming usage (bug)
#8625 opened Sep 19, 2024 by yuhon0528

[Bug]: vllm deploy medusa, draft acceptance rate: 0.000 (bug)
#8620 opened Sep 19, 2024 by xhjcxxl

[Usage]: Number of requests currently in the queue (usage)
#8617 opened Sep 19, 2024 by shubh9m

[Usage]: Standalone Debugging and Measuring the vLLM Engine Backend (usage)
#8586 opened Sep 19, 2024 by htang2012

[Usage]: How to run VLLM on multiple tpu hosts V4-32 (usage)
#8582 opened Sep 18, 2024 by sparsh35

[Feature]: DRY Sampling (feature request)
#8581 opened Sep 18, 2024 by Shreyansh1311

[Bug]: Wrong Response with Gemma2 with 8k context length (bug)
#8580 opened Sep 18, 2024 by hahmad2008

[Bug]: lm-format-enforcer guided decoding kills MQLLMEngine (bug)
#8578 opened Sep 18, 2024 by joerunde

[Usage]: (usage)
#8569 opened Sep 18, 2024 by lauhaide