Issues: OpenCSGs/llm-inference
#137 · API server blocked while one request is in process [bug]
Opened May 9, 2024 by SeanHH86
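A minimal sketch of the usual fix for this class of bug, assuming the server blocks because a synchronous inference call runs on the event loop; generate_blocking and its latency are stand-ins, not the project's actual server code:

    # Hypothetical sketch: keep an async API server responsive while a
    # long-running inference call executes in a worker thread.
    import asyncio
    import time

    def generate_blocking(prompt: str) -> str:
        # Stand-in for a blocking model.generate() call.
        time.sleep(5)
        return f"completion for: {prompt}"

    async def handle_request(prompt: str) -> str:
        # Off-load the blocking call so the event loop can keep accepting
        # other requests instead of stalling until this one finishes.
        return await asyncio.to_thread(generate_blocking, prompt)

    async def main() -> None:
        # The two requests overlap (~5 s total) instead of serializing (~10 s).
        results = await asyncio.gather(
            handle_request("first"), handle_request("second")
        )
        print(results)

    if __name__ == "__main__":
        asyncio.run(main())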
#123 · GGUF implementation makes a duplicate copy because it cannot detect config.json in the cache folder
Opened Apr 24, 2024 by depenglee1707
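A minimal sketch of cache detection that keys on the GGUF file itself rather than config.json, which GGUF repositories often lack; the paths and file names are illustrative, not the project's actual cache layout:

    # Hypothetical sketch: treat the presence of the .gguf file, not
    # config.json, as the cache-hit signal to avoid a duplicate download.
    from pathlib import Path

    def is_gguf_cached(cache_dir: str, model_file: str) -> bool:
        cache = Path(cache_dir)
        # Search recursively for the GGUF file instead of requiring
        # config.json, which many GGUF repos do not ship.
        return cache.is_dir() and any(cache.glob(f"**/{model_file}"))

    if __name__ == "__main__":
        if is_gguf_cached("/tmp/llm-cache", "qwen1_5-7b-chat-q4_k_m.gguf"):
            print("cache hit: reuse existing copy, no duplicate download")
        else:
            print("cache miss: download the GGUF file once")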
#120 · vLLM implementation cannot download models from repositories other than Hugging Face
Opened Apr 23, 2024 by depenglee1707
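A sketch of one common workaround, assuming the limitation is in the download step: fetch the model files from the alternate hub into a local directory, then hand vLLM the local path (vLLM can load from a local directory). download_from_alt_hub is a hypothetical stand-in, not a real API:

    # Hypothetical workaround sketch: download from a non-Hugging-Face hub
    # first, then point vLLM at the resulting local path.
    from pathlib import Path

    def download_from_alt_hub(repo_id: str, dest: str) -> str:
        # Stand-in for hub-specific download logic (e.g. an OpenCSG hub
        # client); it should place weights, tokenizer, and config under dest.
        path = Path(dest)
        path.mkdir(parents=True, exist_ok=True)
        return str(path)

    if __name__ == "__main__":
        local_dir = download_from_alt_hub("OpenCSG/some-model", "/tmp/some-model")
        # vLLM itself never needs to know about the alternate hub:
        #   from vllm import LLM
        #   llm = LLM(model=local_dir)
        print(f"would start vLLM with model={local_dir}")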
#103 · Add an inference SDK for invoking models [enhancement]
Opened Apr 17, 2024 by SeanHH86
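A minimal sketch of what such an SDK could look like: a thin client wrapping the serving endpoint. The URL path and JSON schema here are assumptions for illustration, not the project's documented API:

    # Hypothetical sketch of a thin inference client SDK.
    import requests

    class InferenceClient:
        def __init__(self, base_url: str, timeout: float = 60.0) -> None:
            self.base_url = base_url.rstrip("/")
            self.timeout = timeout

        def invoke(self, model: str, prompt: str) -> str:
            # POST a prompt to the serving endpoint and return the completion.
            resp = requests.post(
                f"{self.base_url}/api/v1/{model}/invoke",  # assumed route
                json={"prompt": prompt},                   # assumed schema
                timeout=self.timeout,
            )
            resp.raise_for_status()
            return resp.json()["text"]

    # Usage, assuming a local deployment:
    #   client = InferenceClient("http://127.0.0.1:8000")
    #   print(client.invoke("llama-2-7b", "Hello"))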
#99 · Requested tokens (817) exceed context window of 512 [bug]
Opened Apr 16, 2024 by SeanHH86
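A minimal sketch of the clamp that avoids this error: never request more new tokens than the context window leaves room for. The numbers mirror the issue title (817 requested versus a 512-token window); the function is illustrative, not the project's code:

    # Hypothetical sketch: clamp max_new_tokens to the remaining context.
    def clamp_max_new_tokens(prompt_tokens: int, requested: int, n_ctx: int) -> int:
        available = max(0, n_ctx - prompt_tokens)
        return min(requested, available)

    if __name__ == "__main__":
        # A 305-token prompt leaves 512 - 305 = 207 tokens of head-room, so a
        # request for 512 new tokens (305 + 512 = 817 total) gets clamped.
        print(clamp_max_new_tokens(prompt_tokens=305, requested=512, n_ctx=512))  # 207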
#68 · Support loading Qwen1.5-72B-Chat-GPTQ-Int4 via auto_gptq [enhancement]
Opened Apr 3, 2024 by SeanHH86
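A minimal sketch of loading a GPTQ-quantized checkpoint with auto_gptq's from_quantized entry point, as the issue requests; the device placement and generation settings are illustrative:

    # Sketch: load a GPTQ-Int4 checkpoint with auto_gptq and run one generation.
    from auto_gptq import AutoGPTQForCausalLM
    from transformers import AutoTokenizer

    model_id = "Qwen/Qwen1.5-72B-Chat-GPTQ-Int4"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # from_quantized loads the pre-quantized weights directly.
    model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0")

    inputs = tokenizer("Hello", return_tensors="pt").to("cuda:0")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))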
#48 · No default value for "timeout" when "batch_wait_timeout_s: 0" is missing from the YAML config
Opened Mar 25, 2024 by depenglee1707
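A minimal sketch of the missing-default fix: fall back to 0 when batch_wait_timeout_s is absent from the parsed YAML. The key name comes from the issue title; the surrounding config structure is an assumption:

    # Hypothetical sketch: supply a default for an optional YAML key.
    import yaml

    raw = """
    deployment:
      max_batch_size: 8
      # batch_wait_timeout_s deliberately omitted
    """

    config = yaml.safe_load(raw)["deployment"]
    # .get() supplies the default instead of raising KeyError or passing None.
    batch_wait_timeout_s = config.get("batch_wait_timeout_s", 0.0)
    print(batch_wait_timeout_s)  # 0.0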
#37 · Inference Gradio web UI responds with random words for the DeepSeek instruct model
Opened Mar 20, 2024 by KinglyWayne
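One common cause of garbled output from instruct-tuned models, offered here only as a hypothesis, is sending a raw prompt instead of the model's chat template. A minimal sketch of applying the template with transformers; the model id is illustrative:

    # Hypothetical sketch: wrap the user message in the model's chat template
    # before generation, instead of passing the raw string.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "deepseek-ai/deepseek-coder-6.7b-instruct"
    )
    messages = [{"role": "user", "content": "Write a hello-world in Python."}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    print(prompt)  # the wrapped prompt format the instruct model was trained on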