
Dual A6000 inference: after model inference finishes, one GPU shows 0% utilization and the other shows 100% #109

Open
zf761 opened this issue Aug 28, 2024 · 1 comment


zf761 commented Aug 28, 2024

[Image: 1724837489680]

TOKENIZER_PATH=/DATA/LM_zhangfeng/models/Qwen2-72B-Instruct-AWQ \
CHECKPOINT_PATH=/DATA/LM_zhangfeng/models/Qwen2-72B-Instruct-AWQ \
MODEL_TYPE=qwen_2 \
FT_SERVER_TEST=1 \
CUDA_VISIBLE_DEVICES='2,3' \
START_PORT='18095' \
ENABLE_FAST_GEN=1 \
CONCURRENCY_LIMIT=200 \
PY_LOG_LEVEL=INFO \
TP_SIZE=2 \
WORLD_SIZE=2 \
python3 -m maga_transformer.start_server
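
To make the symptom easier to reproduce and report, here is a minimal sketch (not part of the project) that reads per-GPU utilization from `nvidia-smi` and flags the "one card idle, one card pegged" pattern. The function names and the 5% / 95% thresholds are hypothetical, and it assumes the GPU indices reported by `nvidia-smi` correspond to the ones passed in `CUDA_VISIBLE_DEVICES` (2 and 3 here), which may not hold if CUDA enumerates devices in a different order on the machine.

```python
import subprocess

# nvidia-smi query that prints one "index, utilization" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Parse lines such as '2, 0' into {gpu_index: utilization_percent}."""
    result = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, util = [field.strip() for field in line.split(",")]
        result[int(index)] = int(util)
    return result


def read_utilization():
    """Run nvidia-smi once and return the parsed utilization map."""
    completed = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_utilization(completed.stdout)


def is_imbalanced(util, gpus, idle_max=5, busy_min=95):
    """True if, among `gpus`, at least one is near idle and one is near 100%.

    The thresholds are illustrative assumptions, not values from the project.
    """
    values = [util[g] for g in gpus]
    return min(values) <= idle_max and max(values) >= busy_min
```

Calling `is_imbalanced(read_utilization(), [2, 3])` a few times after the last request has returned would show whether the imbalance persists while the server is idle or is only a transient reading.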
