How to get the analysis of model Qwen1.5-0.5B #24

Open
qxpBlog opened this issue Apr 13, 2024 · 0 comments

Comments


qxpBlog commented Apr 13, 2024

@mvpatel2000 @cli99 @weimingzha0 @digger-yu @BhAem I want to get the analysis info for the model Qwen1.5-0.5B: Time to first token (s), Time for completion (s), and Tokens/second. Is it enough to just run the following command:

HF_ENDPOINT=https://hf-mirror.com
gpu_name='a100-sxm-80gb'
dtype_name="w16a16e16"
output_dir='outputs_infer'
model_name=Qwen/Qwen1.5-0.5B
batch_size_per_gpu=1
tp_size=2
output_file_suffix="-bs${batch_size_per_gpu}"
cost_per_gpu_hour=2.21
seq_len=128
num_tokens_to_generate=242
flops_efficiency=0.7
hbm_memory_efficiency=0.9
achieved_tflops=200                # will overwrite the flops_efficiency above
achieved_memory_bandwidth_GBs=1200 # will overwrite the hbm_memory_efficiency above

if [[ ! -e $output_dir ]]; then
    mkdir $output_dir
elif [[ ! -d $output_dir ]]; then
    echo "$output_dir already exists but is not a directory" 1>&2
fi

HF_ENDPOINT=$HF_ENDPOINT CUDA_VISIBLE_DEVICES=3 python -m llm_analysis.analysis infer --model_name=${model_name} --gpu_name=${gpu_name} --dtype_name=${dtype_name} --output_dir=${output_dir} --output_file_suffix=${output_file_suffix} \
    --seq_len=${seq_len} --num_tokens_to_generate=${num_tokens_to_generate} --batch_size_per_gpu=${batch_size_per_gpu} \
    --tp_size=${tp_size} \
    --cost_per_gpu_hour=${cost_per_gpu_hour} \
    --flops_efficiency=${flops_efficiency} --hbm_memory_efficiency=${hbm_memory_efficiency} --log_level DEBUG
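
Since llm-analysis is a Python package and the command above only drives it through its Fire CLI, the same inference analysis could presumably also be run from a short script. This is a minimal sketch, assuming the `infer` entry point invoked by `python -m llm_analysis.analysis infer` is an importable module-level function that accepts the same keyword arguments as the CLI flags and returns the summary; the exact return format is an assumption, and the time-to-first-token, completion-time, and tokens/second figures are expected among the reported latency/throughput entries:

# Sketch only: assumes llm_analysis.analysis.infer is the same function the
# Fire CLI exposes and that the CLI flags map one-to-one to keyword arguments.
from llm_analysis.analysis import infer

summary = infer(
    model_name="Qwen/Qwen1.5-0.5B",
    gpu_name="a100-sxm-80gb",
    dtype_name="w16a16e16",
    output_dir="outputs_infer",
    batch_size_per_gpu=1,
    tp_size=2,
    seq_len=128,
    num_tokens_to_generate=242,
    flops_efficiency=0.7,
    hbm_memory_efficiency=0.9,
    cost_per_gpu_hour=2.21,
)

# The summary (also written to output_dir by the tool) should contain the
# prefill/decode latency and throughput numbers the question asks about.
print(summary)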