llm_inference_DLM

Usage:

python pyt_llm_inference_DLM.py --model_path TheBloke/Llama-2-7B-Chat-fp16 --platform MI210 --precision float16 --iters 10 --batch_size_list 1 --prompt_len_list 128 512 --new_tokens_list 128 --csv_out test.csv --backend pyt

Example output:

TheBloke/Llama-2-7B-Chat-fp16,56.77510506766183, PREFILL  batch_size 1 prompt_len 128 new_tokens 128

TheBloke/Llama-2-7B-Chat-fp16,37.85624433931538, DECODING batch_size 1 prompt_len 128 new_tokens 128

TheBloke/Llama-2-7B-Chat-fp16,169.09446934291296, PREFILL  batch_size 1 prompt_len 512 new_tokens 128

TheBloke/Llama-2-7B-Chat-fp16,39.33803215927965, DECODING batch_size 1 prompt_len 512 new_tokens 128
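
For post-processing, the result lines above can be parsed into records. The following is a minimal sketch, not part of the script itself: the line layout is inferred only from the sample output shown here, the function name `parse_result_line` is hypothetical, and the unit of the numeric value is not stated in this README, so it is kept as a plain float.

```python
import re

# Assumed line layout, inferred from the sample output above:
#   <model>,<value>, <PHASE> batch_size <n> prompt_len <n> new_tokens <n>
LINE_RE = re.compile(
    r"^(?P<model>[^,]+),\s*(?P<value>[-+0-9.eE]+),\s*"
    r"(?P<phase>PREFILL|DECODING)\s+"
    r"batch_size\s+(?P<batch_size>\d+)\s+"
    r"prompt_len\s+(?P<prompt_len>\d+)\s+"
    r"new_tokens\s+(?P<new_tokens>\d+)\s*$"
)


def parse_result_line(line):
    """Return a dict for one result line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "model": m["model"],
        "value": float(m["value"]),  # unit is not stated in the README
        "phase": m["phase"],
        "batch_size": int(m["batch_size"]),
        "prompt_len": int(m["prompt_len"]),
        "new_tokens": int(m["new_tokens"]),
    }
```

Lines that do not follow the assumed layout return `None`, so the function can be applied to a full log and the non-matching lines filtered out.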
