This repository contains benchmark data for various Large Language Models (LLMs), measuring their inference speed in tokens per second. The benchmarks are performed across different hardware configurations using the prompt "Give me 1 line phrase".
The data captures the performance of several LLMs, recording the tokens generated per second on specific hardware setups. Each entry includes the model name, the hardware used, and the measured speed.
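As a rough illustration of how entries of this shape could be filtered programmatically, here is a minimal sketch. The field names (`model`, `hardware`, `tokens_per_second`) and the sample values are assumptions for the example, not the repository's actual data format.

```python
# Hypothetical benchmark entries; the real data format in this repo may differ.
benchmarks = [
    {"model": "llama-2-7b", "hardware": "RTX 3090", "tokens_per_second": 45.2},
    {"model": "llama-2-13b", "hardware": "RTX 3090", "tokens_per_second": 28.7},
    {"model": "mistral-7b", "hardware": "M2 Max", "tokens_per_second": 33.1},
]

def filter_by_model(entries, query):
    """Return entries whose model name contains the query (case-insensitive),
    sorted fastest-first by tokens per second."""
    query = query.lower()
    matches = [e for e in entries if query in e["model"].lower()]
    return sorted(matches, key=lambda e: e["tokens_per_second"], reverse=True)

results = filter_by_model(benchmarks, "llama")
for entry in results:
    print(f'{entry["model"]} on {entry["hardware"]}: '
          f'{entry["tokens_per_second"]} tok/s')
```

This mirrors the name-based filtering offered by the searchable table, just in script form.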
You can view and interact with the benchmark data through a searchable table on our GitHub Pages site. Use the search field to filter by model name and compare performance across hardware.
View the Inference Speeds Table
Contributions to the benchmark data are welcome! Please refer to the contributing guidelines for more information on how you can contribute.
This project is licensed under the MIT License - see the LICENSE file for details.