Run large language models entirely in the web browser 🚀
Uses transformers.js to implement transformer-based language models in JavaScript, and onnxruntime-web to run them efficiently in the browser via WebAssembly (WebGPU support coming soon).
(Note: the TinyLlama-1.1B-Chat-v1.0 issue will be resolved in the upcoming v3 release of transformers.js.)
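
As a rough illustration of the transformers.js usage described above, here is a minimal sketch of in-browser text generation. It is not taken from this project's source. The model id `Xenova/distilgpt2`, the helper names `buildGenerationOptions` and `generateText`, and the default option values are all assumptions chosen for the example. It also assumes the v2 package name `@xenova/transformers`, resolved through a bundler or an import map.

```javascript
// Hypothetical default model for this sketch; swap in any ONNX-converted
// text-generation model that transformers.js can load.
const DEFAULT_MODEL = 'Xenova/distilgpt2';

// Merge caller overrides into illustrative defaults and clamp
// max_new_tokens to the range [1, 512] so a typo cannot stall the tab.
function buildGenerationOptions(overrides = {}) {
  const options = {
    max_new_tokens: 64,
    temperature: 0.7,
    do_sample: true,
    ...overrides,
  };
  options.max_new_tokens = Math.min(
    Math.max(1, Math.floor(options.max_new_tokens)),
    512
  );
  return options;
}

// Load the library lazily so no model weights are fetched until
// text is actually requested. Inference runs via onnxruntime-web (WASM).
async function generateText(prompt, modelId = DEFAULT_MODEL, overrides = {}) {
  const { pipeline } = await import('@xenova/transformers');
  const generator = await pipeline('text-generation', modelId);
  const output = await generator(prompt, buildGenerationOptions(overrides));
  return output[0].generated_text;
}
```

A call such as `await generateText('Once upon a time')` downloads the model on first use and then runs it locally. Larger models mainly cost more download time and memory.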