Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.

BestAnHongjun/LMDeploy-Jetson

LMDeploy-Jetson Community

Deploying LLMs offline on the NVIDIA Jetson platform marks the dawn of a new era in embodied intelligence, where devices can function independently without continuous internet access.

[中文] | [English]

This project adapts LMDeploy to NVIDIA Jetson series edge computing boards so that InternLM series LLMs can be deployed for Offline Embodied Intelligence (OEI).

Latest News🎉

  • [2024/3/15] Updated support for LMDeploy-v0.2.5.
  • [2024/2/26] This project has been included in the LMDeploy community.

Community Recruitment

  • Recruiting community managers (Contact: [email protected])
  • Seeking benchmark data for more Jetson board models (please submit a PR directly), such as:
    • Jetson Nano
    • Jetson TX2
    • Jetson AGX Xavier
    • Jetson Orin Nano
    • Jetson AGX Orin
  • Recruiting developers to create Jetson-specific whl distributions
  • README optimization, etc.

Verified model/platform

  • ✅: Verified and runnable
  • ❌: Verified but not runnable
  • ⭕️: Pending verification

| Models | InternLM-7B | InternLM-20B | InternLM2-1.8B | InternLM2-7B | InternLM2-20B |
|---|---|---|---|---|---|
| Orin AGX (32G), Jetpack 5.1 | Mem: ??/??, 14.68 token/s | Mem: ??/??, 5.82 token/s | Mem: ??/??, 56.57 token/s | Mem: ??/??, 14.56 token/s | Mem: ??/??, 6.16 token/s |
| Orin NX (16G), Jetpack 5.1 | Mem: 8.6G/16G, 7.39 token/s | Mem: 14.7G/16G, 3.08 token/s | Mem: 5.6G/16G, 22.96 token/s | Mem: 9.2G/16G, 7.48 token/s | Mem: 14.8G/16G, 3.19 token/s |
| Xavier NX (8G), Jetpack 5.1 | | | Mem: 4.35G/8G, 28.36 token/s | | |

If you have other Jetson series boards, feel free to run the benchmarks and submit your results via a Pull Request (PR) to become a community contributor!
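
For contributors preparing a submission, a throughput figure in token/s can be obtained by timing generation and dividing the number of generated tokens by the elapsed wall-clock time. The snippet below is only a minimal sketch of that arithmetic, not the project's benchmark procedure: `measure_tokens_per_second` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in that you would replace with a call into your own deployed model.

```python
import time
from typing import Callable, List


def measure_tokens_per_second(generate: Callable[[str], List[int]],
                              prompt: str,
                              runs: int = 3) -> float:
    """Return mean throughput: generated tokens divided by wall-clock seconds."""
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        token_ids = generate(prompt)  # hypothetical callable returning generated token ids
        total_seconds += time.perf_counter() - start
        total_tokens += len(token_ids)
    if total_seconds == 0:
        raise ValueError("elapsed time was zero; use a longer generation")
    return total_tokens / total_seconds


def fake_generate(prompt: str) -> List[int]:
    """Stand-in for a real model call: sleeps briefly, then returns 20 dummy token ids."""
    time.sleep(0.05)
    return list(range(20))


rate = measure_tokens_per_second(fake_generate, "Hello", runs=2)
print(f"{rate:.2f} token/s")
```

Averaging over several runs smooths out warm-up effects; when reporting, note the board, the Jetpack version, and the memory in use alongside the throughput, as in the table above.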

Future Work

  • Updating benchmark testing data for more models of Jetson boards.
  • Creating Jetson-specific whl distributions.
  • Following up on updates to the LMDeploy version.

Tutorial

S1. Quantize on the server with W4A16

S2. Install Miniconda on Jetson

S3. Install CMake-3.29.0 on Jetson

S4. Install RapidJSON on Jetson

S5. Install PyTorch-2.1.0 on Jetson

S6. Port LMDeploy-0.2.5 to Jetson

S7. Run InternLM offline on Jetson
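
As a rough orientation for S1 and S7, the sketch below assembles the two command lines that bracket the workflow: W4A16 (4-bit weight-only AWQ) quantization on the server and an interactive chat on the Jetson. It assumes the LMDeploy 0.2.x command-line interface (`lmdeploy lite auto_awq` and `lmdeploy chat turbomind`) and its flag names; the helper functions, model identifier, and directory names are hypothetical and chosen only for illustration. The linked tutorial pages remain the reference for the exact steps.

```python
import shlex
from typing import List


def quantize_command(hf_model: str, work_dir: str) -> List[str]:
    """Server side (S1): 4-bit weight-only AWQ quantization (W4A16).

    Flag names are assumed from the LMDeploy 0.2.x CLI.
    """
    return [
        "lmdeploy", "lite", "auto_awq", hf_model,
        "--w-bits", "4",
        "--w-group-size", "128",
        "--work-dir", work_dir,
    ]


def chat_command(model_dir: str) -> List[str]:
    """Jetson side (S7): interactive chat with the quantized model on TurboMind."""
    return ["lmdeploy", "chat", "turbomind", model_dir, "--model-format", "awq"]


# Hypothetical model id and directory names, used only as placeholders.
server_cmd = quantize_command("internlm/internlm2-chat-1_8b", "./internlm2-chat-1_8b-4bit")
jetson_cmd = chat_command("./internlm2-chat-1_8b-4bit")

# Print the commands instead of running them, so each can be reviewed first.
print(shlex.join(server_cmd))
print(shlex.join(jetson_cmd))
```

Building the commands as argument lists keeps them safe to pass to `subprocess.run` later without shell quoting issues; the quantized output directory is what gets copied from the server to the Jetson between the two steps.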

Appendix

Community Projects

  • InternDog: An offline embodied-intelligence guide dog based on InternLM2. [GitHub] [Bilibili]

Citation

If this project is helpful to your work, please cite it using the following format:

@misc{2024lmdeployjetson,
    title={LMDeploy-Jetson:Opening a new era of Offline Embodied Intelligence},
    author={LMDeploy-Jetson Community},
    url={https://github.com/BestAnHongjun/LMDeploy-Jetson},
    year={2024}
}

Acknowledgements
