# CA-LoRA

Adapting Existing LoRA for Compressed LLMs to Enable Efficient Multi-Tasking on Personal Devices

## Introduction

This repository contains the source code for the paper *CA-LoRA*, accepted at COLM 2024.

Since the open-source community has already contributed many LoRA modules for LLMs, we propose to adapt these existing LoRAs to the compressed versions of those LLMs, and we introduce the Compression-Aware LoRA (CA-LoRA) framework. CA-LoRA incorporates knowledge inheritance and knowledge recovery strategies to compensate for the knowledge lost during model compression. Experimental results demonstrate that CA-LoRA outperforms vanilla LoRA applied to a compressed LLM, and achieves performance comparable to that of the non-compressed LLM equipped with its existing LoRA modules.
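
To make the two strategies concrete, here is a minimal PyTorch-style sketch of a linear layer that combines them. It is an illustration, not the repository's actual implementation: the class name `CALoRALinear`, the `recovery_rank` parameter, and the zero-initialization of the recovery branch are assumptions made for this example. The layer keeps the compressed weight frozen, initializes its task LoRA from the LoRA trained on the non-compressed model (knowledge inheritance), and adds an extra trainable low-rank pair intended to recover compression loss (knowledge recovery).

```python
import torch
import torch.nn as nn

class CALoRALinear(nn.Module):
    """Illustrative sketch (not the repo's API): frozen compressed weight
    + inherited task LoRA + trainable low-rank recovery branch."""

    def __init__(self, compressed_linear, inherited_A, inherited_B,
                 recovery_rank=4, scaling=1.0):
        super().__init__()
        self.base = compressed_linear  # compressed layer, kept frozen
        for p in self.base.parameters():
            p.requires_grad = False

        # Knowledge inheritance: start the task LoRA from the LoRA that
        # was trained on the original (non-compressed) LLM, rather than
        # from scratch.
        self.lora_A = nn.Parameter(inherited_A.detach().clone())  # (r, in_features)
        self.lora_B = nn.Parameter(inherited_B.detach().clone())  # (out_features, r)

        # Knowledge recovery: an extra low-rank pair trained to recover
        # capability lost to compression (the rank here is an assumption).
        in_f, out_f = compressed_linear.in_features, compressed_linear.out_features
        self.rec_A = nn.Parameter(torch.zeros(recovery_rank, in_f))
        self.rec_B = nn.Parameter(torch.zeros(out_f, recovery_rank))
        nn.init.normal_(self.rec_A, std=0.02)
        # rec_B stays zero, so the recovery branch initially adds nothing
        # and the layer starts out equivalent to base + inherited LoRA.
        self.scaling = scaling

    def forward(self, x):
        out = self.base(x)
        out = out + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T  # task LoRA
        out = out + self.scaling * (x @ self.rec_A.T) @ self.rec_B.T   # recovery
        return out
```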

## Repo Content

This repo contains the code to reproduce the experimental results in our paper.

## Citation

If you find our work valuable, please cite our paper:

@article{zhao2024calora,
  title={CA-LoRA: Adapting Existing LoRA for Compressed LLMs to Enable Efficient Multi-Tasking on Personal Devices},
  author={Weilin Zhao and Yuxiang Huang and Xu Han and Zhiyuan Liu and Zhengyan Zhang and Kuai Li and Chen Chen and Tao Yang and Maosong Sun},
  journal={arXiv preprint arXiv:2307.07705},
  year={2024},
}