# COLT

The implementation of our CIKM 2024 paper: Towards Completeness-Oriented Tool Retrieval for Large Language Models.

## News

- [2024/8/17] The processed ToolBench dataset and the checkpoints of the first-stage semantic learning are released! Please find the processed datasets and checkpoints on HuggingFace.
- [2024/7/17] Our code and the ToolLens dataset are released.
- [2024/7/16] COLT is accepted by CIKM 2024.
- [2024/5/25] Our paper is released.

## Quick Start

1. Download the PLMs from HuggingFace and place them in a folder named `PLMs`:
   - ANCE: the PLM and its description are available here.
   - TAS-B: the PLM and its description are available here.
   - coCondenser: the PLM and its description are available here.
   - Contriever: the PLM and its description are available here.
2. Run Semantic Learning:

       python train_sbert.py

3. Run Collaborative Learning:

       python train.py -g 0 -m COLT -d ToolLens

You can specify the GPU id, the model, and the dataset via command-line arguments.
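As a rough sketch, the `-g`/`-m`/`-d` flags from the command above could be parsed as follows; the flag names mirror the command line shown, but the defaults, help text, and internal structure here are assumptions, not the actual `train.py`:

```python
# Hypothetical sketch of the command-line interface used above.
# The actual argument handling in train.py may differ.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="COLT collaborative learning")
    parser.add_argument("-g", "--gpu", type=int, default=0,
                        help="GPU id to train on")
    parser.add_argument("-m", "--model", default="COLT",
                        help="model name")
    parser.add_argument("-d", "--dataset", default="ToolLens",
                        help="dataset name, e.g. ToolLens")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.gpu, args.model, args.dataset)
```

For example, `python train.py -g 1 -d ToolLens` would select GPU 1 and the ToolLens dataset under this sketch.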

## Environment

Our experimental environment is shown below:

    numpy version: 1.21.6
    pandas version: 1.3.5
    torch version: 1.13.1
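To reproduce this environment, pinning the versions above with pip should work; note that the exact `torch` wheel (CPU vs. a specific CUDA build) depends on your hardware, so this command is a starting point rather than a guaranteed setup:

```shell
# Pin the package versions listed above (CPU build of torch by default;
# pick a CUDA-specific wheel from pytorch.org if training on GPU).
pip install numpy==1.21.6 pandas==1.3.5 torch==1.13.1
```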

## Semantic Learning

For the first phase of semantic learning, you need to install BEIR, a widely used information retrieval framework. You can find the BEIR code repository at this link.
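BEIR is published on PyPI, so a plain pip install is the simplest way to get it; if you need the development version, installing from the GitHub repository linked above is an alternative:

```shell
# Install the BEIR information retrieval framework from PyPI.
pip install beir
```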

## Citation

If you find our code or work useful for your research, please cite our work.

    @inproceedings{qu2024towards,
      title={Towards Completeness-Oriented Tool Retrieval for Large Language Models},
      author={Qu, Changle and Dai, Sunhao and Wei, Xiaochi and Cai, Hengyi and Wang, Shuaiqiang and Yin, Dawei and Xu, Jun and Wen, Ji-Rong},
      booktitle={Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
      pages={1930--1940},
      year={2024}
    }