This is the implementation of the CIKM 2024 paper: Towards Completeness-Oriented Tool Retrieval for Large Language Models.
- [2024/8/17] The processed ToolBench dataset and checkpoints of first-stage semantic learning are released! Please find the processed datasets and checkpoints on HuggingFace.
- [2024/7/17] Our code and the ToolLens dataset are released.
- [2024/7/16] COLT is accepted by CIKM 2024.
- [2024/5/25] Our paper is released.
- Download the PLMs from Hugging Face and place them in a folder named PLMs (see the download sketch after this list)
- ANCE: The PLM and its description are available here.
- TAS-B: The PLM and its description are available here.
- coCondenser: The PLM and its description are available here.
- Contriever: The PLM and its description are available here.
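A minimal download sketch, assuming the huggingface_hub library is installed; the repo id and local folder name below are illustrative placeholders and may differ from the links above:

```python
# Minimal sketch, assuming huggingface_hub is installed.
# repo_id and local_dir are placeholders; use the model links above
# and keep every downloaded PLM under the PLMs/ folder.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="facebook/contriever",   # example: Contriever; repeat for ANCE, TAS-B, coCondenser
    local_dir="PLMs/contriever",     # local folder inside PLMs/
)
```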
- Run Semantic Learning:
python train_sbert.py
- Run Collaborative Learning:
python train.py -g 0 -m COLT -d ToolLens
You can specify the GPU id, the model, and the dataset via command-line arguments.
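As a rough illustration only (the actual argument parsing in train.py is not shown here and may differ), a command-line interface matching the flags above could look like this:

```python
# Hypothetical sketch of a CLI matching the -g / -m / -d flags above;
# the real train.py may define its arguments differently.
import argparse

parser = argparse.ArgumentParser(description="Second-stage collaborative learning")
parser.add_argument("-g", "--gpu", type=int, default=0, help="GPU id")
parser.add_argument("-m", "--model", type=str, default="COLT", help="model name")
parser.add_argument("-d", "--dataset", type=str, default="ToolLens",
                    help="dataset name, e.g. ToolLens or ToolBench")
args = parser.parse_args()
print(f"Training {args.model} on {args.dataset} (GPU {args.gpu})")
```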
Our experimental environment is shown below:
numpy version: 1.21.6
pandas version: 1.3.5
torch version: 1.13.1
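An optional sanity check that your local environment matches the versions listed above (these are the versions we ran, not strict requirements):

```python
# Print installed versions to compare against the environment listed above.
import numpy
import pandas
import torch

print("numpy :", numpy.__version__)   # environment above: 1.21.6
print("pandas:", pandas.__version__)  # environment above: 1.3.5
print("torch :", torch.__version__)   # environment above: 1.13.1
```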
For the first stage of semantic learning, you need to install BEIR, a widely used information retrieval framework. You can find the BEIR code repository at this link.
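As a rough illustration of BEIR-style dense retrieval evaluation (not the exact pipeline in train_sbert.py), the data folder, PLM path, and score function below are placeholders:

```python
# Sketch of dense retrieval evaluation with BEIR; paths are placeholders.
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Load a BEIR-format dataset (corpus.jsonl, queries.jsonl, qrels/) from disk.
corpus, queries, qrels = GenericDataLoader(data_folder="path/to/beir_format_data").load(split="test")

# Wrap a downloaded PLM as a dense retriever and run exact-search retrieval.
model = DRES(models.SentenceBERT("PLMs/contriever"), batch_size=64)
retriever = EvaluateRetrieval(model, score_function="dot")
results = retriever.retrieve(corpus, queries)

# Standard retrieval metrics at BEIR's default cutoffs.
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
```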
If you find our code or work useful for your research, please cite our paper:
@inproceedings{qu2024towards,
  title={Towards Completeness-Oriented Tool Retrieval for Large Language Models},
  author={Qu, Changle and Dai, Sunhao and Wei, Xiaochi and Cai, Hengyi and Wang, Shuaiqiang and Yin, Dawei and Xu, Jun and Wen, Ji-Rong},
  booktitle={Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
  pages={1930--1940},
  year={2024}
}