
[Model] Support Qwen2.5 Instruct #1543

Merged 1 commit on Sep 19, 2024
3 changes: 2 additions & 1 deletion README.md
@@ -59,7 +59,8 @@ Just like a compass guides us on our journey, OpenCompass will guide you through

## 🚀 What's New <a><img width="35" height="20" src="https://user-images.githubusercontent.com/12782558/212848161-5e783dd6-11e8-4fe0-bbba-39ffb77730be.png"></a>

- **\[2024.09.05\]** We now support OpenAI o1(`o1-mini-2024-09-12` and `o1-preview-2024-09-12`). Feel free to give them a try! 🔥🔥🔥
- **\[2024.09.19\]** We now support [Qwen2.5](https://huggingface.co/Qwen) (0.5B to 72B) with multiple inference backends (huggingface/vllm/lmdeploy). Feel free to give them a try! 🔥🔥🔥
- **\[2024.09.17\]** We now support OpenAI o1(`o1-mini-2024-09-12` and `o1-preview-2024-09-12`). Feel free to give them a try! 🔥🔥🔥
- **\[2024.09.05\]** We now support answer extraction through model post-processing to provide a more accurate representation of the model's capabilities. As part of this update, we have integrated [XFinder](https://github.com/IAAR-Shanghai/xFinder) as our first post-processing model. For more detailed information, please refer to the [documentation](opencompass/utils/postprocessors/xfinder/README.md), and give it a try! 🔥🔥🔥
- **\[2024.08.20\]** OpenCompass now supports the [SciCode](https://github.com/scicode-bench/SciCode): A Research Coding Benchmark Curated by Scientists. 🔥🔥🔥
- **\[2024.08.16\]** OpenCompass now supports the brand new long-context language model evaluation benchmark — [RULER](https://arxiv.org/pdf/2404.06654). RULER provides an evaluation of long-context including retrieval, multi-hop tracing, aggregation, and question answering through flexible configurations. Check out the [RULER](configs/datasets/ruler/README.md) evaluation config now! 🔥🔥🔥
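The news entries above list three inference backends; in this PR each Qwen2.5 size gets one config file per backend. A minimal sketch of mapping a size and backend to the config module name — the naming pattern is inferred from the files added here (e.g. `hf_qwen2_5_7b_instruct.py`), not an official OpenCompass API:

```python
# Map a Qwen2.5 model size and inference backend to the config module
# name used in this PR. The naming pattern is inferred from the files
# added here (e.g. hf_qwen2_5_0_5b_instruct.py), not an official API.

PREFIXES = {
    'huggingface': 'hf',
    'lmdeploy': 'lmdeploy',
    'vllm': 'vllm',
}

def config_name(size: str, backend: str) -> str:
    """Return e.g. 'lmdeploy_qwen2_5_7b_instruct' for ('7b', 'lmdeploy')."""
    prefix = PREFIXES[backend]
    size_part = size.lower().replace('.', '_')  # '0.5b' -> '0_5b'
    return f'{prefix}_qwen2_5_{size_part}_instruct'

print(config_name('0.5b', 'huggingface'))  # hf_qwen2_5_0_5b_instruct
```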
1 change: 1 addition & 0 deletions README_zh-CN.md
@@ -59,6 +59,7 @@

## 🚀 最新进展 <a><img width="35" height="20" src="https://user-images.githubusercontent.com/12782558/212848161-5e783dd6-11e8-4fe0-bbba-39ffb77730be.png"></a>

- **\[2024.09.19\]** We now support [Qwen2.5](https://huggingface.co/Qwen) (0.5B to 72B) with multiple inference backends (huggingface/vllm/lmdeploy). Feel free to give them a try! 🔥🔥🔥
- **\[2024.09.05\]** We now support the OpenAI o1 models (`o1-mini-2024-09-12` and `o1-preview-2024-09-12`). Feel free to give them a try! 🔥🔥🔥
- **\[2024.09.05\]** OpenCompass now supports answer extraction through model post-processing to more accurately reflect model capabilities. As part of this update, we have integrated [XFinder](https://github.com/IAAR-Shanghai/xFinder) as our first post-processing model. For details, see the [documentation](opencompass/utils/postprocessors/xfinder/README.md), and give it a try! 🔥🔥🔥
- **\[2024.08.20\]** OpenCompass now supports [SciCode](https://github.com/scicode-bench/SciCode): A Research Coding Benchmark Curated by Scientists. 🔥🔥🔥
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_0_5b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-0.5b-instruct-hf',
        path='Qwen/Qwen2.5-0.5B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_14b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-14b-instruct-hf',
        path='Qwen/Qwen2.5-14B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=2),
    )
]
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_1_5b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-1.5b-instruct-hf',
        path='Qwen/Qwen2.5-1.5B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_32b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-32b-instruct-hf',
        path='Qwen/Qwen2.5-32B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=2),
    )
]
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_3b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-3b-instruct-hf',
        path='Qwen/Qwen2.5-3B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_72b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-72b-instruct-hf',
        path='Qwen/Qwen2.5-72B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=4),
    )
]
12 changes: 12 additions & 0 deletions configs/models/qwen2_5/hf_qwen2_5_7b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-7b-instruct-hf',
        path='Qwen/Qwen2.5-7B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
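The seven HuggingFace configs above are identical except for the model size and GPU count. A self-contained sketch of generating the same dicts with a factory function — the string `'HuggingFacewithChatTemplate'` stands in for the real class so the example runs without opencompass installed:

```python
# Factory reproducing the seven HuggingFace configs above. A sketch:
# the string 'HuggingFacewithChatTemplate' stands in for the real
# opencompass class so this runs standalone.

def hf_qwen2_5_instruct(size: str, num_gpus: int) -> dict:
    return dict(
        type='HuggingFacewithChatTemplate',
        abbr=f'qwen2.5-{size.lower()}-instruct-hf',
        path=f'Qwen/Qwen2.5-{size}-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=num_gpus),
    )

# Sizes and GPU counts as used in the files above.
models = [
    hf_qwen2_5_instruct(size, gpus)
    for size, gpus in [('0.5B', 1), ('1.5B', 1), ('3B', 1), ('7B', 1),
                       ('14B', 2), ('32B', 2), ('72B', 4)]
]

print(models[0]['abbr'])  # qwen2.5-0.5b-instruct-hf
```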
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_0_5b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-0.5b-instruct-turbomind',
        path='Qwen/Qwen2.5-0.5B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=1),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=1),
    )
]
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_14b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-14b-instruct-turbomind',
        path='Qwen/Qwen2.5-14B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=2),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=2),
    )
]
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_1_5b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-1.5b-instruct-turbomind',
        path='Qwen/Qwen2.5-1.5B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=1),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=1),
    )
]
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_32b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-32b-instruct-turbomind',
        path='Qwen/Qwen2.5-32B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=2),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=2),
    )
]
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_3b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-3b-instruct-turbomind',
        path='Qwen/Qwen2.5-3B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=1),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=1),
    )
]
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_72b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-72b-instruct-turbomind',
        path='Qwen/Qwen2.5-72B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=4),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=4),
    )
]
15 changes: 15 additions & 0 deletions configs/models/qwen2_5/lmdeploy_qwen2_5_7b_instruct.py
@@ -0,0 +1,15 @@
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='qwen2.5-7b-instruct-turbomind',
        path='Qwen/Qwen2.5-7B-Instruct',
        engine_config=dict(session_len=16384, max_batch_size=16, tp=1),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9, max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=1),
    )
]
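The lmdeploy configs above all set `gen_config=dict(top_k=1, temperature=1e-6, ...)`, which makes sampling effectively greedy: with a near-zero temperature the softmax collapses onto the highest logit, and `top_k=1` keeps only that token. A small numeric sketch of the temperature effect:

```python
import math

# With temperature ~1e-6, softmax puts essentially all probability on
# the argmax logit, so sampling degenerates to greedy decoding.

def softmax(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 2.1, 1.9]
probs = softmax(logits, temperature=1e-6)
print(probs.index(max(probs)))  # 1  (the argmax token wins)
```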
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_0_5b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-0.5b-instruct-vllm',
        path='Qwen/Qwen2.5-0.5B-Instruct',
        model_kwargs=dict(tensor_parallel_size=1, gpu_memory_utilization=0.5),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=1),
    )
]
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_14b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-14b-instruct-vllm',
        path='Qwen/Qwen2.5-14B-Instruct',
        model_kwargs=dict(tensor_parallel_size=2),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=2),
    )
]
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_1_5b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-1.5b-instruct-vllm',
        path='Qwen/Qwen2.5-1.5B-Instruct',
        model_kwargs=dict(tensor_parallel_size=1, gpu_memory_utilization=0.5),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=1),
    )
]
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_32b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-32b-instruct-vllm',
        path='Qwen/Qwen2.5-32B-Instruct',
        model_kwargs=dict(tensor_parallel_size=2),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=2),
    )
]
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_3b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-3b-instruct-vllm',
        path='Qwen/Qwen2.5-3B-Instruct',
        model_kwargs=dict(tensor_parallel_size=1, gpu_memory_utilization=0.5),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=1),
    )
]
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_72b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-72b-instruct-vllm',
        path='Qwen/Qwen2.5-72B-Instruct',
        model_kwargs=dict(tensor_parallel_size=4),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=4),
    )
]
14 changes: 14 additions & 0 deletions configs/models/qwen2_5/vllm_qwen2_5_7b_instruct.py
@@ -0,0 +1,14 @@
from opencompass.models import VLLMwithChatTemplate

models = [
    dict(
        type=VLLMwithChatTemplate,
        abbr='qwen2.5-7b-instruct-vllm',
        path='Qwen/Qwen2.5-7B-Instruct',
        model_kwargs=dict(tensor_parallel_size=1),
        max_out_len=4096,
        batch_size=16,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=1),
    )
]
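In the vLLM configs above, `model_kwargs['tensor_parallel_size']` always agrees with `run_cfg['num_gpus']` — a mismatch would request more (or fewer) GPUs than the engine shards across. A small consistency check one might run over such configs (a sketch, not part of OpenCompass itself):

```python
# Check that a vLLM model config's tensor_parallel_size matches the
# number of GPUs requested in run_cfg. A sketch for illustration only.

def check_tp_matches_gpus(model_cfg: dict) -> bool:
    tp = model_cfg.get('model_kwargs', {}).get('tensor_parallel_size', 1)
    gpus = model_cfg.get('run_cfg', {}).get('num_gpus', 1)
    return tp == gpus

cfg = dict(
    abbr='qwen2.5-72b-instruct-vllm',
    path='Qwen/Qwen2.5-72B-Instruct',
    model_kwargs=dict(tensor_parallel_size=4),
    run_cfg=dict(num_gpus=4),
)
print(check_tp_matches_gpus(cfg))  # True
```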
12 changes: 12 additions & 0 deletions opencompass/configs/models/qwen2_5/hf_qwen2_5_0_5b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-0.5b-instruct-hf',
        path='Qwen/Qwen2.5-0.5B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
12 changes: 12 additions & 0 deletions opencompass/configs/models/qwen2_5/hf_qwen2_5_14b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-14b-instruct-hf',
        path='Qwen/Qwen2.5-14B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=2),
    )
]
12 changes: 12 additions & 0 deletions opencompass/configs/models/qwen2_5/hf_qwen2_5_1_5b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-1.5b-instruct-hf',
        path='Qwen/Qwen2.5-1.5B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
12 changes: 12 additions & 0 deletions opencompass/configs/models/qwen2_5/hf_qwen2_5_32b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-32b-instruct-hf',
        path='Qwen/Qwen2.5-32B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=2),
    )
]
12 changes: 12 additions & 0 deletions opencompass/configs/models/qwen2_5/hf_qwen2_5_3b_instruct.py
@@ -0,0 +1,12 @@
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='qwen2.5-3b-instruct-hf',
        path='Qwen/Qwen2.5-3B-Instruct',
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]