
Error for AGIEval when using fewshot #2323

Open
BaohaoLiao opened this issue Sep 19, 2024 · 1 comment
Labels
bug (Something isn't working.) · validation (For validation of task implementations.)

Comments

BaohaoLiao commented Sep 19, 2024

Hi, I get the following error when evaluating AGIEval with num_fewshot=3. Everything works fine with 0-shot.

2024-09-19:13:03:47,189 DEBUG    [cache.py:33] requests-agieval_jec_qa_kd-3shot-rank0-world_size1-tokenizer is not cached, generating...
2024-09-19:13:03:47,189 INFO     [task.py:423] Building contexts for agieval_jec_qa_kd on rank 0...
 12%|████████████▍                                                                                            | 119/1000 [00:00<00:01, 480.91it/s]
Traceback (most recent call last):
  File "/home/baliao/.conda/envs/cluster/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/baliao/.conda/envs/cluster/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/__main__.py", line 468, in <module>
    cli_evaluate()
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/__main__.py", line 389, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/evaluator.py", line 301, in simple_evaluate
    results = evaluate(
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/evaluator.py", line 420, in evaluate
    task.build_all_requests(
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/api/task.py", line 446, in build_all_requests
    fewshot_ctx = self.fewshot_context(
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/utils.py", line 397, in _wrapper
    return fn(*args, **kwargs)
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/api/task.py", line 1088, in fewshot_context
    labeled_examples += self.sampler.get_context(doc, num_fewshot)
  File "/data/chatgpt/data/baliao/cluster/04_lm_eval/lm-evaluation-harness/lm_eval/api/samplers.py", line 89, in get_context
    str(doc_target[0])
IndexError: list index out of range
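
For context, the failing call takes the first element of the per-document target list, so any row whose answer field resolves to an empty list hits exactly this error. A minimal illustration of the failure mode (the variable name comes from the traceback above; the empty list stands in for a row with a missing answer):

doc_target = []             # target list for a row whose answer field is missing
label = str(doc_target[0])  # IndexError: list index out of range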

Here is how I run the code:

MODEL=/path/to/Llama-2-7b-hf

accelerate launch --num_processes 1 -m lm_eval \
    --model hf \
    --model_args pretrained=$MODEL,trust_remote_code=True \
    --batch_size 16 \
    --verbosity DEBUG \
    --tasks agieval \
    --num_fewshot 3

Version: lm_eval 0.4.4

baberabb (Contributor) commented Sep 19, 2024

Hi! The dataset we are using is missing a fewshot split, so the test split is used for the fewshot samples, and it looks like one of the rows in agieval_jec_qa_kd is missing the answer field. We have logic to handle that when it's an evaluation question, but not when it's a fewshot example. I'll look into it!
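
Until a proper fix lands, one possible shape for it is a guard in lm_eval/api/samplers.py that skips fewshot documents whose target is empty instead of indexing into it. A rough sketch only (untested; the surrounding names are illustrative, not the exact 0.4.4 code):

# Sketch of a guard inside the sampler's get_context loop (names illustrative).
for fewshot_doc in selected_docs:
    doc_target = self.doc_to_target(fewshot_doc)
    # A row with a missing/empty answer field yields an empty list here;
    # skip it rather than indexing blindly with doc_target[0].
    if isinstance(doc_target, list) and len(doc_target) == 0:
        continue
    target_str = str(doc_target[0]) if isinstance(doc_target, list) else str(doc_target)
    labeled_examples += self.doc_to_text(fewshot_doc) + self.target_delimiter + target_str

One caveat: silently dropping a sample leaves fewer than num_fewshot examples in the context, so the actual fix would probably want to log a warning or resample a replacement document instead.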

baberabb added the bug and validation labels on Sep 23, 2024