
[BUG/Help] <Out of range: piece id is out of range.> #438

Closed
1 task done
LiuChen19960902 opened this issue Apr 7, 2023 · 22 comments

Comments

@LiuChen19960902

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

This morning I updated the model and several .py files following the update on HF, and then this error started appearing.

Expected Behavior

This morning I updated the model and several .py files following the update on HF, and then this error started appearing.

Steps To Reproduce


Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

@natureLanguageQing
[image]
I ran into this problem too.

@duzx16
Member

duzx16 commented Apr 7, 2023

If you are hitting this problem in ptuning, please pull the latest code of this repository again.

@FeiWard

FeiWard commented Apr 8, 2023

Still the same error after re-pulling.

@zy86603465
I am running into the same problem. How can it be solved?

@genius0182

@zy86603465
@genius0182
I changed them all to
"bos_token_id": 130004,
"eos_token_id": 130005,
"mask_token_id": 130000,
"gmask_token_id": 130001,
and tried again, but it still throws the original error.
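A quick way to sanity-check ids like these is to compare them against the tokenizer vocabulary size. The helper below is hypothetical (not part of the repo), and 130528 is assumed here from the ChatGLM-6B config.json; substitute the `vocab_size()` of your actual SentencePiece model if it differs:

```python
# Hypothetical helper: verify that the special-token ids configured in
# config.json fall inside the tokenizer vocabulary [0, vocab_size).
def check_special_ids(config, vocab_size):
    out_of_range = {}
    for key in ("bos_token_id", "eos_token_id", "mask_token_id", "gmask_token_id"):
        idx = config.get(key)
        if idx is not None and not (0 <= idx < vocab_size):
            out_of_range[key] = idx
    return out_of_range

# The ids from the comment above, checked against an assumed vocab size.
config = {
    "bos_token_id": 130004,
    "eos_token_id": 130005,
    "mask_token_id": 130000,
    "gmask_token_id": 130001,
}
print(check_special_ids(config, 130528))  # {} means every id is in range
```

An empty result only rules out the config ids themselves; the thread later shows the error can also come from ids produced during decoding.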

@Data2Me
Data2Me commented Apr 11, 2023

I ran into this problem too.

@zy86603465
I solved this problem: update all the model files, the config, and the code, and it works.

@duzx16
Member

duzx16 commented Apr 12, 2023

> I noticed that the data here in https://huggingface.co/THUDM/chatglm-6b/blob/main/configuration_chatglm.py [image] and here in https://huggingface.co/THUDM/chatglm-6b/blob/main/config.json [image] are different. Not sure whether that is the problem. @duzx16

The first image shows the default values; the second shows the values actually set. They do not conflict.

@kingtigerc
I pulled the code and the model fresh today and this problem also occurred. What is causing it?

@kingtigerc
Re-pulled the repo, re-downloaded the model, and re-ran pip install -r requirements.txt; that solved it.

@lrx1213
lrx1213 commented Apr 14, 2023

Ran into the same problem.

@JimXiongGM
Ran into the same problem. It looks like a bug in tokenizer decoding, but on a second generation it sometimes does not occur.

You can try wrapping the call in a try/except to work around it:

  retry_cnt = 0
  while retry_cnt < 5:
      try:
          response, history = model.chat(
              tokenizer,
              query,
              history=[],
              max_length=512,
              num_beams=5,
              do_sample=True,
              top_p=0.7,
              temperature=0.95,
          )
          break
      except Exception:
          # "piece id is out of range" lands here; retry the generation
          retry_cnt += 1
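The same retry pattern can be factored into a small helper. The `retry` function below is a hypothetical sketch, written under the assumption that any raised exception is transient (as it is for this intermittent decode error); a flaky callable stands in for `model.chat`:

```python
def retry(fn, attempts=5):
    # Call fn until it succeeds or attempts are exhausted;
    # re-raise the last exception if every attempt fails.
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
    raise last_exc

# Stand-in for a model.chat call that fails twice, then succeeds.
state = {"calls": 0}
def flaky_chat():
    state["calls"] += 1
    if state["calls"] < 3:
        raise IndexError("Out of range: piece id is out of range.")
    return "ok"

print(retry(flaky_chat))  # ok (succeeds on the third attempt)
```

Note this only papers over the symptom; the root-cause fix in the tokenizer is discussed further down the thread.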

@zhangyuanscall
The decode source code really does have a bug. With the same checkpoint I tried several sets of decoding parameters; some work fine and some trigger the error.

@ysanimals
I fine-tuned with code from another repository. How should I fix this decoding problem?

@Doufanfan
+1, how do we fix this problem in the decode source code...

@godcrying
So why was this issue closed?

@ryzn0518
> So why was this issue closed?

+1, I ran into it too.

@yecphaha
See https://huggingface.co/THUDM/chatglm3-6b/commit/ea563876364622a0a5c24e6b71db0b93a9861ba0#d2h-069285
and add the two lines of code in tokenization_chatglm.py.
[image: Snipaste_2023-11-21_17-42-58]

@ayrnb

ayrnb commented Nov 30, 2023

+1

@yecphaha
yecphaha commented Dec 1, 2023

+1

See https://huggingface.co/THUDM/chatglm3-6b/commit/ea563876364622a0a5c24e6b71db0b93a9861ba0#d2h-069285
and add the two lines of code in tokenization_chatglm.py.

@greyovo
greyovo commented Dec 20, 2023

> See https://huggingface.co/THUDM/chatglm3-6b/commit/ea563876364622a0a5c24e6b71db0b93a9861ba0#d2h-069285 and add the two lines of code in tokenization_chatglm.py. [image: Snipaste_2023-11-21_17-42-58]

This works. To be more specific, for ChatGLM-6B the change goes in tokenization_chatglm.py:

class TextTokenizer:
    def __init__(self, model_path):
        self.sp = spm.SentencePieceProcessor()
        self.sp.Load(model_path)
        self.num_tokens = self.sp.vocab_size()

    # ... other methods omitted

    def convert_id_to_token(self, idx):
        if idx < 0 or idx >= self.num_tokens:  # add this bounds check
            return ""
        return self.sp.IdToPiece(idx)
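The effect of that guard can be checked in isolation. The `FakeSP` class below is a hypothetical stand-in for `spm.SentencePieceProcessor` (illustration only, not the real library), and `convert_id_to_token` mirrors the patched method above as a free function:

```python
class FakeSP:
    # Minimal stand-in for spm.SentencePieceProcessor, for illustration only.
    def __init__(self, pieces):
        self.pieces = pieces

    def vocab_size(self):
        return len(self.pieces)

    def IdToPiece(self, idx):
        # Real SentencePiece raises "Out of range: piece id is out of range."
        if not 0 <= idx < len(self.pieces):
            raise IndexError("piece id is out of range.")
        return self.pieces[idx]

def convert_id_to_token(sp, idx):
    # Same bounds check as the patched method: out-of-range ids become "".
    if idx < 0 or idx >= sp.vocab_size():
        return ""
    return sp.IdToPiece(idx)

sp = FakeSP(["<s>", "</s>", "hello"])
print(convert_id_to_token(sp, 2))    # hello
print(convert_id_to_token(sp, 999))  # empty string instead of an exception
```

Mapping a bad id to an empty string drops it silently from the decoded text, which matches the behavior of the linked ChatGLM3 commit rather than failing the whole generation.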
