I really don't know how to fix this. Whether the proxy is on or off, I get the same OSError. If anyone knows how to solve it, please help. The code and full traceback are below (a possible diagnostic is sketched after the traceback).
```python
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
```
```
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File c:\Users\钰\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\tokenization_utils_base.py:2260, in PreTrainedTokenizerBase._from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, token, cache_dir, local_files_only, _commit_hash, _is_local, *init_inputs, **kwargs)
   2259 try:
-> 2260     tokenizer = cls(*init_inputs, **init_kwargs)
   2261 except OSError:

File ~\.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\b098244a71fbe69ce149682d9072a7629f7e908c\tokenization_chatglm.py:109, in ChatGLMTokenizer.__init__(self, vocab_file, padding_side, clean_up_tokenization_spaces, encode_special_tokens, **kwargs)
    108 self.vocab_file = vocab_file
--> 109 self.tokenizer = SPTokenizer(vocab_file)
    110 self.special_tokens = {
    111     "<bos>": self.tokenizer.bos_id,
    112     "<eos>": self.tokenizer.eos_id,
    113     "<unk>": self.tokenizer.pad_id,
    114     "<pad>": self.tokenizer.pad_id
    115 }

File ~\.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\b098244a71fbe69ce149682d9072a7629f7e908c\tokenization_chatglm.py:18, in SPTokenizer.__init__(self, model_path)
     17 assert os.path.isfile(model_path), model_path
---> 18 self.sp_model = SentencePieceProcessor(model_file=model_path)
     20 # BOS / EOS token IDs

File c:\Users\钰\AppData\Local\Programs\Python\Python310\lib\site-packages\sentencepiece\__init__.py:447, in SentencePieceProcessor.Init(self, model_file, model_proto, out_type, add_bos, add_eos, reverse, emit_unk_piece, enable_sampling, nbest_size, alpha, num_threads)
    446 if model_file or model_proto:
--> 447     self.Load(model_file=model_file, model_proto=model_proto)
...
   2269     "Special tokens have been added in the vocabulary, make sure the associated word embeddings are"
   2270     " fine-tuned or trained."
   2271 )

OSError: Unable to load vocabulary from file. Please check that the provided vocabulary is accessible and not corrupted.
```
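One thing worth checking, assuming the OSError comes from an incomplete or corrupted `tokenizer.model` in the local Hugging Face cache (for example a bare Git-LFS pointer or a download interrupted by the proxy), rather than from the code itself: inspect the cached file and force a clean re-download. This is only a sketch under that assumption; the 100 kB size threshold below is a rough heuristic, not a value confirmed anywhere in this thread.

```python
# Diagnostic sketch (assumption: the OSError is caused by a truncated or
# corrupted tokenizer.model in the local Hugging Face cache).
import os
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "THUDM/chatglm3-6b"

# Resolve (or download) just the SentencePiece vocabulary file and check its size.
path = hf_hub_download(repo_id=repo_id, filename="tokenizer.model")
size = os.path.getsize(path)
print(path, size, "bytes")

# An interrupted download or a bare Git-LFS pointer is only a few hundred bytes,
# while the real tokenizer.model is on the order of 1 MB, so a very small file
# suggests a broken download. 100 kB is a heuristic cutoff (assumption).
if size < 100_000:
    path = hf_hub_download(repo_id=repo_id, filename="tokenizer.model",
                           force_download=True)

# Retry the original call; it resolves the vocabulary from the same cache.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
```

If that still fails, another option is to download the whole repository to a local folder and pass that directory to `from_pretrained`, which makes it easier to verify by hand that `tokenizer.model` is complete.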