While fine-tuning, I tried to reproduce the ADGEN dataset task, and this error occurred while running `bash train.sh`.

Running

```python
import torch
print(torch.cuda.is_available())
```

prints `True`.
```
C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\transformers\optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
input_ids [5, 65421, 61, 67329, 32, 98339, 61, 72043, 32, 65347, 61, 70872, 32, 69768, 61, 68944, 32, 67329, 64103, 61, 96914, 130001, 130004, 5, 87052, 96914, 81471, 64562, 65759, 64493, 64988, 6, 65840, 65388, 74531, 63825, 75786, 64009, 63823, 65626, 63882, 64619, 65388, 6, 64480, 65604, 85646, 110945, 10, 64089, 65966, 87052, 67329, 65544, 6, 71964, 70533, 64417, 63862, 89978, 63991, 63823, 77284, 88473, 64219, 63848, 112012, 6, 71231, 65099, 71252, 66800, 85768, 64566, 64338, 100323, 75469, 63823, 117317, 64218, 64257, 64051, 74197, 6, 63893, 130005, 3, 3, ... (run of padding 3s elided)]
inputs [Chinese sample text unreadable: mis-decoded by the Windows console]
label_ids [-100, -100, ... (22 masked prompt positions), 130004, 5, 87052, 96914, 81471, 64562, 65759, 64493, 64988, 6, 65840, 65388, 74531, 63825, 75786, 64009, 63823, 65626, 63882, 64619, 65388, 6, 64480, 65604, 85646, 110945, 10, 64089, 65966, 87052, 67329, 65544, 6, 71964, 70533, 64417, 63862, 89978, 63991, 63823, 77284, 88473, 64219, 63848, 112012, 6, 71231, 65099, 71252, 66800, 85768, 64566, 64338, 100323, 75469, 63823, 117317, 64218, 64257, 64051, 74197, 6, 63893, 130005, -100, -100, ... (run of padding -100s elided)]
labels [Chinese sample text unreadable: mis-decoded by the Windows console]
  0%|          | 0/3000 [00:00<?, ?it/s]
03/23/2024 23:23:53 - WARNING - transformers_modules.chatglm-6b-int4.modeling_chatglm - `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
Traceback (most recent call last):
  File "D:\GLM\ChatGLM-6B-main\ptuning\main.py", line 430, in <module>
    main()
  File "D:\GLM\ChatGLM-6B-main\ptuning\main.py", line 369, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 1635, in train
    return inner_training_loop(
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 1904, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 2647, in training_step
    loss = self.compute_loss(model, inputs)
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 2679, in compute_loss
    outputs = model(**inputs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 1190, in forward
    transformer_outputs = self.transformer(
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 930, in forward
    past_key_values = self.get_prompt(batch_size=input_ids.shape[0], device=input_ids.device,
  File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 878, in get_prompt
    past_key_values = self.dropout(past_key_values)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\dropout.py", line 58, in forward
    return F.dropout(input, self.p, self.training, self.inplace)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\functional.py", line 1266, in dropout
    return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half'
  0%|          | 0/3000 [00:00<?, ?it/s]
```
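The failing call at the bottom of the trace can be reduced to a few lines. A minimal sketch, assuming a half-precision tensor that lives on the CPU (which is what the prefix encoder produces here when that part of the model is not on the GPU):

```python
import torch
import torch.nn.functional as F

# Dropout has to sample a Bernoulli mask; on the PyTorch builds in this
# thread, that CPU kernel is not implemented for float16.
x = torch.zeros(8, dtype=torch.half)  # half-precision tensor on CPU

try:
    F.dropout(x, p=0.1, training=True)
except RuntimeError as e:
    # e.g. RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half'
    print(type(e).__name__, e)

# Workaround: run the op in float32, then cast the result back to half.
y = F.dropout(x.float(), p=0.1, training=True).half()
assert y.dtype == torch.half
```

The same operation works fine for float16 tensors on a CUDA device, which is why this error usually means some module ended up on the CPU.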
Steps to reproduce: put the ADGEN dataset folder into the `ptuning` folder, then run `bash train.sh` inside `ptuning`; the error appears.
- OS: Windows 11
- Python: 3.10
- Transformers: 4.27.1
- PyTorch: 2.2.1+cu121
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True
Is it that my machine simply can't run this?
I think you should first tell us your GPU model and how much VRAM it has, and also check whether your card supports model quantization (I recall the README in the repo root mentions this).

The default config quantizes to int4, so the VRAM requirement is very low, and your error is not an OOM, so running out of VRAM can probably be ruled out (at least not at the point where this error is raised).

My suggestion is to change the quantization parameter to fp16 (or just delete it). Without quantization the model only uses more VRAM, and it runs much faster: first, loading skips the quantization step; second, fp16 is the fastest for training and inference (in my tests, training time was fp16 << int4 < int8).
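A hypothetical excerpt of `ptuning/train.sh` to illustrate (the argument names here are assumptions; check your copy of the script for the exact list, the point is the final flag):

```shell
torchrun --standalone --nnodes=1 --nproc-per-node=1 main.py \
    --do_train \
    --model_name_or_path THUDM/chatglm-6b \
    --output_dir output/adgen-pt \
    --pre_seq_len 128 \
    --fp16 \
    --quantization_bit 4    # delete this line to skip int quantization
```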
By the way, my setup is 4× Tesla T4 (16 GB each); it can run all of the p-tuning tasks, but full-parameter fine-tuning runs out of VRAM. Software versions:

- Python: 3.9.19
- Transformers: 4.27.1
- PyTorch: 1.13.1+cu116
- CUDA: 11.6

The server can't be updated, and another fine-tuning environment needs transformers>=4.30, so I spent ages resolving that dependency hell, which is why I remember these versions so vividly.
If nothing else works, you can try matching my configuration exactly; never mind the details, get it running first.

Also, I run this on Linux; you might try finding a server too.
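Pinning an environment like that would look roughly like this (the cu116 wheel index is an assumption matching CUDA 11.6; swap in the index for your own CUDA version):

```shell
pip install "transformers==4.27.1"
pip install "torch==1.13.1+cu116" --extra-index-url https://download.pytorch.org/whl/cu116
```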
Check whether the code you are using is up to date. The error says a scalar operation has no half-precision implementation. If the latest code still fails the same way, you can try removing the `half()` half-precision casts from the failing code path. But if you modify the code, the VRAM needed will grow, and the int quantization steps may also change as a result, so I don't recommend editing code you don't understand, nor running int quantization after such modifications. #462
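Rather than stripping every `half()` call, a narrower patch is to upcast only where the missing CPU kernel bites. A sketch (`SafeDropout` is a made-up name, not part of the repo; the idea is to swap it in for the dropout that `get_prompt` applies to `past_key_values` in `modeling_chatglm.py`, where the traceback ends):

```python
import torch
from torch import nn


class SafeDropout(nn.Dropout):
    """Dropout that tolerates float16 tensors on CPU by upcasting.

    Hypothetical workaround: the Bernoulli-mask kernel used by dropout is
    not implemented for float16 on CPU in the PyTorch builds discussed in
    this thread, so we run the op in float32 and cast the result back.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.device.type == "cpu" and x.dtype == torch.half:
            return super().forward(x.float()).half()
        return super().forward(x)


# Quick check: a CPU half tensor survives a training-mode forward pass.
drop = SafeDropout(p=0.1)
drop.train()
out = drop(torch.ones(16, dtype=torch.half))
assert out.dtype == torch.half
```

This keeps GPU behavior identical (the CUDA path still runs in fp16) and only pays the float32 cost on CPU.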