Please tell me why I am reporting a cuda memory error when I perform data preprocessing according to your operation. Please help me. I will be grateful.
#4 · Open · yuanpengpeng opened this issue on Jan 10, 2024 · 2 comments
```
(python310Tigre) yuanpeng@DESKTOP-6RR65PH:/mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/scripts$ ./test.sh
Namespace(name='dif-net', epoch=400, dst_list='knee_cbct', split='test', combine='mlp', num_views=10, view_offset=0, out_res=256, eval_npoint=100000, visualize=False)
mixed_dataset: ['knee_cbct']
output dst_name: knee_cbct
CBCT_dataset, name: knee_cbct, split: test, len: 1.
load ckpt from /mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/scripts/logs/dif-net/ep_400.pth
DIF_Net, mid_ch: 128, combine: mlp
Traceback (most recent call last):
  File "/mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/code/evaluate.py", line 120, in <module>
    metrics, results = eval_one_epoch(
  File "/mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/code/evaluate.py", line 29, in eval_one_epoch
    for item in loader:
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 676, in _next_data
    data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 63, in pin_memory
    return type(data)({k: pin_memory(sample, device) for k, sample in data.items()})  # type: ignore[call-arg]
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 63, in <dictcomp>
    return type(data)({k: pin_memory(sample, device) for k, sample in data.items()})  # type: ignore[call-arg]
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 58, in pin_memory
    return data.pin_memory(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```
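Note that the traceback dies inside `pin_memory`, i.e. while the DataLoader copies the batch into page-locked host memory for the CUDA device, before the model runs at all (and the `/mnt/d/...` prompt suggests WSL2, where pinned allocations are known to be tightly constrained). A quick way to confirm whether pinning is the trigger is to disable it when building the loader. This is a minimal sketch with a dummy dataset, since the actual dataset class in `evaluate.py` is not shown here:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in for the CBCT dataset (the real one returns dict batches;
# shapes here are hypothetical).
volumes = torch.zeros(2, 1, 8, 8, 8)
dataset = TensorDataset(volumes)

# pin_memory=True allocates page-locked host buffers tied to the CUDA
# context; pin_memory=False skips that allocation entirely, so an OOM
# raised from pin_memory.py can no longer occur at this stage.
loader = DataLoader(dataset, batch_size=1, pin_memory=False)

for (batch,) in loader:
    print(batch.shape)  # batches arrive as plain (unpinned) CPU tensors
```

If the error disappears with `pin_memory=False`, the pinned-memory copy, not the model itself, was the straw that broke the allocation; the durable fix is then to free memory elsewhere rather than to rely on unpinned transfers forever.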
It seems that your machine did not have enough memory (out of memory). To solve it, you may:

- reduce the number of views (`num_views`), e.g., set it to 6 or fewer;
- use a smaller batch size.
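Besides fewer views, evaluation memory can also be capped by querying the implicit field in chunks; the `Namespace` above already carries `eval_npoint=100000`, which suggests exactly this pattern. The following is a sketch with a stand-in network, not the repository's actual decoder, showing how only one chunk of query points stays live at a time:

```python
import torch

@torch.no_grad()  # inference only: no autograd buffers are kept
def eval_in_chunks(net, points, chunk=100_000):
    """Query `net` on `points` (N, 3) chunk by chunk so peak memory stays bounded."""
    outs = []
    for p in torch.split(points, chunk):
        outs.append(net(p))  # only `chunk` points are resident at once
    return torch.cat(outs)

# Stand-in for the intensity decoder; the real DIF-Net combine='mlp' head differs.
net = torch.nn.Linear(3, 1)
coords = torch.rand(256**2, 3)  # e.g. one 256x256 slice of query points
values = eval_in_chunks(net, coords, chunk=10_000)
print(values.shape)  # torch.Size([65536, 1])
```

Shrinking `chunk` (here, `eval_npoint`) trades a little speed for a proportional drop in peak activation memory, independent of `num_views`.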
But I am on a 3090 Ti machine with 24 GB of video memory. I preprocessed the data to 256×256×256 as described in your paper; during loading the data type is reduced from float64 to float32, the spacing is 0.8×0.8×0.8, and the batch size is set to 1.
With `num_views` set to 10 the video memory limit is exceeded; the maximum I can set is 4. The dataset is the DIR-Lab lung dataset: 10 patients with 10 phases each, 100 volumes in total.
According to your paper's setting of 10 views, 24 GB should be enough, but in my tests it is not. My WeChat is abc01072131; if you can help me solve this, I will be grateful.
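For reference, the raw tensors quoted in this thread are small on their own. A back-of-envelope count (pure arithmetic, assuming float32 throughout; the 128 channels come from the `mid_ch: 128` log line, the per-view resolution of 256×256 is an assumption) shows the 24 GB must be going to intermediate activations or to the pinned-memory copy, not to the data itself:

```python
# All sizes in bytes; float32 = 4 bytes per element.
volume = 256**3 * 4             # one 256x256x256 volume
views  = 10 * 256**2 * 4        # ten 256x256 projections (assumed resolution)
feats  = 10 * 128 * 256**2 * 4  # hypothetical: one 128-channel feature map per view

print(f"volume: {volume / 2**20:.1f} MiB")  # 64.0 MiB
print(f"views:  {views  / 2**20:.1f} MiB")  # 2.5 MiB
print(f"feats:  {feats  / 2**30:.2f} GiB")  # 0.31 GiB
```

Since none of these come close to 24 GB, profiling the real peak with `torch.cuda.max_memory_allocated()` right before the crash would show where the budget actually goes; on WSL2, the pinned host-memory pool is also limited separately from device memory.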