Please tell me why I get a CUDA memory error when I preprocess the data following your instructions. Please help me; I will be grateful. #4

Open
yuanpengpeng opened this issue Jan 10, 2024 · 2 comments

Comments

@yuanpengpeng

Please tell me why I get a CUDA memory error when I preprocess the data following your instructions. Please help me; I will be grateful.

(python310Tigre) yuanpeng@DESKTOP-6RR65PH:/mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/scripts$ ./test.sh
Namespace(name='dif-net', epoch=400, dst_list='knee_cbct', split='test', combine='mlp', num_views=10, view_offset=0, out_res=256, eval_npoint=100000, visualize=False)
mixed_dataset: ['knee_cbct']
output dst_name: knee_cbct
CBCT_dataset, name: knee_cbct, split: test, len: 1.
load ckpt from /mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/scripts/logs/dif-net/ep_400.pth
DIF_Net, mid_ch: 128, combine: mlp
Traceback (most recent call last):
  File "/mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/code/evaluate.py", line 120, in <module>
    metrics, results = eval_one_epoch(
  File "/mnt/d/谷歌下载/DIF-Net-main/DIF-Net-main/code/evaluate.py", line 29, in eval_one_epoch
    for item in loader:
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 676, in _next_data
    data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 63, in pin_memory
    return type(data)({k: pin_memory(sample, device) for k, sample in data.items()})  # type: ignore[call-arg]
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 63, in <dictcomp>
    return type(data)({k: pin_memory(sample, device) for k, sample in data.items()})  # type: ignore[call-arg]
  File "/home/yuanpeng/conda/envs/python310Tigre/lib/python3.10/site-packages/torch/utils/data/_utils/pin_memory.py", line 58, in pin_memory
    return data.pin_memory(device)
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

@lyqun
Collaborator

lyqun commented Jan 15, 2024

It seems that your machine does not have enough memory (out of memory). To solve it, you may (a rough sketch of both changes is below):

  1. reduce the number of views (num_views), e.g., set it to 6 or fewer, or
  2. use a smaller batch size.
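
A minimal sketch of where those two knobs could live on the data-loading side. The function and variable names here are placeholders, not the actual names in code/evaluate.py, and the pin_memory note is only an observation based on where the traceback above fails:

# Hypothetical sketch; build_cbct_dataset and the loader settings are stand-ins, not the repo's real code.
from torch.utils.data import DataLoader

# Suggestion 1: build the test dataset with fewer views (the log above used num_views=10).
dataset = build_cbct_dataset(split="test", num_views=6)  # build_cbct_dataset is a placeholder name

# Suggestion 2: keep the evaluation batch as small as possible.
loader = DataLoader(
    dataset,
    batch_size=1,
    shuffle=False,
    num_workers=2,
    pin_memory=False,  # the traceback fails inside pin_memory; disabling pinned (page-locked) memory
                       # trades slower host-to-device copies for avoiding that allocation
)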

@yuanpengpeng
Author

It seems that your machine does not have enough memory (out of memory). To solve it, you may:

  1. reduce the number of views (num_views), e.g., set it to 6 or fewer, or
  2. use a smaller batch size.

But I am on a 3090 Ti machine with 24 GB of video memory. Following your paper, I preprocessed the data to 256×256×256 volumes; during data loading the data type is reduced from float64 to float32, the spacing is 0.8×0.8×0.8, and the batch size is set to 1.
Setting num_views to 10 exceeds the video memory limit; the largest value I can use is 4. The dataset is the DIR-Lab lung dataset: 10 patients with 10 phases each, 100 volumes in total.
With the 10-view setting from your paper, 24 GB should be enough, but in my tests it is not (a rough size estimate is sketched below). My WeChat is abc01072131; if you can help me solve it, I will be grateful.
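
A rough back-of-the-envelope estimate, using only the sizes that appear in the log and comments above (the projection resolution is not shown there, so it is left out):

# Rough size check; assumes float32 throughout, numbers taken from the Namespace/log above.
GiB = 1024 ** 3

volume_gib = 256 ** 3 * 4 / GiB     # one 256x256x256 float32 volume ≈ 0.0625 GiB
points_gib = 100_000 * 3 * 4 / GiB  # eval_npoint=100000 query coordinates ≈ 0.0011 GiB

print(f"volume:       {volume_gib:.4f} GiB")
print(f"query points: {points_gib:.4f} GiB")

# A single sample of this size is tiny compared with 24 GiB, which suggests the failure is tied to
# the pinned-memory (page-locked host) allocation shown in the traceback rather than to a single
# oversized tensor.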
