Unable to Run training.py ch11. CandidateInfo_List assertion error. #114

Open
evilgangsta opened this issue Apr 10, 2024 · 5 comments


AssertionError Traceback (most recent call last)
Cell In[7], line 1
----> 1 run('p2ch11.training.LunaTrainingApp', '--epochs=1')

Cell In[2], line 7
4 log.info("Running: {}({!r}).main()".format(app, argv))
6 app_cls = importstr(*app.rsplit('.', 1)) # <2>
----> 7 app_cls(argv).main()
9 log.info("Finished: {}.{!r}).main()".format(app, argv))

File ~/Documents/deeplearningwithpytorch/dlwpt-code-master/p2ch11/training.py:140, in LunaTrainingApp.main(self)
137 def main(self):
138 log.info("Starting {}, {}".format(type(self).__name__, self.cli_args))
--> 140 train_dl = self.initTrainDl()
141 val_dl = self.initValDl()
143 for epoch_ndx in range(1, self.cli_args.epochs + 1):

File ~/Documents/deeplearningwithpytorch/dlwpt-code-master/p2ch11/training.py:90, in LunaTrainingApp.initTrainDl(self)
89 def initTrainDl(self):
---> 90 train_ds = LunaDataset(
91 val_stride=10,
92 isValSet_bool=False,
93 )
95 batch_size = self.cli_args.batch_size
96 if self.use_cuda:

File ~/Documents/deeplearningwithpytorch/dlwpt-code-master/p2ch11/dsets.py:171, in LunaDataset.__init__(self, val_stride, isValSet_bool, series_uid, sortby_str)
169 elif val_stride > 0:
170 del self.candidateInfo_list[::val_stride]
--> 171 assert self.candidateInfo_list
173 if sortby_str == 'random':
174 random.shuffle(self.candidateInfo_list)

AssertionError:

Processor - i5-10500H 6 cores
GPU - GTX 1650
RAM - 16 GB

I downloaded the LUNA dataset to my external hard drive, while the code lives on my internal SSD. I am not certain whether that is causing the issue. If it is, how should I change my code? I am using the code from the GitHub repo as provided, and as far as I remember I have not modified anything.
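
For context, the traceback shows that the failing line only checks that the candidate list is non-empty after the validation items are removed. The following is a minimal sketch of that logic using a toy list; `split_training_candidates` is a hypothetical helper written for illustration, not a function from the repo, and the real list entries are candidate records rather than integers.

```python
# Toy stand-in for the list handling shown at dsets.py lines 169-171.
# split_training_candidates is a hypothetical name used for illustration only.
def split_training_candidates(candidates, val_stride):
    """Drop every val_stride-th item, then require a non-empty result."""
    remaining = list(candidates)
    if val_stride > 0:
        del remaining[::val_stride]  # remove the items reserved for validation
    assert remaining, "no candidates left: were any data files found?"
    return remaining


# With data present, nine out of every ten candidates stay in the training set.
print(len(split_training_candidates(range(100), 10)))

# With no data found, the list is empty from the start and the assert fires.
try:
    split_training_candidates([], 10)
except AssertionError as exc:
    print("AssertionError:", exc)
```

So an `AssertionError` with no message at that line most likely means the list was already empty before the slice was deleted, which points at data discovery rather than at the training code.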


LYK0520 commented Apr 18, 2024

I am running into the same problem.


evilgangsta commented Apr 18, 2024

> I am running into the same problem.

I was able to partially solve this by placing the code on the same hard drive as the dataset, but the HDD's limited speed is slowing down both caching and training.


LYK0520 commented Apr 18, 2024

I have extracted the dataset and placed it in the "data\part2\luna" directory, but I still cannot resolve the aforementioned issue. The error message that appears is as follows:

--> 171 assert self.candidateInfo_list
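
One way to check whether the loader can see the data at all is to count the `.mhd` files under the directory it searches. This is a rough sketch under an assumption: that `dsets.py` looks for files with a glob like `data-unversioned/part2/luna/subset*/*.mhd`, relative to the current working directory. That path is recalled from the repo and should be confirmed in your own copy; `count_mhd_files` is a hypothetical helper, not part of the repo.

```python
import glob
import os


def count_mhd_files(data_root):
    """Return {subset directory name: number of .mhd files} under data_root."""
    pattern = os.path.join(data_root, "subset*", "*.mhd")
    counts = {}
    for path in glob.glob(pattern):
        subset = os.path.basename(os.path.dirname(path))
        counts[subset] = counts.get(subset, 0) + 1
    return counts


if __name__ == "__main__":
    # ASSUMPTION: default data location relative to the working directory.
    # Check the glob pattern in your copy of dsets.py and adjust if it differs.
    root = "data-unversioned/part2/luna"
    print("working directory:", os.getcwd())
    print("looking in:", os.path.abspath(root))
    print("found:", count_mhd_files(root) or "no .mhd files")
```

If this prints "no .mhd files" when run from the directory where training is started, the data is not where the relative path points, for example because it sits in a differently named folder or on another drive.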


LYK0520 commented Apr 18, 2024

I solved this. You can run p2ch11.prepcache.LunaPrepCacheApp first to speed up training, and you can raise num_workers if the CPU is not fully utilized.
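
As a rough guide for picking a worker count, the sketch below derives a value from the number of logical CPUs. `suggest_num_workers` and its `reserve` parameter are hypothetical, and passing the result through a `--num-workers` command-line flag is an assumption about `training.py`; check the script's argument parser for the exact option name and default.

```python
import os


def suggest_num_workers(reserve=1):
    """Hypothetical helper: logical CPU count minus `reserve`, never below 1."""
    cpu_total = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, cpu_total - reserve)


print("logical CPUs:", os.cpu_count())
# ASSUMPTION: the training script accepts a flag such as --num-workers=<N>.
print("suggested worker count:", suggest_num_workers())
```

Leaving a core free for the main process is a common rule of thumb rather than a requirement; when the data sits on a slow disk, adding workers may not help because the bottleneck is I/O rather than CPU.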

@evilgangsta
Author

> I solved this. You can run p2ch11.prepcache.LunaPrepCacheApp first to speed up training, and you can raise num_workers if the CPU is not fully utilized.

The prepcache app only fills the cache so that the first epoch doesn't have to load the data. In my case, the assertion error occurred because dsets.py could not find the data on my external hard drive. I guess num_workers is equal to the number of CPU cores the machine has. Also, can you please let me know your caching time and machine specs? Mine takes awfully long with more than 5 subsets (more than 14 hours just to cache 7 subsets ;) ).
