ChunkParser fails with more than 1 worker in Python 3.7 #75

Open
Lutzy opened this issue May 6, 2019 · 0 comments

I've been experimenting with AdamWOptimizer in place of MomentumOptimizer on TensorFlow 1.13.1, but ChunkParser crashes whenever more than one worker is used, because in Python 3.7 Process isn't picklable.

Using 10 worker processes.
Traceback (most recent call last):
  File "train.py", line 159, in <module>
    main(argparser.parse_args())
  File "train.py", line 109, in main
    shuffle_size=shuffle_size, sample=SKIP, batch_size=ChunkParser.BATCH_SIZE)
  File "C:\Users\Ryan\PycharmProjects\lczero-training\tf\chunkparser.py", line 96, in __init__
    p.start()
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle weakref objects
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Ryan\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

I believe it's related to this: https://bugs.python.org/issue34034

I can work around it by forcing the number of workers to 1, but obviously that's not ideal.

I've never written any Python multiprocessing code before, but if the lc0 devs don't want to change this themselves, could you suggest where I should look to make it play nicely with 3.7? Thanks!
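
For anyone digging into this, here is a minimal diagnostic sketch (not a fix). The traceback shows `reduction.dump(process_obj, to_child)`, i.e. the spawn start method used on Windows pickles the Process object, along with its target and arguments, before handing it to the child. The helper name `unpicklable_attrs` and the `SimpleNamespace` stand-in are hypothetical and are not part of lczero-training; the stand-in only imitates an object that holds a weakref in one attribute.

```python
import pickle
import weakref
from types import SimpleNamespace


def unpicklable_attrs(obj):
    """Return names of instance attributes that pickle cannot serialize."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # pickle may raise TypeError, PicklingError, ...
            bad.append(name)
    return bad


# Stand-in object (NOT the real ChunkParser): one attribute holds a weakref.
_referent = set()
parser_like = SimpleNamespace(
    batch_size=256,
    readers=[],
    handle=weakref.ref(_referent),
)

print(unpicklable_attrs(parser_like))
```

If the Process target turns out to be a bound method (an assumption, since the issue does not show that line), the whole instance is pickled along with it, so calling `unpicklable_attrs(self)` just before `p.start()` may show which attribute is responsible. A common remedy in that situation is to make the target a module-level function and pass it only the picklable arguments it needs.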
