Broken pipe failures during sampling on MacOS #145

Open
fonnesbeck opened this issue Feb 13, 2024 · 2 comments
fonnesbeck commented Feb 13, 2024

Describe the bug

When sampling BART models on macOS, I frequently (but not always) get broken pipe errors toward the end of sampling runs, presumably due to multiprocessing.

PMB version: 0.5.7
PyMC version: 5.10.3
Python version: 3.10

Additional context

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py", line 122, in run
    self._start_loop()
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py", line 174, in _start_loop
    point, stats = self._step_method.step(self._point)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/compound.py", line 231, in step
    point, sts = method.step(point)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/arraystep.py", line 100, in step
    apoint, stats = self.astep(q)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/site-packages/pymc_bart/pgbart.py", line 293, in astep
    self.bart.all_trees.append(self.all_trees)
  File "<string>", line 2, in append
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/managers.py", line 817, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py", line 410, in _send_bytes
    self._send(buf)
  File "/Users/cfonnesbeck/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
"""

The above exception was the direct cause of the following exception:

BrokenPipeError                           Traceback (most recent call last)
File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py:122, in run()
    121     self._point = self._make_numpy_refs()
--> 122     self._start_loop()
    123 except KeyboardInterrupt:

File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/sampling/parallel.py:174, in _start_loop()
    173 try:
--> 174     point, stats = self._step_method.step(self._point)
    175 except SamplingError as e:

File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/compound.py:231, in step()
    230 for method in self.methods:
--> 231     point, sts = method.step(point)
    232     stats.extend(sts)

File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc/step_methods/arraystep.py:100, in step()
     98 q = DictToArrayBijection.map(var_dict)
--> 100 apoint, stats = self.astep(q)
    102 if not isinstance(apoint, RaveledVars):
    103     # We assume that the mapping has stayed the same

File ~/mambaforge/envs/pie/lib/python3.10/site-packages/pymc_bart/pgbart.py:293, in astep()
    292 if not self.tune:
--> 293     self.bart.all_trees.append(self.all_trees)
    295 stats = {"variable_inclusion": variable_inclusion, "tune": self.tune}

File <string>:2, in append()

File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/managers.py:817, in _callmethod()
    815     conn = self._tls.connection
--> 817 conn.send((self._id, methodname, args, kwds))
    818 kind, result = conn.recv()

File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py:211, in send()
    210 self._check_writable()
--> 211 self._send_bytes(_ForkingPickler.dumps(obj))

File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py:410, in _send_bytes()
    409     self._send(header)
--> 410     self._send(buf)
    411 else:
    412     # Issue #20540: concatenate before sending, to avoid delays due
    413     # to Nagle's algorithm on a TCP socket.
    414     # Also note we want to avoid sending a 0-length buffer separately,
    415     # to avoid "broken pipe" errors if the other end closed the pipe.

File ~/mambaforge/envs/pie/lib/python3.10/multiprocessing/connection.py:373, in _send()
    372 while True:
--> 373     n = write(self._handle, buf)
    374     remaining -= n

BrokenPipeError: [Errno 32] Broken pipe
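
For anyone reading the traceback: the last frames are plain `multiprocessing.connection` code, so the error itself just means that `conn.send` wrote to a connection whose other end had already closed. Here is a minimal sketch of that low-level failure in isolation. It is not a reproduction of the sampling bug; the function name and the tuple being sent are made up for illustration, and it only shows the class of error seen at the bottom of the traceback.

```python
from multiprocessing import Pipe


def send_after_peer_closed() -> str:
    """Send on a connection whose other end is already closed.

    Hypothetical helper, for illustration only.
    """
    sender, receiver = Pipe()
    receiver.close()  # simulate the peer going away before we write
    try:
        # Same shape as the (id, methodname, args, kwds) tuple in the traceback,
        # but the values here are placeholders.
        sender.send(("some-id", "append", ([1, 2, 3],), {}))
    except OSError as exc:  # BrokenPipeError is a subclass of OSError
        return type(exc).__name__
    finally:
        sender.close()
    return "no error"


if __name__ == "__main__":
    print(send_after_peer_closed())
```

On POSIX systems I would expect this to report `BrokenPipeError`, which suggests the question for this issue is why the receiving end is gone while the step method is still appending.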

fonnesbeck commented Feb 22, 2024

Note that this occurs even when running single chains, which is odd since there should be no multiprocessing going on. It appears that CompoundStep uses multiprocessing even when there is a single chain.
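
One way to narrow this down: the traceback shows `self.bart.all_trees.append(...)` going through `multiprocessing/managers.py`, which is what happens when the object is a manager proxy. A proxy forwards every method call to a separate server process regardless of how many chains are sampled. The check below is a small sketch for confirming that; `is_manager_proxy` is a hypothetical helper name, and applying it to `all_trees` assumes you have access to the step method's `bart` attribute as it appears in the traceback.

```python
from multiprocessing.managers import BaseProxy


def is_manager_proxy(obj) -> bool:
    """Return True if obj forwards its method calls to a manager server process."""
    return isinstance(obj, BaseProxy)


if __name__ == "__main__":
    # A plain list lives in the current process, so appending never touches a pipe.
    print(is_manager_proxy([]))
```

If this returns `True` for `all_trees` in a single-chain run, the pipe traffic would be coming from the proxy rather than from chain-level parallelism.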

@fonnesbeck

This also occurs with Python 3.11.
