[Bug]: cloudpickle appears to incorrectly unpickle cloned combiners #26209
Comments
Test failure looks like the following:
|
This test runs on the direct runner (both the bundle-based and the Portable/FnAPI direct runners), but the failure can also be reproduced with a TestDataflowRunner on the counterpart ValidatesRunner test (via something like: |
The failure does not happen if I manually modify the graph-rewriting code responsible for combiner lifting so that combiner lifting is disabled, see:
@AnandInguva also mentioned the error is not reproducible if with_fanout is disabled. with_fanout involves copying here: beam/sdks/python/apache_beam/transforms/core.py Line 2467 in 837733e
|
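To make the copying hypothesis above concrete, here is a minimal sketch of how a shallow copy followed by a single pickle round trip can leave two wrappers pointing at one combiner-like object. This is not Beam code and not a confirmed root cause: CountingFn and Wrapper are hypothetical stand-ins, and the sketch uses the standard-library pickle module rather than cloudpickle, on the assumption that both preserve object identity within one payload.

```python
import copy
import pickle


class CountingFn:
    """Hypothetical stand-in for a CombineFn that counts setup() calls."""

    def __init__(self):
        self.setup_calls = 0

    def setup(self):
        self.setup_calls += 1


class Wrapper:
    """Hypothetical stand-in for a transform that holds a combiner."""

    def __init__(self, fn):
        self.fn = fn


original = Wrapper(CountingFn())
clone = copy.copy(original)  # shallow copy: clone.fn is the same object as original.fn

# Pickled in one payload: pickle's memo keeps the shared reference shared.
a, b = pickle.loads(pickle.dumps((original, clone)))
shared_after_roundtrip = a.fn is b.fn
a.fn.setup()
b.fn.setup()
shared_setup_calls = a.fn.setup_calls  # both calls landed on one instance

# Pickled separately: each payload yields its own independent fn.
c = pickle.loads(pickle.dumps(original))
d = pickle.loads(pickle.dumps(clone))
independent_after_roundtrip = c.fn is not d.fn
```

If something like this is what happens in the copied subgraph, a deep copy of the combiner or a separate pickle payload per copy would give each clone its own state; whether that matches the actual fix is not established here.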
@tvalentyn I can pick up the investigation from here if you are not working on it. |
The setup calls for these two at bundle_processor share the same I will look into translations and see how these are getting pickled when |
Any update on this? |
@claudevdm will take a look at this. |
This issue should be fixed by #32598 |
What happened?
Combiner lifting and the combiner with_fanout utility copy portions of Beam's subgraph related to combiners. It appears that unpickling cloudpickle-pickled bytes encoding those subgraphs results in multiple CombineFns sharing the same state, which causes side effects in combiner setup and teardown.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components