-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Tupletag not found in this PCollectionTuple when running locally. #27753
Comments
After some refactoring I have run into some weird behavior.
The above prints,
I have looked into the TupleTagList source and I don't know why it would be behaving like this. By all means it should be adding to it's inner List with .and() and should be returning that list with getAll(). It should be behaving like the ArrayList I created for testing purposes. |
Okay one last paste dump to clarify the issue and hopefully help other people googling around for solutions as well. From what I have discovered, there is strange behavior when trying to use a TupleTagList.and(TupleTag) in a loop, (I have tested various different loops.) My bandaid solution is to create the TupleTags I need one at a time which sucks because now my code will have to change in multiple places if I need more TupleTags of my types in the future. Paste of code,
Prints,
Apologies for being hard to read. But the gist of it is, there are no tags on the TupleTagList, (Tags 2: [],) after looping over all the values of my MessageType enum. But the loop did iterate properly hence why the map is populated correctly, ([CURVE_INGESTION=Tag<com.example.App$6.:126#8ce970b71df42503>, TEST=Tag<com.example.App$6.:126#554dcd2b40b5f04b>].) I can still however add to the TupleTagList and print the contents as you can see when the line,
|
Nevermind, I'm an idiot, .and() returns an immutable list, to I needed to do, |
What happened?
First of all let me pre-face this with, I am still new to Apache Beam so apologies if I made a simple mistake or am misunderstanding something.
We are using Beam to create a generic data streaming pipeline that reads from a messaging service and process the data depending on configurations present in the message. I believe this means at some point the Transformations will need to return a PCollectionTuple to break the data into their relevant PCollections depending on the different data types. However when attempting to access the returned PCollections using their tuple-tags in the app only the default PCollection/TupleTag is present.
Main App
MessageParser
PipelineType
MessageParserFn
I suspect it's failing because I am attempting to interact with the PCollectionTuple outside of an execution of the pipeline transforms. But reading into the documentation here,
https://beam.apache.org/documentation/pipelines/design-your-pipeline/#a-single-transform-that-produces-multiple-outputs
the provided example seems to do exactly what I want to do and gets the PCollections post tagged output. What am I misunderstanding about multi-outputs? Is there a bug? Is the documentation wrong/unclear?
I can provide more source if required but I am hoping it will be something obvious I am doing wrong
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
The text was updated successfully, but these errors were encountered: