New Feature - Append+Dedup Event Stream to create a completely deduplicated event stream of data #46865
williamkaper
started this conversation in
Connector Ideas and Features
Replies: 1 comment
-
I don't have the skill to add a new option to the CDK to try this in Postgres, but if the team would be open to doing it, I would volunteer to help review and test it.
-
Today, there are two options for appending data. You can append everything (incremental or full), which creates a duplicate row on every update event, or you can use append-dedup, which uses the PKey to keep only one record per unique PKey: the latest one according to the cursor / Airbyte sync.
What is missing is a way to create a deduplicated event stream, where Airbyte keeps distinct PKey + cursor rows. Rows with the same PKey + cursor value would be deduplicated, keeping the one with the OLDEST airbyte_extracted_at date, which preserves a clean, noise-free event stream.
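To make the proposed semantics concrete, here is a minimal Python sketch of the dedup rule described above. It is not Airbyte code and does not use the CDK. The function name `dedup_event_stream` and the field names `id`, `updated_at`, and `status` are hypothetical, chosen only for illustration; `airbyte_extracted_at` is written as it appears in this post.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List, Tuple


def dedup_event_stream(
    rows: Iterable[Dict[str, Any]],
    pk_field: str = "id",                      # hypothetical primary key column
    cursor_field: str = "updated_at",          # hypothetical cursor column
    extracted_field: str = "airbyte_extracted_at",
) -> List[Dict[str, Any]]:
    """Keep one row per (PKey, cursor) pair: the one extracted earliest."""
    earliest: Dict[Tuple[Any, Any], Dict[str, Any]] = {}
    for row in rows:
        key = (row[pk_field], row[cursor_field])
        kept = earliest.get(key)
        # Replace only if this row was extracted strictly earlier,
        # so the first row seen wins on ties.
        if kept is None or row[extracted_field] < kept[extracted_field]:
            earliest[key] = row
    # Sort by PKey, then cursor, so each record's history reads in order.
    return sorted(earliest.values(), key=lambda r: (r[pk_field], r[cursor_field]))


rows = [
    {"id": 1, "updated_at": "2024-01-01", "status": "new",
     "airbyte_extracted_at": datetime(2024, 1, 2)},
    # Same PKey + cursor picked up again by a later sync: this is the noise.
    {"id": 1, "updated_at": "2024-01-01", "status": "new",
     "airbyte_extracted_at": datetime(2024, 1, 3)},
    # A real update: the cursor value changed, so it is a new event.
    {"id": 1, "updated_at": "2024-01-05", "status": "paid",
     "airbyte_extracted_at": datetime(2024, 1, 6)},
]

events = dedup_event_stream(rows)
```

With this input, plain append would keep all three rows and append-dedup would keep only the `paid` row; the sketch keeps two rows, one per distinct PKey + cursor value.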
PROS:
CONS: