Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip duplicate data during copy from one catalog to another #737

Merged
merged 7 commits into from
May 10, 2024

Conversation

pbeaucage
Copy link
Contributor

@pbeaucage pbeaucage commented May 7, 2024

Following up from comment on #735, this PR provides extremely rudimentary duplicate skipping during a tiled.client.sync.copy. I think it makes sense to put this behavior behind a flag skip_duplicates and leave it False by default. Duplicate handling is a little tricky for performance; I think the scope of .copy() is pretty clear that it doesn't update nodes in the destination catalog if the source has changed (though this, of course, might eventually be desirable behavior).

(edit per @danielballan's suggestion below: this flag is now a str called on_conflict with implemented options 'error' (default), 'warn', and 'skip')

Checklist

  • Add a Changelog entry
  • Add the ticket number which this PR closes to the comment section

@pbeaucage pbeaucage changed the title feature: Duplicate checking during copy from one catalog to another Skip duplicate data during copy from one catalog to another May 7, 2024
@danielballan
Copy link
Member

Just one suggestion to lay track for future extensions. Could we make the parameter on_conflict and make the value a str or StrEnum with values skip and error (default)? In the future we might add warn, overwrite, etc. More discussion is needed, and this can be put off for a future PR, but if we choose an enum instead of a Boolean flag we leave space for extensibility.

@pbeaucage pbeaucage marked this pull request as ready for review May 8, 2024 15:53
@danielballan danielballan requested a review from genematx May 8, 2024 19:20
@danielballan danielballan merged commit 307524d into bluesky:main May 10, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants