Might want to use package_patch instead of package_update #3

Open
canon-cmre-kym-eden opened this issue Jul 26, 2024 · 1 comment

Comments

@canon-cmre-kym-eden

I unexpectedly lost all resources from a project while building an ingestion pipeline that could update the dataset metadata. It turns out there are patch variants of the package and resource update actions: https://docs.ckan.org/en/2.9/api/#ckan.logic.action.update.package_update
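To illustrate the difference, here is a minimal sketch that models the two behaviors with plain dictionaries. It does not call CKAN; `simulate_update` and `simulate_patch` are hypothetical helpers, and the model assumes that an update replaces the stored dataset with the submitted payload while a patch overlays the payload's top-level keys on the existing dataset. The real actions also handle validation, ids, and authorization, which are left out here.

```python
from copy import deepcopy


def simulate_update(existing: dict, payload: dict) -> dict:
    """Simplified model of an update: the payload replaces the dataset wholesale."""
    return deepcopy(payload)


def simulate_patch(existing: dict, payload: dict) -> dict:
    """Simplified model of a patch: payload keys overwrite, absent keys are kept."""
    merged = deepcopy(existing)
    merged.update(deepcopy(payload))
    return merged


# A dataset that already has a resource attached.
existing = {
    "name": "my-dataset",
    "notes": "old description",
    "resources": [{"name": "data.csv"}],
}

# An ingestion payload that only carries metadata, no resources.
payload = {"name": "my-dataset", "notes": "new description"}

updated = simulate_update(existing, payload)
patched = simulate_patch(existing, payload)

print("resources" in updated)  # the update model drops the resources key
print(patched["resources"])    # the patch model keeps the existing resources
```

Because this model of a patch merges only top-level keys, a payload that does include a `resources` key would still replace the whole list, so an ingestion that should preserve existing resources would have to leave that key out.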

It might be worth changing the update to be non-destructive, or introducing a multi-state (enum) parameter:

action = "package_" + (
    "update" if pkg and self.options.get("update_existing") else "create"
)

action = "resource_" + (
    "update"
    if prefer_update and self.options.get("update_existing")
    else "create"
)

Thanks

@smotornyuk
Member

Yep, great idea. I always thought about ingested packages as something uncontrolled that comes from the outside world.

But a combination of local data with ingested information is definitely a common use case, so patch instead of update is a must-have feature, at least:

  • for your scenario, when ingestion should update specific fields and keep anything that already exists in the dataset and is missing from the ingestion info
  • for the situation when a user adds resources to an ingested package and doesn't want these resources to be removed after re-ingestion.

The current behavior, an unconditional reset of the dataset to some specific state, is also sensible in certain scenarios (in my projects, at least :)), so I'll keep this option.

I'll probably add the following change for now:

action = "package_" + (
    "update" if pkg and self.options.get("update_existing") else "create"
)
# will be changed to =>
update_strategy = self.options.get("update_strategy", "update")

action = "package_" + (
    update_strategy if pkg and self.options.get("update_existing") else "create"
)

This keeps it compatible with existing code: everything remains unchanged by default, and you can set the update_strategy: "patch" record option to use package/resource patch.
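As a rough sketch, the same selection could be written as a standalone function so that it is easy to test outside the extension. The name `choose_package_action`, the `ALLOWED_STRATEGIES` set, and the validation step are assumptions added for illustration; the snippet above does not validate the option value.

```python
from typing import Any, Mapping, Optional

# Assumed set of accepted values; only "update" and "patch" are discussed here.
ALLOWED_STRATEGIES = {"update", "patch"}


def choose_package_action(
    pkg: Optional[Mapping[str, Any]], options: Mapping[str, Any]
) -> str:
    """Return the package action name for a record.

    Defaults to "update" so existing configurations behave as before.
    """
    strategy = options.get("update_strategy", "update")
    if strategy not in ALLOWED_STRATEGIES:
        # Hypothetical guard: reject typos instead of building an unknown action.
        raise ValueError(f"Unsupported update_strategy: {strategy!r}")

    if pkg and options.get("update_existing"):
        return "package_" + strategy
    return "package_create"


existing_pkg = {"id": "abc"}

default_action = choose_package_action(existing_pkg, {"update_existing": True})
patch_action = choose_package_action(
    existing_pkg, {"update_existing": True, "update_strategy": "patch"}
)
new_action = choose_package_action(None, {"update_existing": True})
```

With these inputs, the default configuration still resolves to the update action, the patch strategy applies only when the package already exists and `update_existing` is set, and a missing package always falls back to create.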

For v2 of the extension, I'll think of a better interface.
