⛽ Harvesting 2.0 Planning + CKAN Harvesting Documentation #4411

Closed
4 of 5 tasks
nickumia-reisys opened this issue Aug 2, 2023 · 5 comments

Comments

@nickumia-reisys
Contributor

nickumia-reisys commented Aug 2, 2023

User Story

In order to get an idea of where harvesting should go, the data.gov Harvesting team wants to track everything that's happened related to harvesting in our codebases and formalize that information into a coherent review document that can: (1) improve maintainability of CKAN and (2) drive harvesting 2.0 planning.

Acceptance Criteria

  • GIVEN research has been completed
    WHEN I look at this ticket
    THEN there is documentation about harvesting.

Background

See proposal document.

Security Considerations (required)

Not yet known. Deferred to future child tickets.

Sketch

  • Fill out documentation.
  • Discuss key/relevant/required features/config/processes.
  • Revise documentation.
  • Create a formal Wiki or README somewhere.
@Jin-Sun-tts
Contributor

Jin-Sun-tts commented Aug 4, 2023

CKAN plugin documentation:

  • ckanext-datagovtheme - done
  • ckanext-geodatagov - done
  • ckanext-harvest - done
  • ckanext-spatial - done
  • ckanext-dcat - done

@nickumia-reisys
Contributor Author

nickumia-reisys commented Aug 8, 2023

The work for this ticket is spread between three documents:

  1. Proposal (mentioned in ticket)
  2. Current System Details (mentioned in ticket and @Jin-Sun-tts's comment)
  3. Harvesting Requirements

@Jin-Sun-tts Jin-Sun-tts moved this from 🏗 In Progress [8] to 📟 Sprint Backlog [7] in data.gov team board Aug 11, 2023
@Jin-Sun-tts Jin-Sun-tts moved this from 📟 Sprint Backlog [7] to 🏗 In Progress [8] in data.gov team board Aug 11, 2023
@jbrown-xentity
Contributor

I left a bunch of comments in the harvesting requirements doc. Good work overall! It's very detailed, though not always equally distributed (some things might take minutes, others weeks or months). @nickumia-reisys

@nickumia-reisys
Contributor Author

nickumia-reisys commented Aug 25, 2023

This issue can soon be safely marked as completed. A big kudos to @Jin-Sun-tts for staying focused and tunneling through the complete mess that CKAN is. There is a lot of ongoing work, and most of this ticket was exploratory: working out how to begin the long-term documentation and planning for the future of Data.gov harvesting. Key points to highlight:

  • A living document that compiles all of the existing CKAN extensions/plugins was given a complete first pass.
    • While thorough and exhaustive, this document is a bit unwieldy to be a primary reference.
    • An aggregation of the code into flowcharts and/or data transformations and/or pipelines should be completed (see below).
    • As more detail is needed from our "old" system, this document should be updated accordingly.
  • A first pass was done at top-level harvesting requirements, pulling in information, experience, and inspiration from: (1) Existing Code + Documentation, (2) Historical Data.gov Harvesting Tickets, (3) Open Data Laws, (4) Previous Planning Work
    • There are a ton of open questions surrounding the specific details that the Data.gov team will accept as its baseline. These questions should be answered in multiple upcoming meetings. Before these meetings, the following should occur:
  • Once we complete the discussion about top-level harvesting requirements, I believe we'll be in a good position to properly plan the next steps in this saga ♠️

@nickumia-reisys nickumia-reisys moved this from 🏗 In Progress [8] to 👀 Needs Review [2] in data.gov team board Aug 26, 2023
@nickumia-reisys nickumia-reisys moved this from 👀 Needs Review [2] to ✔ Done in data.gov team board Sep 28, 2023
@nickumia-reisys
Contributor Author

Having completed #4433, we have a more complete picture of what needs to be done and how to act upon it. We still have to finish the initial review of the harvesting requirements and answer open design questions; however, that is an ongoing discussion and will be iterative as we continue.

@hkdctol hkdctol added the H2.0/Harvest-General General Harvesting 2.0 Issues label Sep 29, 2023
@hkdctol hkdctol moved this from ✔ Done to 🗄 Closed in data.gov team board Sep 29, 2023