Git scrape Product and Group information for all agencies #18

Closed
8 tasks
thekaveman opened this issue Feb 22, 2024 · 4 comments · Fixed by #19
Assignees
Labels
actions Related to GitHub Actions workflows cli New command line interface feature implementation or refactor

Comments

@thekaveman
Member

thekaveman commented Feb 22, 2024

We would like to understand changes over time to the Littlepay Product and Group details for each agency we work with. This came up recently when we learned about a change to a Product ID after that Product was duplicated to create a new Product -- we didn't expect this to create TWO new Products 🤷

So we'd like to capture the output of our littlepay groups and littlepay products CLI commands, in CSV files stored in this repository, to be able to assess those changes over time more easily.

Acceptance Criteria

  • For each agency in production (MST, SacRT, SBMTD as of opening this issue)
  • On some schedule (at least once per day)
  • Run littlepay groups and littlepay products
  • Store the data in a corresponding CSV file in this repository: data/[agency slug]_[groups|products].csv
  • With column headers matching the fields in the corresponding GroupResponse and ProductResponse objects
  • Also store the output of littlepay groups products in a CSV file in this repository: data/[agency slug]_linked_groups_products.csv
  • With column headers group_id, product_id
  • A new commit is made each time this process runs, for each agency, when any data file is changed
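
The CSV-related criteria above could be sketched roughly as follows. This is a minimal illustration, not the implementation from the linked pull request: the `GroupResponse` dataclass here is a hypothetical stand-in with made-up fields, `csv_path` and `write_csv` are illustrative helper names, and `mst` is an assumed agency slug. Rows are sorted so that a reordered API response does not produce a spurious diff, and the function reports whether the file changed so the caller can decide whether a commit is needed.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from pathlib import Path


# Hypothetical stand-in for the real GroupResponse; field names are assumptions.
@dataclass
class GroupResponse:
    id: str
    label: str


def csv_path(data_dir: Path, agency_slug: str, kind: str) -> Path:
    """Build the data/[agency slug]_[kind].csv path from the criteria."""
    return data_dir / f"{agency_slug}_{kind}.csv"


def write_csv(path: Path, record_type: type, records: list) -> bool:
    """Write records as CSV with headers taken from the dataclass fields.

    Returns True when the file content changed, False when it is identical.
    """
    header = [f.name for f in fields(record_type)]
    rows = [asdict(record) for record in records]
    # Sort rows so the output is stable regardless of input order.
    rows.sort(key=lambda row: tuple(str(row[name]) for name in header))

    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    new_content = buffer.getvalue()

    old_content = path.read_text(encoding="utf-8") if path.exists() else None
    if new_content == old_content:
        return False  # nothing changed, so no commit is needed

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_content, encoding="utf-8")
    return True
```

A caller would run this once per agency and per data set, and only stage and commit the files for which `write_csv` returned True.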

Additional context

Sort of related to the work in cal-itp/benefits#1889; this can be implemented in a workflow in a similar fashion.

More about the git scraping technique.

@thekaveman thekaveman added the actions Related to GitHub Actions workflows label Feb 22, 2024
@thekaveman
Member Author

cc @indexing @o-ram, here's the ticket write-up for collecting Littlepay Group and Product info over time in this littlepay repository.

Question for you both: how frequently and when should this process run?

@thekaveman thekaveman self-assigned this Feb 23, 2024
@thekaveman thekaveman added the cli New command line interface feature implementation or refactor label Feb 23, 2024
@thekaveman
Member Author

Going with once per day in the evening (roughly 10-11pm Pacific) for now -- this will capture any changes made throughout the day, and we'll note the timestamp of when we capture the changes to hopefully make later analysis a little easier.
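
One detail worth checking with a schedule like this: GitHub Actions evaluates `schedule` cron expressions in UTC, so a single fixed cron time shifts by an hour in Pacific time across daylight saving changes. The sketch below assumes a hypothetical cron of `0 6 * * *` (06:00 UTC daily), which is not taken from the actual workflow, and shows the Pacific hour it lands on in winter and in summer. The helper name `pacific_hour` is illustrative.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_hour(utc_hour: int, year: int, month: int, day: int) -> int:
    """Return the Pacific wall-clock hour for a given UTC hour on a date."""
    utc_time = datetime(year, month, day, utc_hour, tzinfo=timezone.utc)
    return utc_time.astimezone(PACIFIC).hour


# Assumed schedule: cron "0 6 * * *", i.e. 06:00 UTC every day.
winter = pacific_hour(6, 2024, 1, 15)  # Pacific Standard Time (UTC-8)
summer = pacific_hour(6, 2024, 7, 15)  # Pacific Daylight Time (UTC-7)
print(winter, summer)  # prints "22 23"
```

Under that assumption, one cron entry covers the "roughly 10-11pm Pacific" window year-round without needing seasonal adjustments.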

@indexing
Member

@thekaveman The time and frequency sound good to me; seems like a good match for the infrequency with which products change.

@o-ram
Member

o-ram commented Feb 23, 2024

@thekaveman also sounds good to me!

Projects
Status: Done

3 participants