Update ID scraper to use state's new URL #644

Closed
chriszs opened this issue Mar 27, 2024 · 1 comment

Comments

@chriszs
Contributor

chriszs commented Mar 27, 2024

Idaho moved its WARN PDF from https://www.labor.idaho.gov/dnn/Portals/0/Publications/WARNNotice.pdf to https://www.labor.idaho.gov/wp-content/uploads/publications/WARNNotice.pdf. The scraper follows the redirect from the old URL transparently, so nothing is broken, but it seems like good policy to update the URL to reflect the current location.

@chriszs
Contributor Author

chriszs commented Mar 27, 2024

One note here: the state's page linking to this file actually links to https://www.labor.idaho.gov/warnnotice/, which redirects to the PDF, with a parenthetical note that says "link is updated as notices are received." That reads to me like the file is updated continuously, but it could also mean they change the link on a semi-regular basis. So we have a few options:

  1. retrieve the file at the current URL of the PDF
  2. retain the current behavior and rely on the redirect from the file's old URL
  3. rely on the /warnnotice/ redirect
  4. scrape the HTML page to know which URL to check

I think it's probably a crapshoot, but the simplest thing to do to improve the situation might be option 1.
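
For illustration, here is a rough sketch of how options 1 through 3 could be combined into an ordered fallback. It is not the code in the referenced commit or in warn-scraper itself: the function names (`fetch_with_urllib`, `fetch_first_pdf`) are hypothetical, and the `%PDF` header check is an assumption about what a valid response looks like. Only the three URLs come from this issue.

```python
from typing import Callable, Iterable, Optional, Tuple
from urllib.request import urlopen

# URLs quoted in this issue, in the order of options 1, 3 and 2.
CURRENT_PDF_URL = "https://www.labor.idaho.gov/wp-content/uploads/publications/WARNNotice.pdf"
REDIRECT_URL = "https://www.labor.idaho.gov/warnnotice/"
OLD_PDF_URL = "https://www.labor.idaho.gov/dnn/Portals/0/Publications/WARNNotice.pdf"

# A fetcher takes a URL and returns (final URL after redirects, response body).
Fetcher = Callable[[str], Tuple[str, bytes]]


def fetch_with_urllib(url: str) -> Tuple[str, bytes]:
    """Fetch a URL with the standard library, following redirects."""
    with urlopen(url, timeout=30) as response:
        return response.geturl(), response.read()


def fetch_first_pdf(
    candidates: Iterable[str],
    fetch: Fetcher = fetch_with_urllib,
) -> Optional[Tuple[str, bytes]]:
    """Return (final_url, body) for the first candidate that yields a PDF.

    Hypothetical helper: a response counts as a PDF if its body starts
    with the b"%PDF" magic bytes. Returns None if no candidate works.
    """
    for url in candidates:
        try:
            final_url, body = fetch(url)
        except OSError:
            # urllib's URLError and HTTPError both subclass OSError.
            continue
        if body.startswith(b"%PDF"):
            return final_url, body
    return None
```

Calling `fetch_first_pdf([CURRENT_PDF_URL, REDIRECT_URL, OLD_PDF_URL])` would try the current location first and only fall back to the redirects if it stops serving a PDF. The returned final URL makes it possible to log when the state has moved the file again.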

chriszs added a commit to chriszs/warn-scraper that referenced this issue Mar 27, 2024
@chriszs changed the title from `"Fix" ID by updating to state's new URL` to `Update ID to state's new URL` on Jun 18, 2024
@chriszs changed the title from `Update ID to state's new URL` to `Update ID scraper to use state's new URL` on Jun 18, 2024
@chriszs chriszs closed this as completed Nov 8, 2024