Job Center data quality script(s) #178

Open
zstumgoren opened this issue Jul 4, 2021 · 1 comment
Labels
data quality: Bullet-proofing the data

Comments

zstumgoren (Member) commented Jul 4, 2021

See the Job Center docs for background on the scraping strategy and issues described below.

After cutting over to the Job Center site class for AZ, DE, KS and OK (#126), we should create one or more scripts, run on an automated schedule, that:

  1. Check for records in each state that are missing Notice Date values (these records are not captured by the new date-based scraping strategy)
  2. Check for the addition of historical data from years prior to the hard-coded stop_year in each state's scrape function
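
The two checks above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `notice_date` column name, the ISO date format, the helper names and the `stop_year` value passed in are all assumptions. It also assumes the rows come from some source other than the date-based scrape (for example, an unfiltered query), since that scrape by design would not return either kind of record.

```python
import csv
from datetime import datetime
from io import StringIO

# Hypothetical column name and date format; real Job Center output may differ.
DATE_COLUMN = "notice_date"
DATE_FORMAT = "%Y-%m-%d"


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(StringIO(csv_text)))


def find_missing_notice_dates(rows):
    """Check 1: return rows whose notice date is absent or blank."""
    return [row for row in rows if not (row.get(DATE_COLUMN) or "").strip()]


def find_rows_before_stop_year(rows, stop_year):
    """Check 2: return rows whose notice date falls in a year before stop_year."""
    flagged = []
    for row in rows:
        value = (row.get(DATE_COLUMN) or "").strip()
        if not value:
            continue  # Blank dates are reported by the other check.
        try:
            notice_date = datetime.strptime(value, DATE_FORMAT)
        except ValueError:
            continue  # Unparseable dates would need a separate check.
        if notice_date.year < stop_year:
            flagged.append(row)
    return flagged


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = "company,notice_date\nExample Co A,2021-03-01\nExample Co B,\n"
    rows = load_rows(sample)
    print(find_missing_notice_dates(rows))
    print(find_rows_before_stop_year(rows, stop_year=2000))
```

A scheduled job could run both functions per state and fail, or open an issue, whenever either list is non-empty.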

An example of a record without a Notice Date is Eaton in Kansas:

zstumgoren added the "data quality" (Bullet-proofing the data) label Jul 4, 2021
palewire modified the milestones: Strengthen the Job Center scrape, Luis and Maria's Fix-It Shop, Scraper repair shop, Streamlining the system Jan 20, 2022
chriszs (Contributor) commented Mar 29, 2024

Possibly related #598
