feat(lactapp_1): dynamic backscraper #1210

Merged 3 commits on Oct 21, 2024

@grossir grossir commented Oct 16, 2024

Will help solve #1194

- create utils method `backscrape_over_paginated_results`, abstracted from `nd`
- refactor `nd` to use it
- refactor `lactapp_1` to use it

grossir commented Oct 16, 2024

You can test that both are still working by doing:

python sample_caller.py -c juriscraper.opinions.united_states.state.lactapp_1 --backscrape --backscrape-start=2024/05/23 --backscrape-end=2024/06/23 -v
python sample_caller.py -c juriscraper.opinions.united_states.state.nd --backscrape --backscrape-start=2024/05/23 --backscrape-end=2024/06/23 -v

@grossir grossir requested a review from flooie October 16, 2024 22:08

flooie commented Oct 18, 2024

why not just use each date instead of paginating?

grossir commented Oct 18, 2024

Using each date means one request for every day in the interval, since we don't know exactly when an opinion was published; a single pagination request covers many days. I think that, except for very small intervals, pagination will make fewer requests.
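To make the trade-off concrete, here is a minimal sketch that compares the two request counts on made-up data. Everything in it is an assumption for illustration: `fetch_page`, `backscrape_by_pagination`, `requests_by_date`, the page size, and the publication cadence are hypothetical and are not the actual `backscrape_over_paginated_results` implementation or the real behavior of either court site.

```python
from datetime import date, timedelta

# Hypothetical data: one opinion every 3 days, listed newest first,
# 10 results per page. Real sites will differ.
PAGE_SIZE = 10
NEWEST = date(2024, 10, 16)
ALL_DATES = [NEWEST - timedelta(days=3 * i) for i in range(100)]


def fetch_page(page_number):
    """Stand-in for one HTTP request to a paginated results page (1-indexed)."""
    first = (page_number - 1) * PAGE_SIZE
    return ALL_DATES[first : first + PAGE_SIZE]


def backscrape_by_pagination(start, end):
    """Walk pages newest-first and stop once a page reaches past `start`."""
    hits, requests, page = [], 0, 1
    while True:
        rows = fetch_page(page)
        requests += 1
        if not rows:  # ran out of pages
            break
        hits.extend(d for d in rows if start <= d <= end)
        if rows[-1] < start:  # oldest row on this page predates the interval
            break
        page += 1
    return hits, requests


def requests_by_date(start, end):
    """One request per calendar day in the inclusive interval."""
    return (end - start).days + 1


start, end = date(2024, 5, 23), date(2024, 6, 23)
hits, paginated_requests = backscrape_by_pagination(start, end)
daily_requests = requests_by_date(start, end)
print(paginated_requests, daily_requests, len(hits))
```

In this sketch the paginated walk also has to pass through the pages that are newer than the interval, so its cost depends on how far back the interval sits and on the page size, while the per-date cost depends only on the interval length.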

@flooie flooie left a comment

This looks good to me.

@flooie flooie merged commit 96acad0 into main Oct 21, 2024
12 checks passed
@flooie flooie deleted the lactapp_1_dynamic_backscraper branch October 21, 2024 14:04