WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 429 status code. #137

Open
Voyager3D opened this issue Jan 7, 2024 · 2 comments

@Voyager3D

I'm no coder and I've not scraped websites before.
But I'm assuming this error code means the website is refusing my requests because I'm scraping it too much?

I was able to output a file from this website after it scanned 150 pages. That worked perfectly, but somewhere after 150 pages the site stops cooperating and I get this error:
WARN PlaywrightCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 429 status code.

Not sure if I'm on the right track with that or not, but any advice would be appreciated!

Cheers!

@Cougart

Cougart commented Jan 16, 2024

Hi,
I'm having the same issue with several websites.
Is it possible to add a sleep option between two calls?
I don't see any other possibilities.
Thanks a lot!
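
For reference, a "sleep between two calls" can be sketched in a few lines. This is only an illustration of the idea, not something this project ships: `waitBeforeNextCall`, `sleep`, and `runWithDelay` are hypothetical names, and the 500 ms gap in the usage note is an arbitrary example value. Crawlee itself has the crawler options `maxConcurrency` and `maxRequestsPerMinute`, which slow a crawl down in a similar way; whether this project's config exposes them is not stated in this thread.

```typescript
// Hypothetical helper, not part of Crawlee or this project.
// Returns how many milliseconds to wait so that consecutive calls
// start at least `minGapMs` apart.
function waitBeforeNextCall(
  lastStartMs: number | null,
  nowMs: number,
  minGapMs: number,
): number {
  if (lastStartMs === null) return 0; // first call: nothing to wait for
  const elapsed = nowMs - lastStartMs;
  return Math.max(0, minGapMs - elapsed);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task` once per item, one at a time, pausing between calls.
async function runWithDelay<T, R>(
  items: T[],
  task: (item: T) => Promise<R>,
  minGapMs: number,
): Promise<R[]> {
  const results: R[] = [];
  let lastStart: number | null = null;
  for (const item of items) {
    const wait = waitBeforeNextCall(lastStart, Date.now(), minGapMs);
    if (wait > 0) await sleep(wait);
    lastStart = Date.now();
    results.push(await task(item));
  }
  return results;
}
```

Calling `runWithDelay(urls, fetchPage, 500)` with your own `fetchPage` function would start at most one request every half second. A fixed gap is the simplest approach; it does not react to what the server actually replies.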

@SimonGodefroid

SimonGodefroid commented Feb 12, 2024

429 is the "Too Many Requests" status code, so you may have been throttled by the server.

In other words, to stop clients from sending too many requests, the server blocks requests from a given IP address, temporarily or permanently, once that address has sent more than a certain number of requests. I can't say for sure that this is what happened in your case, but it's the most likely explanation.
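
If throttling is the cause, the usual client-side response is to wait before retrying: use the `Retry-After` header when the server sends one, and otherwise back off exponentially. Here is a minimal sketch of that delay calculation. `retryDelayMs` is a hypothetical helper, not a Crawlee or project API, and the 1 s base and 60 s cap are assumed defaults.

```typescript
// Hypothetical helper: decide how long to wait before retrying after a 429.
// `retryAfter` is the raw Retry-After header value, or null if none was sent.
function retryDelayMs(
  attempt: number, // 0 for the first retry, 1 for the second, ...
  retryAfter: string | null,
  baseMs: number = 1000, // assumed default
  maxMs: number = 60000, // assumed default cap for the backoff path
  nowMs: number = Date.now(),
): number {
  if (retryAfter !== null) {
    const trimmed = retryAfter.trim();
    // Form 1: a delay in whole seconds, e.g. "120".
    if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
    // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    const when = Date.parse(trimmed);
    if (!isNaN(when)) return Math.max(0, when - nowMs);
  }
  // No usable header: exponential backoff (base, 2x, 4x, ...), capped at maxMs.
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}
```

With `fetch`, the header value would come from `response.headers.get("retry-after")`. The sketch honors the server's hint as-is and only caps the fallback backoff; if the block is permanent rather than temporary, no amount of waiting will help.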
