Running different requests with different crawlers? #573

Answered by janbuchar
honzajavorek asked this question in Q&A

Well, the best approach to this would be to have separate RequestQueue instances for the separate crawlers and to add requests directly to the right crawler's queue from your request handlers (see the sketch after this list). There are, however, some challenges:

  • As of now, you need to use named queues, because RequestQueue.open() always resolves to the same unnamed default queue. This may or may not be a problem if you're running on Apify. Locally, you'll probably need to purge the named queues manually before each run.
  • Just waiting for both crawlers with something like await asyncio.gather(crawler_1.run(), crawler_2.run()) also won't work right off the bat - I assume that only one of your crawlers will have some start URLs a…
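
For illustration, here's a minimal sketch of that setup in Crawlee for Python. This is an assumption-laden sketch, not the answer's exact code: the import paths and the parameter for passing a queue to a crawler (`request_manager` here) differ between Crawlee versions, and the `a.detail` selector and URLs are purely hypothetical.

```python
import asyncio
from urllib.parse import urljoin

from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
from crawlee.storages import RequestQueue


async def main() -> None:
    # Named queues, so each crawler gets its own storage instead of the
    # shared unnamed default. Locally you may need to purge these between
    # runs, e.g. with `await listing_queue.drop()`.
    listing_queue = await RequestQueue.open(name='listing')
    detail_queue = await RequestQueue.open(name='detail')

    # The parameter name may differ in your Crawlee version
    # (e.g. `request_manager` vs. an older equivalent) - check the docs.
    listing_crawler = BeautifulSoupCrawler(request_manager=listing_queue)
    detail_crawler = BeautifulSoupCrawler(request_manager=detail_queue)

    @listing_crawler.router.default_handler
    async def handle_listing(context: BeautifulSoupCrawlingContext) -> None:
        # Feed the *other* crawler by writing into its queue directly.
        # 'a.detail' is a made-up selector for this sketch.
        for link in context.soup.select('a.detail'):
            url = urljoin(context.request.url, link['href'])
            await detail_queue.add_request(url)

    @detail_crawler.router.default_handler
    async def handle_detail(context: BeautifulSoupCrawlingContext) -> None:
        await context.push_data({'url': context.request.url})

    # Caveat from the answer: with a plain gather, the detail crawler may
    # finish immediately because its queue is still empty at startup.
    await asyncio.gather(
        listing_crawler.run(['https://example.com/listings']),
        detail_crawler.run(),
    )


if __name__ == '__main__':
    asyncio.run(main())
```

Depending on your Crawlee version, a `keep_alive` crawler option (if available in your release) may help keep the second crawler running while its queue is still empty; otherwise you'll need some other way to coordinate the two runs.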
