Skip to content

How to run crawler in infinite loop? #448

Answered by vdusek
marisancans asked this question in Q&A
Discussion options

You must be logged in to vote

You can run crawler.run() repeatedly in an infinite loop.

import asyncio

from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext


async def main() -> None:
    crawler = BeautifulSoupCrawler()

    @crawler.router.default_handler
    async def request_handler(context: BeautifulSoupCrawlingContext) -> None:
        context.log.info(f'Processing {context.request.url} ...')

        data = {
            'url': context.request.url,
            'title': context.soup.title.string if context.soup.title else None,
            'h1s': [h1.text for h1 in context.soup.find_all('h1')],
            'h2s': [h2.text for h2 in context.soup.find_all('h2')],
            

Replies: 1 comment

Comment options

You must be logged in to vote
0 replies
Answer selected by vdusek
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet
2 participants
Converted from issue

This discussion was converted from issue #438 on August 20, 2024 16:38.