apify · B4nan · May 17, 2024
diff --git a/docs/examples/crawl_sitemap.mdx b/docs/examples/crawl_sitemap.mdx
@@ -12,7 +12,7 @@ import CheerioSource from '!!raw-loader!roa-loader!./crawl_sitemap_cheerio.ts';
 import PuppeteerSource from '!!raw-loader!roa-loader!./crawl_sitemap_puppeteer.ts';
 import PlaywrightSource from '!!raw-loader!roa-loader!./crawl_sitemap_playwright.ts';
 
-This example downloads and crawls the URLs from a sitemap, by using the <ApiLink to="utils/class/Sitemap">`Sitemap`</ApiLink> utility class provided by the <ApiLink to="utils">`@crawlee/utils`</ApiLink> module.
+This example builds a sitemap crawler that downloads and crawls the URLs from a sitemap by using the <ApiLink to="utils/class/Sitemap">`Sitemap`</ApiLink> utility class provided by the <ApiLink to="utils">`@crawlee/utils`</ApiLink> module.
 
 <Tabs groupId="crawler-type">
 

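To illustrate what "downloading the URLs from a sitemap" amounts to, here is a minimal standalone sketch. It does not use the Crawlee `Sitemap` class from the docs above. `extractSitemapUrls` and the sample XML are hypothetical names invented for this illustration, and the regex approach assumes a simple, well-formed `<urlset>` document without CDATA sections or nested sitemap indexes.

```typescript
// Hypothetical helper, not part of Crawlee: collects <loc> values from sitemap XML.
function extractSitemapUrls(xml: string): string[] {
    const urls: string[] = [];
    // Match the text content of every <loc>...</loc> element, ignoring padding whitespace.
    const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
    let match: RegExpExecArray | null;
    while ((match = locPattern.exec(xml)) !== null) {
        // Sitemaps escape "&" as "&amp;" inside URLs, so undo that here.
        urls.push(match[1].replace(/&amp;/g, '&'));
    }
    return urls;
}

// Sample input, made up for this sketch.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/search?q=a&amp;page=2 </loc></url>
</urlset>`;

const sitemapUrls = extractSitemapUrls(sampleSitemap);
console.log(sitemapUrls);
```

In the actual example the `Sitemap` utility handles this step. The sketch only shows the kind of URL list a sitemap crawler ends up enqueueing.
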
diff --git a/docs/examples/crawler-plugins/index.mdx b/docs/examples/crawler-plugins/index.mdx
@@ -13,7 +13,7 @@ import PlaywrightExtraSource from '!!raw-loader!roa-loader!./playwright-extra.ts
 [`puppeteer-extra`](https://www.npmjs.com/package/puppeteer-extra) and [`playwright-extra`](https://www.npmjs.com/package/playwright-extra) are community-built
 libraries that bring in a plugin system to enhance the usage of [`puppeteer`](https://www.npmjs.com/package/puppeteer) and
 [`playwright`](https://www.npmjs.com/package/playwright) respectively (bringing in extra functionality, like improving stealth for
-example by using the [`puppeteer-extra-plugin-stealth`](https://www.npmjs.com/package/puppeteer-extra-plugin-stealth) plugin).
+example by using the Puppeteer Stealth [(`puppeteer-extra-plugin-stealth`)](https://www.npmjs.com/package/puppeteer-extra-plugin-stealth) plugin).
 
 :::tip Available plugins
 
@@ -23,15 +23,15 @@ For [`playwright`](https://www.npmjs.com/package/playwright), please see [`playw
 
 :::
 
-In this example, we'll show you how to use the [`puppeteer-extra-plugin-stealth`](https://www.npmjs.com/package/puppeteer-extra-plugin-stealth) plugin
+In this example, we'll show you how to use the Puppeteer Stealth [(`puppeteer-extra-plugin-stealth`)](https://www.npmjs.com/package/puppeteer-extra-plugin-stealth) plugin
 to help you avoid bot detections when crawling your target website.
 
 <Tabs>
 <TabItem value="puppeteer" label="Puppeteer & puppeteer-extra" default>
 
 :::info Before you begin
 
-Make sure you've installed the `puppeteer-extra` and `puppeteer-extra-plugin-stealth` packages via your preferred package manager
+Make sure you've installed the Puppeteer Extra (`puppeteer-extra`) and Puppeteer Stealth plugin (`puppeteer-extra-plugin-stealth`) packages via your preferred package manager:
 
 ```bash
 npm install puppeteer-extra puppeteer-extra-plugin-stealth

diff --git a/docs/examples/http_crawler.mdx b/docs/examples/http_crawler.mdx
@@ -7,7 +7,7 @@ import RunnableCodeBlock from '@site/src/components/RunnableCodeBlock';
 import ApiLink from '@site/src/components/ApiLink';
 import HttpCrawlerSource from '!!raw-loader!roa-loader!./http_crawler.ts';
 
-This example demonstrates how to use <ApiLink to="http-crawler/class/HttpCrawler">`HttpCrawler`</ApiLink> to crawl a list of URLs from an external file, load each URL using a plain HTTP request, and save HTML.
+This example demonstrates how to use <ApiLink to="http-crawler/class/HttpCrawler">`HttpCrawler`</ApiLink> to build a crawler that crawls a list of URLs from an external file, loads each URL using a plain HTTP request, and saves the HTML.
 
 <RunnableCodeBlock className="language-js" type="cheerio">
 	{HttpCrawlerSource}

diff --git a/docs/examples/http_crawler.ts b/docs/examples/http_crawler.ts
@@ -35,8 +35,8 @@ const crawler = new HttpCrawler({
         // Store the results to the dataset. In local configuration,
         // the data will be stored as JSON files in ./storage/datasets/default
         await Dataset.pushData({
-            url: request.url,
-            body,
+            url: request.url, // URL of the page
+            body, // HTML code of the page
         });
     },
 
@@ -47,6 +47,7 @@ const crawler = new HttpCrawler({
 });
 
 // Run the crawler and wait for it to finish.
+// It will load each start URL using a plain HTTP request and save the HTML.
 await crawler.run([
     'https://crawlee.dev',
 ]);

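The description of this example mentions crawling a list of URLs from an external file, while the snippet passes a hard-coded array to `crawler.run()`. As a rough sketch of how the two could be bridged, the helper below turns the text of a URL list file into an array of start URLs. `parseUrlList`, the one-URL-per-line layout, and the `#` comment convention are assumptions made for this illustration, not something the example defines.

```typescript
// Hypothetical helper: converts the contents of a URL list file into start URLs.
// Assumes one URL per line; blank lines and lines starting with "#" are skipped.
function parseUrlList(text: string): string[] {
    return text
        .split(/\r?\n/) // handle both Unix and Windows line endings
        .map((line) => line.trim())
        .filter((line) => line.length > 0 && !line.startsWith('#'))
        // Keep only lines that look like http(s) URLs.
        .filter((line) => /^https?:\/\//i.test(line));
}

// Sample file contents, made up for this sketch.
const sampleList = [
    '# start URLs',
    'https://crawlee.dev',
    '',
    '  https://crawlee.dev/docs/examples  ',
    'not-a-url',
].join('\n');

const startUrls = parseUrlList(sampleList);
console.log(startUrls);
```

The resulting array could be passed to `crawler.run()` in place of the hard-coded list. In a real project the text would come from reading a file first, for instance with `readFile` from `node:fs/promises`.
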
diff --git a/docs/guides/cheerio_crawler.mdx b/docs/guides/cheerio_crawler.mdx
@@ -11,7 +11,7 @@ import ApiLink from '@site/src/components/ApiLink';
 
 ## What is Cheerio
 
-[Cheerio](https://www.npmjs.com/package/cheerio) is essentially [jQuery](https://jquery.com/) for Node.js. It offers the same API, including the familiar `$` object. You can use it, as you would use jQuery for manipulating the DOM of an HTML page. In crawling, you'll mostly use it to select the needed elements and extract their values - the data you're interested in. But jQuery runs in a browser and attaches directly to the browser's DOM. Where does `cheerio` get its HTML? This is where the `Crawler` part of <ApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</ApiLink> comes in.
+[Cheerio](https://cheerio.js.org/) is essentially [jQuery](https://jquery.com/) for Node.js. It offers the same API, including the familiar `$` object. You can use it, as you would use jQuery for manipulating the DOM of an HTML page. In crawling, you'll mostly use it to select the needed elements and extract their values - the data you're interested in. But jQuery runs in a browser and attaches directly to the browser's DOM. Where does `cheerio` get its HTML? This is where the `Crawler` part of <ApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</ApiLink> comes in.
 
 ## How the crawler works
 
@@ -23,7 +23,7 @@ Modern web pages often do not serve all of their content in the first HTML respo
 
 :::
 
-Once the page's HTML is retrieved, the crawler will pass it to [Cheerio](https://www.npmjs.com/package/cheerio) for parsing. The result is the typical `$` function, which should be familiar to jQuery users. You can use the `$` function to do all sorts of lookups and manipulation of the page's HTML, but in scraping, you will mostly use it to find specific HTML elements and extract their data.
+Once the page's HTML is retrieved, the crawler will pass it to [Cheerio](https://github.com/cheeriojs/cheerio) for parsing. The result is the typical `$` function, which should be familiar to jQuery users. You can use the `$` function to do all sorts of lookups and manipulation of the page's HTML, but in scraping, you will mostly use it to find specific HTML elements and extract their data.
 
 Example use of Cheerio and its `$` function in comparison to browser JavaScript:
 
@@ -41,7 +41,7 @@ $('[href]')
 
 :::note
 
-This is not to show that Cheerio is better than plain browser JavaScript. Some might actually prefer the more expressive way plain JS provides. Unfortunately, the browser JavaScript methods are not available in Node.js, so Cheerio is your best bet to do the parsing in Node.
+This is not to show that Cheerio is better than plain browser JavaScript. Some might actually prefer the more expressive way plain JS provides. Unfortunately, the browser JS methods are not available in Node.js, so Cheerio is your best bet to do the parsing in Node.js.
 
 :::
 

diff --git a/website/src/components/Highlights.jsx b/website/src/components/Highlights.jsx
@@ -10,7 +10,7 @@ const FeatureList = [
             <>
                 We believe websites are best scraped in the language they're written in. Crawlee <b>runs on Node.js
                 and it's <a href="https://crawlee.dev/docs/guides/typescript-project">built in TypeScript</a></b> to improve code completion in your IDE,
-                even if you don't use TypeScript yourself.
+                even if you don't use TypeScript yourself. You can write Crawlee crawlers in either TypeScript or JavaScript.
             </>
         ),
     },