In this project, I used Scrapy to scrape Amazon web pages containing laptop listings.
- A crawler first scrapes the index page, from which each laptop's description and price are saved.
- Each laptop's href is then followed to extract its extended description.
- All of the information is saved to a pandas DataFrame that is stored locally.
Similarly, in this part I used Scrapy's crawler to extract book information from the web-scraping practice website https://books.toscrape.com/index.html
- The book titles and prices are extracted from the main index page, along with each book's URL.
- The crawler follows each book's href to its detail page, where the description is extracted and appended to a list.
- The lists (titles, prices, URLs, descriptions) are added as columns to a pandas DataFrame, which is then stored locally as a CSV.
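The final step above, assembling the scraped lists into a DataFrame and writing it out as a CSV, can be sketched as follows. The sample values and the `books.csv` filename are illustrative, not taken from the project.

```python
import pandas as pd

# Illustrative sample of what the scraped lists might hold after the crawl.
titles = ["A Light in the Attic"]
prices = ["£51.77"]
urls = ["https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"]
descriptions = ["A whimsical poetry collection."]

# Each list becomes one column of the DataFrame.
df = pd.DataFrame({
    "title": titles,
    "price": prices,
    "url": urls,
    "description": descriptions,
})

# Store the DataFrame locally as a CSV (hypothetical filename).
df.to_csv("books.csv", index=False)
```

Building the DataFrame from a dict of equal-length lists keeps the column order explicit, and `index=False` avoids writing pandas' row index as an extra CSV column.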