ScrapeIt is a web scraper app that allows you to extract text or HTML content from web pages by providing a URL.
- React: Frontend framework for building user interfaces.
- Auth0: Authentication and authorization management.
- Axios: HTTP client for making API requests.
- Tailwind CSS: Utility-first CSS framework for styling the components.
- DOMPurify: Library for sanitizing HTML to prevent XSS vulnerabilities.
- html2pdf.js: Library for generating PDFs from HTML content.
- API Ninjas: Third-party API used for web scraping.
- Node.js (https://nodejs.org/) installed on your machine.
- Clone the repository:
git clone https://github.com/singodiyashubham87/ScrapeIt.git
cd ScrapeIt
- Install dependencies:
npm install
- Edit the .env file as shown below, replacing the placeholders with your Auth0 credentials and API Ninjas API key:
VITE_AUTH0_DOMAIN="AUTH0_DOMAIN"
VITE_AUTH0_CLIENT_ID="AUTH0_CLIENT_ID"
VITE_AUTH0_REDIRECT_URL="http://localhost:5173"
VITE_API_NINJAS_X_API_KEY="API_NINJAS_X_API_KEY"
- Start the app:
npm run dev
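A misconfigured .env file is a common reason for the app failing at login or on the first scrape. The sketch below is not part of ScrapeIt; it is a minimal, hypothetical helper (`findUnsetEnvKeys` is an invented name) that reports which of the four variables listed above are missing, empty, or still set to their placeholder value.

```javascript
// Names of the variables from the .env example above.
const REQUIRED_KEYS = [
  "VITE_AUTH0_DOMAIN",
  "VITE_AUTH0_CLIENT_ID",
  "VITE_AUTH0_REDIRECT_URL",
  "VITE_API_NINJAS_X_API_KEY",
];

// Placeholder values from the example that must be replaced with real ones.
const PLACEHOLDERS = new Set([
  "AUTH0_DOMAIN",
  "AUTH0_CLIENT_ID",
  "API_NINJAS_X_API_KEY",
]);

// Hypothetical helper (not in the repository): returns the names of the
// variables that are missing, blank, or still equal to a placeholder.
function findUnsetEnvKeys(env) {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return (
      typeof value !== "string" ||
      value.trim() === "" ||
      PLACEHOLDERS.has(value)
    );
  });
}
```

Since Vite exposes variables prefixed with `VITE_` on `import.meta.env`, a call such as `findUnsetEnvKeys(import.meta.env)` at startup could log a warning when the returned array is not empty.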
- Log in or log out using Auth0 authentication.
- Enter a URL to scrape and choose between extracting text or HTML content.
- Download scraped content as a PDF.
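The scraping step can be sketched roughly as follows. This is not ScrapeIt's actual code: the endpoint path, the `text_only` query parameter, the `X-Api-Key` header, and the `{ data: ... }` response shape are assumptions based on the API Ninjas Web Scraper API and should be checked against its documentation. The function names are hypothetical, and the sketch uses the built-in `fetch` instead of Axios so that it runs without dependencies.

```javascript
// Assumed endpoint of the API Ninjas Web Scraper API; verify in their docs.
const WEBSCRAPER_ENDPOINT = "https://api.api-ninjas.com/v1/webscraper";

// Hypothetical helper: builds the request URL and headers for a page.
// textOnly = true asks for plain text, false asks for the raw HTML.
function buildScrapeRequest(pageUrl, textOnly, apiKey) {
  const target = new URL(pageUrl); // throws a TypeError on a malformed URL
  if (target.protocol !== "http:" && target.protocol !== "https:") {
    throw new Error("Only http(s) URLs can be scraped");
  }
  const endpoint = new URL(WEBSCRAPER_ENDPOINT);
  endpoint.searchParams.set("url", target.href);
  endpoint.searchParams.set("text_only", String(textOnly));
  return { url: endpoint.toString(), headers: { "X-Api-Key": apiKey } };
}

// Hypothetical helper: sends the request and returns the scraped content.
// fetchImpl is injectable so the function can be exercised without network access.
async function scrape(pageUrl, textOnly, apiKey, fetchImpl = fetch) {
  const { url, headers } = buildScrapeRequest(pageUrl, textOnly, apiKey);
  const response = await fetchImpl(url, { headers });
  if (!response.ok) {
    throw new Error(`Scrape request failed with status ${response.status}`);
  }
  const body = await response.json();
  return body.data; // assumed response shape: { data: "..." }
}
```

When the HTML option is chosen, the returned markup should be passed through DOMPurify before it is rendered, since scraped pages are untrusted input.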
API Ninjas (https://api.api-ninjas.com/): Provides the web scraping API.
This project is licensed under the MIT License.
Support the project by starring the repository.