Skip to content

Extending Ouinet to utilize Webrecorder techniques for crawling and archiving entire websites.

License

Notifications You must be signed in to change notification settings

equalitie/ouicrawl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Ouicrawl

Browser-Based crawling through Ouinet

This repository contains crawling driver for Browsertrix Crawler to allow for crawling through the Ouinet client

Running a crawl

Assuming the Ouinet client is running an HTTP/S proxy on localhost:8077 via Docker host network mode, a crawl of <URL> can be started by running Browsertrix Crawler and specifying the ouinet-crawl.js in this repo as the driver. Other Browsertrix Crawler flags can be added as needed.

docker run --network host -v $PWD/:/config -i -e PROXY_HOST=localhost -e PROXY_PORT=8077 webrecorder/browsertrix-crawler:latest crawl --driver /config/ouinet-crawl.js --url <URL>

About

Extending Ouinet to utilize Webrecorder techniques for crawling and archiving entire websites.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published