Deadlink #373

Open
wants to merge 8 commits into
base: master


struberg

This is a sample project in the form of an app that detects dead links on a page.
See /crawler4j-examples/deadlinksniffer.

There is surely room for improvement, but it already works reasonably well.

I'd be happy to continue working on it. For now, this PR is just for gathering feedback in case I did something completely wrong in the code.

* also add pdf to the ignored files list
* fix logging setup
Previously, only the first link that led to an error page was reported.
This change keeps track of those links and also writes a report entry
if the same link is used on a subsequent page.

We now also write subfolders per domain.
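
To illustrate the behavior described in the commit message above, here is a minimal sketch of how known dead links could be remembered and reported again for later pages. It is not taken from the PR: the class `DeadLinkTracker` and its methods are hypothetical, it does not use the crawler4j API, and it groups report entries in memory by the referring page's domain as a stand-in for the per-domain subfolders (which domain the PR actually uses is an assumption).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not the PR's code: remembers every URL that returned
// an error status so that later pages linking to it are reported as well.
class DeadLinkTracker {
    // dead URL -> HTTP status code seen when the fetch first failed
    private final Map<String, Integer> deadLinks = new LinkedHashMap<>();
    // report entries grouped by the domain of the referring page (assumption)
    private final Map<String, List<String>> reportsByDomain = new LinkedHashMap<>();

    // Call when fetching `url`, linked from `referringPage`, returned an error status.
    void recordFailure(String referringPage, String url, int statusCode) {
        deadLinks.put(url, statusCode);
        report(referringPage, url, statusCode);
    }

    // Call for each outgoing link of a page; reports the link if it is
    // already known to be dead. Returns true if an entry was written.
    boolean checkKnownDeadLink(String referringPage, String url) {
        Integer status = deadLinks.get(url);
        if (status == null) {
            return false;
        }
        report(referringPage, url, status);
        return true;
    }

    private void report(String referringPage, String url, int statusCode) {
        String host = URI.create(referringPage).getHost();
        String domain = (host != null) ? host : "unknown";
        reportsByDomain.computeIfAbsent(domain, d -> new ArrayList<>())
                .add(referringPage + " -> " + url + " (" + statusCode + ")");
    }

    // Report entries collected so far for one domain (empty if none).
    List<String> reportsFor(String domain) {
        return reportsByDomain.getOrDefault(domain, Collections.emptyList());
    }
}
```

With this shape, a crawler callback would call `recordFailure` on the first error response and `checkKnownDeadLink` for every outgoing link of each subsequently visited page, so a dead link is reported once per page that uses it.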
@Chaiavi
Contributor

Chaiavi commented Jan 19, 2020

Wow, thank you @struberg

I didn't review it all, but it seems really impressive.

I'd love to see it merged into the example code in this project.
It could also be a standalone project that uses this one as a library.


2 participants