Added ability to scrape javascript intensive apps #557

Closed
AndyTheFactory opened this issue Oct 24, 2023 · 2 comments
Labels
bug (Something isn't working), PR-verify (Has a PR, must be checked)
Milestone

Comments

@AndyTheFactory
Owner

Issue by Sosshi
Sun Jun 12 21:43:26 2022
Originally opened as codelucas/newspaper#941


The library was failing to scrape sites that rely on JavaScript to render their content, so I have added the ability to scrape such websites. With this change it is possible to scrape sites built with Vue, React, and other JavaScript-intensive frameworks.


Sosshi included the following code: https://github.com/codelucas/newspaper/pull/941/commits

@AndyTheFactory
Owner Author

Comment by banagale
Thu Jun 16 07:03:59 2022


This sounds compelling. I noticed the change you sent also converts single quotes to double quotes and includes some other formatting changes.

It would be easier to review these changes if they were limited to materially changed lines.

While I do not believe the maintainer is approving PRs at this time, in general I'd suggest offering a PR with changes that only include what you're working on. Then consider a second that affects formatting in a more general sense.

--

All that said, I'm curious if you have test cases of sites that show article content using JS that fail using the main branch but pass using your change set.

AndyTheFactory added the bug (Something isn't working) and PR-verify (Has a PR, must be checked) labels Oct 25, 2023
AndyTheFactory added this to the First release milestone Oct 25, 2023
@AndyTheFactory
Owner Author

Not planning to add a browser component for now
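
For readers who still need JavaScript-rendered pages, one possible workaround is to render the page outside the library and hand the resulting HTML to the parser. The sketch below is not the code from the referenced PR and is not part of this project. It assumes Playwright is installed as the renderer and that `Article.download()` accepts an `input_html` argument; the function names `render_with_playwright`, `parse_with_newspaper`, and `scrape_js_page` are hypothetical.

```python
from typing import Callable, Dict


def render_with_playwright(url: str, timeout_ms: int = 30000) -> str:
    """Return the DOM as HTML after scripts have run (assumes Playwright is installed)."""
    # Imported lazily so the rest of the module works without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Wait until network activity settles so client-side content has loaded.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


def parse_with_newspaper(url: str, html: str) -> Dict[str, str]:
    """Parse pre-fetched HTML instead of letting the library download the page."""
    # Imported lazily; assumes the installed package exposes Article.
    from newspaper import Article

    article = Article(url)
    article.download(input_html=html)  # skips the HTTP request, uses our HTML
    article.parse()
    return {"title": article.title, "text": article.text}


def scrape_js_page(
    url: str,
    render: Callable[[str], str] = render_with_playwright,
    parse: Callable[[str, str], Dict[str, str]] = parse_with_newspaper,
) -> Dict[str, str]:
    """Render first, then parse. Both steps are injectable for testing."""
    html = render(url)
    return parse(url, html)
```

Keeping the renderer as a separate, replaceable step means the parsing library itself does not need a browser dependency, and the same pattern works with any other tool that can return rendered HTML.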

AndyTheFactory closed this as not planned Oct 27, 2023