the API doesn't work #961

Open
androidAppMe opened this issue Feb 8, 2023 · 10 comments
Comments

@androidAppMe

Hi all. I was using newspaper3k and it was working fine, but today it stopped working and returns empty text. Does anyone have any ideas?

@cattydev

I'm having this problem too. Any fixes?

@GalKaplun

same here

@cattydev

I figured out that the API stopped working on Google RSS article links.
It was working until February 2nd.

@banagale

banagale commented Feb 14, 2023 via email

Would you please provide a sample implementation or a link to a Google RSS feed that is now broken, so the error can be reproduced?

@cattydev

cattydev commented Feb 14, 2023

This Google URL, which redirects to the original article link, is taken from the Google News RSS feed.
For example, I scrape (getContent) that Google URL via newspaper, and the top image I get is the og:image of the Google URL itself (the page that redirects to the original article link): top image

@androidAppMe
Author

Now I'm having a problem with the "trafilatura" API too. I can't get the body of the news with trafilatura either, and it was working fine before!

@cattydev

Now I'm having a problem with the "trafilatura" API too. I can't get the body of the news with trafilatura either, and it was working fine before!

Are you using Google RSS too?

@androidAppMe
Author

Yes, I'm extracting the RSS with pygooglenews, but I can't parse it. Has anybody found a solution? I tried getting the news directly from the Google News page, but it keeps blocking my IP.

@cattydev

I didn't have an issue with parsing Google News; the problem was Google's redirect to the original page. I solved it by adding this before calling newspaper's getcontent function:

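import time

import requests

# Placeholder: a Google News URL taken from the Google News RSS feed.
r = requests.get("google news url taken from google rss")
time.sleep(1)  # short pause before the next request
resolved_url = r.url  # requests follows redirects, so r.url is the redirected URL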
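
For anyone who wants the same workaround as a reusable helper, here is a minimal sketch. It uses the standard library's urllib.request instead of requests, and the names resolve_final_url and is_still_google_news, the User-Agent header, and the timeout value are all my own assumptions, not part of newspaper or of the snippet above. It only follows HTTP-level redirects, so the second helper is there to check whether the result still points at news.google.com.

```python
import urllib.request
from urllib.parse import urlparse


def resolve_final_url(url, opener=urllib.request.urlopen, timeout=10):
    """Follow HTTP redirects and return the URL the request ended on.

    `opener` is injectable (hypothetical parameter) so the helper can be
    exercised without network access.
    """
    # Assumption: a browser-like User-Agent; some servers reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener(request, timeout=timeout) as response:
        # geturl() reports the final URL after any redirects were followed.
        return response.geturl()


def is_still_google_news(url):
    """Return True when the URL still points at news.google.com."""
    return urlparse(url).hostname == "news.google.com"
```

If is_still_google_news(resolved) is False, the resolved URL can be handed to newspaper in place of the Google one. If it is True, the redirect was not resolved at the HTTP level and the extracted text will probably still come from the Google page.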

@huksley

huksley commented Mar 13, 2023

Here you can decode Google RSS URLs without a round trip to Google (https://gist.github.com/huksley/bc3cb046157a99cd9d1517b32f91a99e)

Sorry, but it is in JavaScript.
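
For Python users, here is a rough sketch of the same idea. It is not a port of the gist. It rests on an assumption this thread does not confirm: that the last path segment of a news.google.com/.../articles/... link is base64url data in which the article URL appears as plain ASCII text. The function name decode_google_news_url and the regex heuristic are mine. Links that do not match that assumption are returned unchanged.

```python
import base64
import re
from urllib.parse import urlparse

# Heuristic: the first run of printable ASCII that starts with http:// or https://.
_URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def decode_google_news_url(source_url):
    """Try to recover the article URL embedded in a Google News RSS link.

    Assumption: the article token is base64url data containing the target
    URL as readable text. Returns source_url unchanged when that fails.
    """
    parsed = urlparse(source_url)
    segments = [s for s in parsed.path.split("/") if s]
    if (
        parsed.hostname != "news.google.com"
        or len(segments) < 2
        or segments[-2] != "articles"
    ):
        return source_url

    token = segments[-1]
    padded = token + "=" * (-len(token) % 4)  # restore stripped base64 padding
    try:
        payload = base64.urlsafe_b64decode(padded)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return source_url

    match = _URL_PATTERN.search(payload)
    return match.group(0).decode("ascii") if match else source_url
```

Because nothing is requested from Google, this avoids the extra HTTP call and the risk of the IP being blocked, but it only works for as long as the URL is actually readable inside the token.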
