From 293c11c5d150232ddc14827e5a37d7b0430068a2 Mon Sep 17 00:00:00 2001
From: Manu Vasconcelos <87862340+vasconsaurus@users.noreply.github.com>
Date: Mon, 14 Aug 2023 16:57:43 -0300
Subject: [PATCH] 3309 - Update pender to use twitter's v2 API (#371)

* add a TwitterClient class to deal with requests to the v2 api
- I used Rack::Utils.build_nested_query(params) instead of Rails' to_query because the former keeps the order of the hash when converting it to a query string. I think that will make it easier to read when debugging.
- We have a provider_instagram that also makes the api call, and it's very similar to what we were doing in the class. So I followed the structure we used for that provider, making it a Concern instead of adding a new class.
- The url the new api returns is the url linked in the profile, not the twitter url. So for now I'm only combining the twitter url with the username to get the author url.

* move twitter parser integration tests to a dedicated folder
- We want to separate the integration tests from the unit tests, so that in the future it is easier for us to choose when we want to run them. We want to do this for all parsers.

* update twitter item tests
- We don't need that many integration tests, so we are moving some of them back to test/models/parser/twitter_item_test.rb and updating them to not make live requests; instead they will be stubbed.
- I added tests to check the basic request functionality that we now have, and removed "assigns values to hash from the API response" since that is already being tested in "it makes a get request to the tweet lookup endpoint successfully".
- "should decode html entities" was removed because that happens inside Media and is not done by the individual parser, which means the test actually fails (as it should).
- fake_tweet and fake_twitter_user were removed, since they used methods from the old Twitter gem.
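The ordered query-string behavior mentioned above (the reason for preferring Rack::Utils.build_nested_query over Rails' to_query, which sorts the serialized pairs) can be sketched with the stdlib alone. This is a flat-hash approximation only; the sample params are illustrative, and the real code calls Rack's helper:

```ruby
require 'cgi'

# Illustrative params, shaped like the tweet-lookup params in this patch.
params = {
  "ids" => "20",
  "tweet.fields" => "author_id,created_at,text"
}

# Ruby hashes keep insertion order, so encoding pair-by-pair (as
# Rack::Utils.build_nested_query does for a flat hash) yields a query
# string whose keys appear in the same order as the source code --
# easier to scan in logs while debugging. Rails' to_query sorts instead.
query = params.map { |k, v| "#{CGI.escape(k)}=#{CGI.escape(v)}" }.join("&")

puts query
# => ids=20&tweet.fields=author_id%2Ccreated_at%2Ctext
```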
Now we are stubbing a response from our new method: tweet_lookup.
- added .squish to parsed_data['raw']['api']['data'][0]['text'] to clean up line breaks from the title and description. Our test was failing because they were not being removed. Also, since the title and description are the same, I just set the description to be the same as the title instead of parsing twice.
- removed the test for truncated text; that behavior is no longer present in the v2 api. Only retweets might be truncated (we don't fetch those), and the way to deal with it is different: it does not take truncated as a query param.
- removed the storing-oembed test because that happens inside Media and not the twitter profile parser
- removed old error handling behavior tests and added new ones

* remove twitter spec
I think this was relying on the twitter gem's error handling, so I don't think it makes sense to keep it for now.

* update twitter config on config.example

* remove twitter gem

* update archiver_worker_test
Now they work with the twitter links, but since twitter is a bit unstable regarding changes, we should probably avoid using twitter links where they aren't absolutely needed.

* update according to Christa's review
Main notes:
- instead of re-raising the error inside the provider, we are notifying Sentry and returning an errors hash. https://github.com/meedan/pender/pull/371#discussion_r1290459001
- we rewrote the parsers according to what we feel is 'safer' moving forward: using merge! to set the defaults.
- I had to update some of the page_item tests, so I used this as an opportunity to move the integration tests to their own file. (We are working on moving all the integration tests to their own files, separated from the unit ones.)
- Updated the error tests; now we test 3 scenarios: a 200 response with an error in the json, a non-200 response, and an exception in Net::HTTP
- Updated to use dig, i.e. parsed_data.dig('raw', 'api', 'data', 0).
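The dig access pattern from the last bullet, together with the .squish cleanup mentioned earlier, can be sketched against a hypothetical response hash (the shape mirrors the stored 'raw' → 'api' data in this patch; squish itself is ActiveSupport, so a plain-Ruby equivalent is shown):

```ruby
# Hypothetical stored response, shaped like parsed_data in this patch.
parsed_data = {
  'raw' => { 'api' => { 'data' => [{ 'text' => "Hello\nTwitter  v2" }] } }
}

# dig returns nil when any key along the path is missing...
text   = parsed_data.dig('raw', 'api', 'data', 0, 'text')  # => "Hello\nTwitter  v2"
errors = parsed_data.dig('raw', 'api', 'errors')           # => nil

# ...whereas chained [] access raises as soon as it hits nil:
# parsed_data['raw']['api']['errors'][0]  # NoMethodError (undefined method `[]' for nil)

# ActiveSupport's .squish collapses runs of whitespace (line breaks
# included) into single spaces and strips both ends; in plain Ruby:
title = text.gsub(/\s+/, ' ').strip
puts title
# => Hello Twitter v2
```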
This will return nil if data is missing; before, it would raise an error for invalid access.

* update test: shouldn't error when we cannot get the twitter author url
When we get_twitter_metadata in the base parser, we check if there is a twitter username; if there is a username, we get the twitter author_url. If there isn't a username, there won't be an author_url, and it shouldn't error when that happens.

* update errors hash and error testing
I think it's better if we return the same keys inside errors as the twitter api; it will make it easier to test.

* update how we deal with picture inside twitter item
If there is no picture, it should be an empty string, but it doesn't always make sense to test for its presence, because if we get an error, it will be set to a string inside Media and not the parser.
---------
Co-authored-by: Caio Almeida <117518+caiosba@users.noreply.github.com>
---
 .travis.yml | 2 +-
 Gemfile | 1 -
 Gemfile.lock | 39 ---
 app/models/concerns/provider_twitter.rb | 57 +++-
 app/models/parser/base.rb | 17 +-
 app/models/parser/twitter_item.rb | 76 +++--
 app/models/parser/twitter_profile.rb | 51 +--
 config/config.yml.enc | Bin 2672 -> 2800 bytes
 config/config.yml.example | 5 +-
 spec/requests/api/medias_spec.rb | 43 ---
 test/controllers/medias_controller_test.rb | 148 ++-------
 test/data/twitter-item-response-error.json | 13 +
 test/data/twitter-item-response-success.json | 36 +++
 test/data/twitter-item-response.json | 107 -------
 test/data/twitter-profile-response-error.json | 13 +
 .../twitter-profile-response-success.json | 13 +
 test/data/twitter-profile-response.json | 113 -------
 test/integration/parsers/page_item_test.rb | 129 ++++++++
 test/integration/parsers/twitter_item_test.rb | 25 ++
 .../parsers/twitter_profile_test.rb | 29 ++
 test/models/parser/page_item_test.rb | 174 +----------
 test/models/parser/twitter_item_test.rb | 293 ++++++------------
 test/models/parser/twitter_profile_test.rb | 192 ++++++++----
 test/workers/archiver_worker_test.rb | 9 +- 24
files changed, 654 insertions(+), 931 deletions(-) create mode 100644 test/data/twitter-item-response-error.json create mode 100644 test/data/twitter-item-response-success.json delete mode 100644 test/data/twitter-item-response.json create mode 100644 test/data/twitter-profile-response-error.json create mode 100644 test/data/twitter-profile-response-success.json delete mode 100644 test/data/twitter-profile-response.json create mode 100644 test/integration/parsers/page_item_test.rb create mode 100644 test/integration/parsers/twitter_item_test.rb create mode 100644 test/integration/parsers/twitter_profile_test.rb diff --git a/.travis.yml b/.travis.yml index 181d860b..5c9c449c 100644 --- a/.travis.yml +++ b/.travis.yml @@ -1,6 +1,6 @@ language: minimal before_install: -- openssl aes-256-cbc -K $encrypted_8cec9149bf7a_key -iv $encrypted_8cec9149bf7a_iv -in config/config.yml.enc -out config/config.yml -d +- openssl aes-256-cbc -K $encrypted_3491736328a2_key -iv $encrypted_3491736328a2_iv -in config/config.yml.enc -out config/config.yml -d - cp config/database.yml.example config/database.yml - cp config/sidekiq.yml.example config/sidekiq.yml - echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin diff --git a/Gemfile b/Gemfile index ae5e3363..426b8985 100644 --- a/Gemfile +++ b/Gemfile @@ -39,7 +39,6 @@ gem 'yt', '~> 0.25.5' gem 'rswag-api' gem 'rswag-ui' gem 'sass-rails' -gem 'twitter' gem 'open_uri_redirections', require: false gem 'postrank-uri', git: 'https://github.com/postrank-labs/postrank-uri.git', ref: '485ac46', require: false # Ruby 3.0 support, as of 2/6/23 no gem relaease gem 'retryable' diff --git a/Gemfile.lock b/Gemfile.lock index e5ee732f..a11ec6eb 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -102,7 +102,6 @@ GEM aws-eventstream (~> 1, >= 1.0.2) benchmark-ips (2.12.0) bindex (0.8.1) - buftok (0.3.0) builder (3.2.4) byebug (11.1.3) codeclimate-test-reporter (1.0.8) @@ -130,14 +129,8 @@ GEM thor (>= 0.19, < 2) diff-lcs (1.5.0) 
docile (1.1.5) - domain_name (0.5.20190701) - unf (>= 0.0.5, < 1.0.0) - equalizer (0.0.11) erubi (1.12.0) ffi (1.15.5) - ffi-compiler (1.0.1) - ffi (>= 1.0.0) - rake gem-licenses (0.2.2) get_process_mem (0.2.7) ffi (~> 1.0) @@ -150,23 +143,12 @@ GEM heapy (0.2.0) thor htmlentities (4.3.4) - http (5.1.1) - addressable (~> 2.8) - http-cookie (~> 1.0) - http-form_data (~> 2.2) - llhttp-ffi (~> 0.4.0) - http-cookie (1.0.5) - domain_name (~> 0.5) - http-form_data (2.3.0) i18n (1.14.1) concurrent-ruby (~> 1.0) jmespath (1.6.2) json (2.6.3) json-schema (2.8.1) addressable (>= 2.4) - llhttp-ffi (0.4.0) - ffi-compiler (~> 1.0) - rake (~> 13.0) lograge (0.12.0) actionpack (>= 4) activesupport (>= 4) @@ -184,8 +166,6 @@ GEM net-pop net-smtp marcel (1.0.2) - memoizable (0.4.2) - thread_safe (~> 0.3, >= 0.3.1) memory_profiler (1.0.1) method_source (1.0.0) mini_histogram (0.3.1) @@ -195,8 +175,6 @@ GEM minitest-retry (0.2.2) minitest (>= 5.0) mocha (1.14.0) - multipart-post (2.3.0) - naught (1.1.0) net-http (0.3.2) uri net-imap (0.3.6) @@ -388,7 +366,6 @@ GEM connection_pool (>= 2.2.2) rack (~> 2.0) redis (>= 4.2.0) - simple_oauth (0.3.1) simplecov (0.13.0) docile (~> 1.1.0) json (>= 1.8, < 3) @@ -409,25 +386,10 @@ GEM terminal-table (3.0.2) unicode-display_width (>= 1.1.1, < 3) thor (1.2.2) - thread_safe (0.3.6) tilt (2.2.0) timeout (0.4.0) - twitter (8.0.0) - addressable (~> 2.3) - buftok (~> 0.3.0) - equalizer (~> 0.0.11) - http (~> 5.1) - http-form_data (~> 2.3) - llhttp-ffi (~> 0.4.0) - memoizable (~> 0.4.0) - multipart-post (~> 2.0) - naught (~> 1.0) - simple_oauth (~> 0.3.0) tzinfo (2.0.6) concurrent-ruby (~> 1.0) - unf (0.1.4) - unf_ext - unf_ext (0.0.8.2) unicode-display_width (2.4.2) uri (0.12.2) web-console (3.5.1) @@ -513,7 +475,6 @@ DEPENDENCIES simplecov-console spring sprockets (= 3.7.2) - twitter web-console (~> 3.5.1) webmock yt (~> 0.25.5) diff --git a/app/models/concerns/provider_twitter.rb b/app/models/concerns/provider_twitter.rb index 8b861830..341eb82f 
100644 --- a/app/models/concerns/provider_twitter.rb +++ b/app/models/concerns/provider_twitter.rb @@ -3,22 +3,61 @@ module ProviderTwitter extend ActiveSupport::Concern + class ApiError < StandardError; end + + BASE_URI = "https://api.twitter.com/2/" + def oembed_url(_ = nil) "https://publish.twitter.com/oembed?url=#{self.url}" end + def tweet_lookup(tweet_id) + params = { + "ids": tweet_id, + "tweet.fields": "author_id,created_at,text", + "expansions": "author_id,attachments.media_keys", + "user.fields": "profile_image_url,username,url", + "media.fields": "url", + } + + get "tweets", params + end + + def user_lookup_by_username(username) + params = { + "usernames": username, + "user.fields": "profile_image_url,name,username,description,created_at,url", + } + + get "users/by", params + end + private - def handle_twitter_exceptions + def get(path, params) + uri = URI(URI.join(BASE_URI, path)) + uri.query = Rack::Utils.build_query(params) + + http = Net::HTTP.new(uri.host, uri.port) + http.use_ssl = true + + headers = { + "Authorization": "Bearer #{PenderConfig.get('twitter_bearer_token')}", + } + + request = Net::HTTP::Get.new(uri.request_uri, headers) + begin - yield - rescue Twitter::Error::TooManyRequests => e - raise Pender::Exception::ApiLimitReached.new(e.rate_limit.reset_in) - rescue Twitter::Error => error - PenderSentry.notify(error, url: url) - @parsed_data[:raw][:api] = { error: { message: "#{error.class}: #{error.code} #{error.message}", code: Lapis::ErrorCodes::const_get('INVALID_VALUE') }} - Rails.logger.warn level: 'WARN', message: "[Parser] #{error.message}", url: url, code: error.code, error_class: error.class - return + response = http.request(request) + raise ApiError.new("#{response.code} - #{response.message}") unless response.code.to_i < 400 + JSON.parse(response.body) + rescue StandardError => e + PenderSentry.notify(e, url: url, response_body: response&.body) + { 'errors' => [{ + title: "#{e&.class} - #{e&.message}", + detail: response&.body 
+ }] + } end end diff --git a/app/models/parser/base.rb b/app/models/parser/base.rb index cafd1a09..3147a315 100644 --- a/app/models/parser/base.rb +++ b/app/models/parser/base.rb @@ -73,15 +73,6 @@ def parse_data_for_parser(doc, original_url, jsonld_array) raise NotImplementedError.new("Parser subclasses must implement parse_data_for_parser") end - def twitter_client - @twitter_client ||= Twitter::REST::Client.new do |config| - config.consumer_key = PenderConfig.get('twitter_consumer_key') - config.consumer_secret = PenderConfig.get('twitter_consumer_secret') - config.access_token = PenderConfig.get('twitter_access_token') - config.access_token_secret = PenderConfig.get('twitter_access_token_secret') - end - end - def ignore_url?(url) self.ignored_urls.each do |item| if url.match?(item[:pattern]) @@ -166,13 +157,7 @@ def get_twitter_metadata def twitter_author_url(username) return if bad_username?(username) - begin - twitter_client.user(username)&.url&.to_s - rescue Twitter::Error => e - PenderSentry.notify(e, url: url, username: username) - Rails.logger.warn level: 'WARN', message: "[Parser] #{e.message}", username: username, error_class: e.class - nil - end + "https://twitter.com/" + username.gsub("@","") end def bad_username?(value) diff --git a/app/models/parser/twitter_item.rb b/app/models/parser/twitter_item.rb index cd9f94a6..2618970a 100644 --- a/app/models/parser/twitter_item.rb +++ b/app/models/parser/twitter_item.rb @@ -18,48 +18,56 @@ def patterns # Main function for class def parse_data_for_parser(_doc, _original_url, _jsonld_array) - @url.gsub!(/(%23|#)!\//, '') - @url = replace_subdomain_pattern(url) - parts = url.match(TWITTER_ITEM_URL) - user, id = parts['user'], parts['id'] - - @parsed_data['raw']['api'] = {} - handle_twitter_exceptions do - @parsed_data['raw']['api'] = twitter_client.status(id, tweet_mode: 'extended').as_json + handle_exceptions(StandardError) do + @url.gsub!(/(%23|#)!\//, '') + @url.gsub!(/\s/, '') + @url = 
replace_subdomain_pattern(url) + + parts = url.match(TWITTER_ITEM_URL) + user, id = parts['user'], parts['id'] + + @parsed_data.merge!( + external_id: id, + username: '@' + user, + author_url: get_author_url(user) + ) + + @parsed_data['raw']['api'] = tweet_lookup(id) + @parsed_data[:error] = parsed_data.dig('raw', 'api', 'errors') + + if @parsed_data[:error] + @parsed_data.merge!( + author_name: user, + ) + elsif @parsed_data[:error].nil? + raw_data = parsed_data.dig('raw','api','data',0) + raw_user_data = parsed_data.dig('raw','api','includes','users',0) + + @parsed_data.merge!({ + picture: get_twitter_item_picture(parsed_data), + title: raw_data['text'].squish, + description: raw_data['text'].squish, + author_picture: raw_user_data['profile_image_url'].gsub('_normal', ''), + published_at: raw_data['created_at'], + html: html_for_twitter_item(url), + author_name: raw_user_data['name'], + }) + end end - @parsed_data[:error] = parsed_data.dig(:raw, :api, :error) - @parsed_data.merge!({ - external_id: id, - username: '@' + user, - title: stripped_title(parsed_data), - description: parsed_data.dig('raw', 'api', 'text') || parsed_data.dig('raw', 'api', 'full_text'), - picture: picture_url(parsed_data), - author_picture: author_picture_url(parsed_data), - published_at: parsed_data.dig('raw', 'api', 'created_at'), - html: html_for_twitter_item(parsed_data, url), - author_name: parsed_data.dig('raw', 'api', 'user', 'name'), - author_url: twitter_author_url(user) || RequestHelper.top_url(url) - }) parsed_data end - def stripped_title(data) - title = (data.dig('raw', 'api', 'text') || data.dig('raw', 'api', 'full_text')) - title.gsub(/\s+/, ' ') if title + def get_author_url(user) + 'https://twitter.com/' + user end - def author_picture_url(data) - picture_url = data.dig('raw', 'api', 'user', 'profile_image_url_https') - picture_url.gsub('_normal', '') if picture_url + def get_twitter_item_picture(parsed_data) + return unless parsed_data.dig('raw', 'api', 'includes') + 
item_media = parsed_data.dig('raw', 'api', 'includes', 'media') + item_media ? item_media.dig(0, 'url') : '' end - def picture_url(data) - item_media = data.dig('raw', 'api', 'entities', 'media') - (item_media.dig(0, 'media_url_https') || item_media.dig(0, 'media_url')) if item_media - end - - def html_for_twitter_item(data, url) - return '' unless data.dig(:raw, :api, :error).blank? + def html_for_twitter_item(url) '
' + '' + '' + diff --git a/app/models/parser/twitter_profile.rb b/app/models/parser/twitter_profile.rb index 98fd1076..84ee6a94 100644 --- a/app/models/parser/twitter_profile.rb +++ b/app/models/parser/twitter_profile.rb @@ -9,8 +9,8 @@ def type def patterns [ - /^https?:\/\/(www\.)?twitter\.com\/([^\/]+)$/, - /^https?:\/\/(0|m|mobile)\.twitter\.com\/([^\/]+)$/ + /^https?:\/\/(www\.)?twitter\.com\/(?
[GIT binary patch data for config/config.yml.enc (Bin 2672 -> 2800 bytes) omitted]
diff --git a/config/config.yml.example b/config/config.yml.example index 8d439043..472afb27 100644 --- a/config/config.yml.example +++ b/config/config.yml.example @@ -59,10 +59,7 @@ development: &default # # REQUIRED for Twitter posts # - twitter_consumer_key: # '' - twitter_consumer_secret: # ' ' - twitter_access_token: # ' ' - twitter_access_token_secret: # ' ' + twitter_bearer_token: # ' ' # Facebook API # diff --git a/spec/requests/api/medias_spec.rb b/spec/requests/api/medias_spec.rb index f49ec5b4..d34cb86c 100644 --- a/spec/requests/api/medias_spec.rb +++ b/spec/requests/api/medias_spec.rb @@ -163,49 +163,6 @@ end end - # response '429', 'API limit reached' do - # schema type: :object, - # properties: { - # type: { type: :string }, - # data: { - # type: :object, - # properties: { - # message: { type: :integer }, - # code: { type: :integer } - # }, - # required: [ 'message', 'code' ] - # } - # }, - # required: [ 'type', 'data' ] - # let(:url) { 'https://twitter.com/anxiaostudio' } - # let(auth_header) { authed } - # before
do |example| - # allow_any_instance_of(Twitter::REST::Client).to receive(:user).and_raise(Twitter::Error::TooManyRequests) - # allow_any_instance_of(Twitter::Error::TooManyRequests).to receive(:rate_limit).and_return(OpenStruct.new(reset_in: 123)) - - # submit_request(example.metadata) - # end - - # it 'should return API limit reached error' do |example| - # pending("twitter api key is not currently working") - # assert_response_matches_metadata(example.metadata) - - # response_body = JSON.parse(response.body) - # expect(response_body).not_to be_nil - # data = response_body['data'] - # expect(data['message']).to eq(123) - # end - - # after do - # allow_any_instance_of(Twitter::REST::Client).to receive(:user).and_call_original - # allow_any_instance_of(Twitter::Error::TooManyRequests).to receive(:rate_limit).and_call_original - # end - - # include_context 'generate examples' - # end - response '409', 'URL already being processed' do let(:url) { 'https://www.youtube.com/user/MeedanTube' } let(auth_header) { authed } diff --git a/test/controllers/medias_controller_test.rb b/test/controllers/medias_controller_test.rb index 508d2abb..3e38cfee 100644 --- a/test/controllers/medias_controller_test.rb +++ b/test/controllers/medias_controller_test.rb @@ -8,41 +8,39 @@ def setup end test "should be able to fetch HTML without token" do - get :index, params: { url: 'http://twitter.com/meedan', format: :html } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', format: :html } assert_response :success end test "should ask to refresh cache" do - skip("twitter api key is not currently working") authenticate_with_token - get :index, params: { url: 'https://twitter.com/caiosba/status/742779467521773568', refresh: '1', format: :json } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', refresh: '1', format: :json } first_parsed_at = Time.parse(JSON.parse(@response.body)['data']['parsed_at']).to_i - get :index, params: { url: 
'https://twitter.com/caiosba/status/742779467521773568', format: :html } - name = Media.get_id('https://twitter.com/caiosba/status/742779467521773568') + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', format: :html } + name = Media.get_id('https://meedan.com/post/annual-report-2022') [:html, :json].each do |type| assert Pender::Store.current.read(name, type), "#{name}.#{type} is missing" end sleep 1 - get :index, params: { url: 'https://twitter.com/caiosba/status/742779467521773568', refresh: '1', format: :json } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', refresh: '1', format: :json } assert !Pender::Store.current.read(name, :html), "#{name}.html should not exist" second_parsed_at = Time.parse(JSON.parse(@response.body)['data']['parsed_at']).to_i assert second_parsed_at > first_parsed_at end test "should not ask to refresh cache" do - skip("twitter api key is not currently working") authenticate_with_token - get :index, params: { url: 'https://twitter.com/caiosba/status/742779467521773568', refresh: '0', format: :json } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', refresh: '0', format: :json } first_parsed_at = Time.parse(JSON.parse(@response.body)['data']['parsed_at']).to_i sleep 1 - get :index, params: { url: 'https://twitter.com/caiosba/status/742779467521773568', format: :json } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', format: :json } second_parsed_at = Time.parse(JSON.parse(@response.body)['data']['parsed_at']).to_i assert_equal first_parsed_at, second_parsed_at end test "should ask to refresh cache with html format" do authenticate_with_token - url = 'https://twitter.com/gyenesnat/status/1220020473955635200' + url = 'https://meedan.com/post/annual-report-2022' get :index, params: { url: url, refresh: '1', format: :html } id = Media.get_id(url) first_parsed_at = Pender::Store.current.get(id, :html).last_modified @@ -54,7 +52,7 @@ def setup 
test "should not ask to refresh cache with html format" do authenticate_with_token - url = 'https://twitter.com/gyenesnat/status/1220020473955635200' + url = 'https://meedan.com/post/annual-report-2022' id = Media.get_id(url) get :index, params: { url: url, refresh: '0', format: :html } first_parsed_at = Pender::Store.current.get(id, :html).last_modified @@ -64,19 +62,6 @@ def setup assert_equal first_parsed_at, second_parsed_at end - test "should return error message on hash if twitter url does not exist" do - skip("twitter api key is not currently working") - authenticate_with_token - get :index, params: { url: 'https://twitter.com/caiosba32153623', format: :json } - assert_response 200 - data = JSON.parse(@response.body)['data'] - assert_match /Twitter::Error::NotFound: [0-9]+ User not found./, data['raw']['api']['error']['message'] - assert_equal Lapis::ErrorCodes::const_get('INVALID_VALUE'), data['raw']['api']['error']['code'] - assert_equal 'twitter', data['provider'] - assert_equal 'profile', data['type'] - assert_not_nil data['embed_tag'] - end - test "should return error message on hash if url does not exist" do authenticate_with_token get :index, params: { url: 'https://www.instagram.com/kjdahsjkdhasjdkhasjk/', format: :json } @@ -114,36 +99,6 @@ def setup assert_not_nil data['embed_tag'] end - test "should return error message on hash if twitter post url does not exist" do - skip("twitter api key is not currently working") - twitter_client, status, user = "" , "", "" - api={"error"=>{"message"=>"Twitter::Error::NotFound: 144 No status found with that ID.", "code"=>4}} - Media.any_instance.stubs(:twitter_client).returns(twitter_client) - twitter_client.stubs(:status).returns(status) - twitter_client.stubs(:user).returns(user) - user.stubs(:url).returns('') - status.stubs(:as_json).returns(api) - authenticate_with_token - get :index, params: { url: 'https://twitter.com/caiosba/status/0000000000000', format: :json } - assert_response 200 - data = 
JSON.parse(@response.body)['data'] - assert_match /Twitter::Error::NotFound: [0-9]+/, data['raw']['api']['error']['message'] - assert_equal Lapis::ErrorCodes::const_get('INVALID_VALUE'), data['raw']['api']['error']['code'] - assert_equal 'twitter', data['provider'] - assert_equal 'item', data['type'] - assert_not_nil data['embed_tag'] - end - - test "should parse facebook url when fb post url does not exist" do - authenticate_with_token - get :index, params: { url: 'https://www.facebook.com/ahlam.alialshamsi/posts/000000000000000', format: :json } - assert_response 200 - data = JSON.parse(@response.body)['data'] - assert_equal 'facebook', data['provider'] - assert_equal 'item', data['type'] - assert_not_nil data['embed_tag'] - end - test "should return error message on hash if as_json raises error" do Media.any_instance.stubs(:as_json).raises(RuntimeError) authenticate_with_token @@ -199,28 +154,21 @@ def setup end test "should render default HTML if not provided by oEmbed" do - get :index, params: { url: 'https://twitter.com/check', format: :html } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', format: :html } assert_response :success assert_match /pender-title/, response.body end - test "should create cache file" do - Media.any_instance.expects(:as_json).once.returns({}) - get :index, params: { url: 'http://twitter.com/caiosba', format: :html } - get :index, params: { url: 'http://twitter.com/caiosba', format: :html } - end - test "should return timeout error" do api_key = create_api_key application_settings: { config: { timeout: '0.001' }} authenticate_with_token(api_key) - get :index, params: { url: 'https://twitter.com/IronMaiden', format: :json } + get :index, params: { url: 'https://meedan.com/post/annual-report-2022', format: :json } assert_response 200 assert_equal 'Timeout', JSON.parse(@response.body)['data']['error']['message'] end test "should render custom HTML if provided by parser" do - skip("twitter api key is not 
currently working") get :index, params: { url: 'https://twitter.com/caiosba/status/742779467521773568', format: :html } assert_response :success assert_match /twitter-tweet/, response.body @@ -251,13 +199,12 @@ def setup end test "should clear cache for multiple URLs sent as array" do - skip("twitter api key is not currently working") authenticate_with_token url1 = 'https://meedan.com' - url2 = 'https://twitter.com/caiosba/status/742779467521773568' + url2 = 'https://meedan.com/post/annual-report-2022' normalized_url1 = 'https://meedan.com/' - normalized_url2 = 'https://twitter.com/caiosba/status/742779467521773568' + normalized_url2 = 'https://meedan.com/post/annual-report-2022' id1 = Media.get_id(normalized_url1) id2 = Media.get_id(normalized_url2) @@ -327,11 +274,11 @@ def setup end test "should redirect and remove unsupported parameters if format is HTML and URL is the only supported parameter provided" do - url = 'https://twitter.com/caiosba/status/923697122855096320' + url = 'https://meedan.com/post/annual-report-2022' get :index, params: { url: url, foo: 'bar', format: :html } assert_response 302 - assert_equal 'api/medias.html?url=https%3A%2F%2Ftwitter.com%2Fcaiosba%2Fstatus%2F923697122855096320', @response.redirect_url.split('/', 4).last + assert_equal 'api/medias.html?url=https%3A%2F%2Fmeedan.com%2Fpost%2Fannual-report-2022', @response.redirect_url.split('/', 4).last get :index, params: { url: url, foo: 'bar', format: :js } assert_response 200 @@ -353,7 +300,7 @@ def setup test "should return timeout error with minimal data if cannot parse url" do stub_configs({ 'timeout' => 0.1 }) do - url = 'https://changescamming.net/halalan-2019/maria-ressa-to-bong-go-um-attend-ka-ng-senatorial-debate-di-yung-nagtatapon-ka-ng-pera' + url = 'https://meedan.com/post/annual-report-2022' PenderSentry.stubs(:notify).never authenticate_with_token @@ -369,7 +316,6 @@ def setup end test "should not archive in any archiver when no archiver parameter is sent" do - skip("twitter 
api key is not currently working") Media.any_instance.unstub(:archive_to_archive_org) a = create_api_key application_settings: { 'webhook_url': 'https://example.com/webhook.php', 'webhook_token': 'test' } @@ -380,7 +326,7 @@ def setup WebMock.stub_request(:post, /example.com\/webhook/).to_return(status: 200, body: '') authenticate_with_token(a) - url = 'https://twitter.com/meedan/status/1095693211681673218' + url = 'https://meedan.com/post/annual-report-2022' get :index, params: { url: url, format: :json } id = Media.get_id(url) assert_equal({}, Pender::Store.current.read(id, :json)[:archives].sort.to_h) @@ -389,7 +335,6 @@ def setup end test "should not archive when archiver parameter is none" do - skip("twitter api key is not currently working") Media.any_instance.unstub(:archive_to_archive_org) a = create_api_key application_settings: { 'webhook_url': 'https://example.com/webhook.php', 'webhook_token': 'test' } WebMock.enable! @@ -399,7 +344,7 @@ def setup WebMock.stub_request(:post, /example.com\/webhook/).to_return(status: 200, body: '') authenticate_with_token(a) - url = 'https://twitter.com/meedan/status/1095035775736078341' + url = 'https://meedan.com/post/annual-report-2022' get :index, params: { url: url, archivers: 'none', format: :json } id = Media.get_id(url) assert_equal({}, Pender::Store.current.read(id, :json)[:archives]) @@ -424,7 +369,7 @@ def setup WebMock.stub_request(:post, /web.archive.org\/save/).to_return(body: {job_id: 'ebb13d31-7fcf-4dce-890c-c256e2823ca0' }.to_json) WebMock.stub_request(:get, /web.archive.org\/save\/status/).to_return(body: {status: 'success', timestamp: 'timestamp'}.to_json) - url = 'https://www.nytimes.com/section/world/europe' + url = 'https://meedan.com/post/annual-report-2022' archived = {"perma_cc"=>{"location"=>"http://perma.cc/perma-cc-guid-1"}, "archive_org"=>{"location"=>"https://web.archive.org/web/timestamp/#{url}"}} authenticate_with_token(a) @@ -447,8 +392,8 @@ def setup a = create_api_key 
application_settings: { 'webhook_url' => 'https://example.com/webhook.php', 'webhook_token' => 'test' } authenticate_with_token(a) - url1 = 'https://twitter.com/check/status/1102991340294557696' - url2 = 'https://twitter.com/dimalb/status/1102928768673423362' + url1 = 'https://meedan.com/post/annual-report-2022' + url2 = 'https://meedan.com' MediaParserWorker.stubs(:perform_async).with(url1, a.id, false, nil) MediaParserWorker.stubs(:perform_async).with(url2, a.id, false, nil).raises(RuntimeError) post :bulk, params: { url: [url1, url2], format: :json } @@ -524,7 +469,7 @@ def setup test "should return data with error message if can't parse" do webhook_info = { 'webhook_url': 'https://example.com/webhook.php', 'webhook_token': 'test' } - url = 'https://twitter.com/meedan/status/1102990605339316224' + url = 'https://meedan.com/post/annual-report-2022' parse_error = { error: { "message"=>"RuntimeError: RuntimeError", "code"=>5}} required_fields = Media.required_fields(OpenStruct.new(url: url)) Media.stubs(:required_fields).returns(required_fields) @@ -547,7 +492,7 @@ def setup assert_equal 0, MediaParserWorker.jobs.size - url = 'https://twitter.com/meedan/status/1102990605339316224' + url = 'https://meedan.com/post/annual-report-2022' post :bulk, params: { url: url, format: :json } assert_response :success @@ -586,24 +531,9 @@ def setup assert_match /v1$/, @response.headers['Accept'] end - test "should add data url when on embed title metatag" do - skip("twitter api key is not currently working") - authenticate_with_token - twitter_client, status, user = "" , "", "" - api = {"full_text"=>"@InternetFF Our Meedani @WafHeikal will be joining the amazing line of participants at #IFF, come say hi and get a free trail to our verification tool @check" } - Media.any_instance.stubs(:twitter_client).returns(twitter_client) - twitter_client.stubs(:status).returns(status) - twitter_client.stubs(:user).returns(user) - user.stubs(:url).returns('') - 
status.stubs(:as_json).returns(api) - get :index, params: { url: 'https://twitter.com/meedan/status/1110219801295765504', format: :html } - assert_response :success - assert_match(" @InternetFF Our Meedani @WafHeikal will be... ", response.body) - end - test "should rescue and unlock url when raises error" do authenticate_with_token - url = 'https://twitter.com/meedan/status/1118436001570086912' + url = 'https://meedan.com/post/annual-report-2022' assert !Semaphore.new(url).locked? [:js, :json, :html].each do |format| @controller.stubs("render_as_#{format}".to_sym).raises(RuntimeError.new('error')) @@ -617,7 +547,7 @@ def setup test "should rescue and unlock url when raises error on store" do authenticate_with_token - url = 'https://twitter.com/knowloitering/status/1140462371820826624' + url = 'https://meedan.com/post/annual-report-2022' assert !Semaphore.new(url).locked? Pender::Store.any_instance.stubs(:read).raises(RuntimeError.new('error')) [:js, :json, :html].each do |format| @@ -632,7 +562,7 @@ def setup end test "should unlock url after timeout" do - url = 'https://twitter.com/knowloitering/' + url = 'https://meedan.com/post/annual-report-2022' s = Semaphore.new(url) assert !s.locked? 
@@ -689,9 +619,8 @@ def setup end test "should cache json and html on file" do - skip("twitter api key is not currently working") authenticate_with_token - url = 'https://twitter.com/meedan/status/1132948729424691201' + url = 'https://meedan.com/post/annual-report-2022' id = Media.get_id(url) [:html, :json].each do |type| assert !Pender::Store.current.read(id, type), "#{id}.#{type} should not exist" @@ -710,18 +639,6 @@ def setup assert_match /fishermen/, JSON.parse(@response.body)['data']['title'].downcase end - test "should parse suspended Twitter profile" do - authenticate_with_token - - url = 'https://twitter.com/g9wuortn6sve9fn/status/940956917010259970' - get :index, params: { url: url, format: 'json' } - assert_response :success - - url = 'https://twitter.com/account/suspended' - get :index, params: { url: url, format: 'json' } - assert_response :success - end - test "should get config from api key if defined" do @controller.stubs(:unload_current_config) api_key = create_api_key application_settings: { config: { }} @@ -737,17 +654,6 @@ def setup assert_equal 'api_config_value', PenderConfig.get('key_for_test') end - test "should return API limit reached error" do - skip("twitter api key is not currently working") - Twitter::REST::Client.any_instance.stubs(:user).raises(Twitter::Error::TooManyRequests) - Twitter::Error::TooManyRequests.any_instance.stubs(:rate_limit).returns(OpenStruct.new(reset_in: 123)) - - authenticate_with_token - get :index, params: { url: 'http://twitter.com/meedan', format: :json } - assert_response 429 - assert_equal 123, JSON.parse(@response.body)['data']['message'] - end - test "should add url on title when timeout" do api_key = create_api_key application_settings: { config: { timeout: '0.001' }} authenticate_with_token(api_key) diff --git a/test/data/twitter-item-response-error.json b/test/data/twitter-item-response-error.json new file mode 100644 index 00000000..86132bcd --- /dev/null +++ 
b/test/data/twitter-item-response-error.json @@ -0,0 +1,13 @@ +{ + "errors": [ + { + "value": "1111111111111111111", + "detail": "Could not find tweet with ids: [1111111111111111111].", + "title": "Not Found Error", + "resource_type": "tweet", + "parameter": "ids", + "resource_id": "1111111111111111111", + "type": "https://api.twitter.com/2/problems/resource-not-found" + } + ] +} \ No newline at end of file diff --git a/test/data/twitter-item-response-success.json b/test/data/twitter-item-response-success.json new file mode 100644 index 00000000..000a472a --- /dev/null +++ b/test/data/twitter-item-response-success.json @@ -0,0 +1,36 @@ +{ + "data": [ + { + "id": "1111111111111111111", + "edit_history_tweet_ids": [ + "1111111111111111111" + ], + "attachments": { + "media_keys": [ + "3_1686748606936612864" + ] + }, + "author_id": "29472803", + "created_at": "2023-08-02T14:39:42.000Z", + "text": "Youths!\n\nWebb observed galaxy cluster El Gordo, a cosmic teen that existed 6.2 billion years after the big bang. 
The most massive cluster of its era, it’s a perfect gravitational magnifying glass, bending & distorting light from distant objects behind it: https://t.co/BrYH55h77F https://t.co/JK4XFxdUQX" + } + ], + "includes": { + "media": [ + { + "media_key": "3_1686748606936612864", + "type": "photo", + "url": "https://pbs.twimg.com/media/F2iI69XXgAAUo5Z.jpg" + } + ], + "users": [ + { + "username": "fake_user", + "url": "https://t.co/ZpTf8zeokA", + "name": "NASA Webb Telescope", + "profile_image_url": "https://pbs.twimg.com/profile_images/685182791496134658/Wmyak8D6_normal.jpg", + "id": "29472803" + } + ] + } +} \ No newline at end of file diff --git a/test/data/twitter-item-response.json b/test/data/twitter-item-response.json deleted file mode 100644 index 89003310..00000000 --- a/test/data/twitter-item-response.json +++ /dev/null @@ -1,107 +0,0 @@ -{ - "created_at": "Tue Jun 14 18:03:19 +0000 2016", - "id": 742779467521773568, - "id_str": "742779467521773568", - "full_text": "I'll be talking in @rubyconfbr this year! 
More details soon...", - "truncated": false, - "display_text_range": [ - 0, - 62 - ], - "entities": { - "hashtags": [], - "symbols": [], - "user_mentions": [ - { - "screen_name": "rubyconfbr", - "name": "RubyConf Brasil", - "id": 171149416, - "id_str": "171149416", - "indices": [ - 19, - 30 - ] - } - ], - "urls": [] - }, - "source": "Twitter Web Client", - "in_reply_to_status_id": null, - "in_reply_to_status_id_str": null, - "in_reply_to_user_id": null, - "in_reply_to_user_id_str": null, - "in_reply_to_screen_name": null, - "user": { - "id": 21530857, - "id_str": "21530857", - "name": "Caio Almeida", - "screen_name": "caiosba", - "location": "Salvador - Bahia - Brazil", - "description": "• Bachelor and Master on Computer Science (UFBA)\n • CTO at @meedan\n\n (opinions are mine) (ele/dele - he/him)", - "url": "https://t.co/4v1thkKsvE", - "entities": { - "url": { - "urls": [ - { - "url": "https://t.co/4v1thkKsvE", - "expanded_url": "https://ca.ios.ba", - "display_url": "ca.ios.ba", - "indices": [ - 0, - 23 - ] - } - ] - }, - "description": { - "urls": [] - } - }, - "protected": false, - "followers_count": 460, - "friends_count": 284, - "listed_count": 22, - "created_at": "Sun Feb 22 00:22:58 +0000 2009", - "favourites_count": 2399, - "utc_offset": null, - "time_zone": null, - "geo_enabled": true, - "verified": false, - "statuses_count": 2372, - "lang": null, - "contributors_enabled": false, - "is_translator": false, - "is_translation_enabled": false, - "profile_background_color": "000000", - "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png", - "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", - "profile_background_tile": true, - "profile_image_url": "http://pbs.twimg.com/profile_images/1217299193217388544/znpkNtDr_normal.jpg", - "profile_image_url_https": "https://pbs.twimg.com/profile_images/1217299193217388544/znpkNtDr_normal.jpg", - "profile_banner_url": 
"https://pbs.twimg.com/profile_banners/21530857/1478968136", - "profile_link_color": "064A12", - "profile_sidebar_border_color": "404345", - "profile_sidebar_fill_color": "E6E6E6", - "profile_text_color": "333333", - "profile_use_background_image": true, - "has_extended_profile": true, - "default_profile": false, - "default_profile_image": false, - "following": false, - "follow_request_sent": false, - "notifications": false, - "translator_type": "none", - "withheld_in_countries": [] - }, - "geo": null, - "coordinates": null, - "place": null, - "contributors": null, - "is_quote_status": false, - "retweet_count": 1, - "favorite_count": 1, - "favorited": false, - "retweeted": false, - "lang": "en", - "text": "I'll be talking in @rubyconfbr this year! More details soon..." -} diff --git a/test/data/twitter-profile-response-error.json b/test/data/twitter-profile-response-error.json new file mode 100644 index 00000000..41a92331 --- /dev/null +++ b/test/data/twitter-profile-response-error.json @@ -0,0 +1,13 @@ +{ + "errors": [ + { + "value": "fake_user_nonexistent", + "detail": "Could not find user with usernames: [fake_user_nonexistent].", + "title": "Not Found Error", + "resource_type": "user", + "parameter": "usernames", + "resource_id": "fake_user_nonexistent", + "type": "https://api.twitter.com/2/problems/resource-not-found" + } + ] +} \ No newline at end of file diff --git a/test/data/twitter-profile-response-success.json b/test/data/twitter-profile-response-success.json new file mode 100644 index 00000000..a254a938 --- /dev/null +++ b/test/data/twitter-profile-response-success.json @@ -0,0 +1,13 @@ +{ + "data": [ + { + "name": "Fake User", + "username": "fake_user", + "profile_image_url": "https://pbs.twimg.com/profile_images/685182791496134658/Wmyak8D6_normal.jpg", + "description": "The world's most powerful space telescope. Launched: Dec. 25, 2021. 
First images revealed: July 12, 2022.\n\nVerification: https://t.co/ChOEslj1j5", + "created_at": "2009-04-07T15:40:56.000Z", + "id": "11111111", + "url": "https://t.co/ZpTf8zeokA" + } + ] +} \ No newline at end of file diff --git a/test/data/twitter-profile-response.json b/test/data/twitter-profile-response.json deleted file mode 100644 index 9cba0327..00000000 --- a/test/data/twitter-profile-response.json +++ /dev/null @@ -1,113 +0,0 @@ -{ - "id": 15492359, - "id_str": "15492359", - "name": "TED Talks", - "screen_name": "TEDTalks", - "location": "New York, NY", - "profile_location": "", - "description": "TED is a nonprofit devoted to spreading ideas. 🔴 Help build a better future, become a TED Member today: https://t.co/NSSThTmhbv", - "url": "https://t.co/pfSl30mSY7", - "entities": { - "url": { - "urls": [ - { - "url": "https://t.co/pfSl30mSY7", - "expanded_url": "http://www.ted.com", - "display_url": "ted.com", - "indices": [ - 0, - 23 - ] - } - ] - }, - "description": { - "urls": [ - { - "url": "https://t.co/NSSThTmhbv", - "expanded_url": "http://t.ted.com/DRMjJmI", - "display_url": "t.ted.com/DRMjJmI", - "indices": [ - 105, - 128 - ] - } - ] - } - }, - "protected": false, - "followers_count": 11422446, - "friends_count": 690, - "listed_count": 52829, - "created_at": "Sat Jul 19 13:22:50 +0000 2008", - "favourites_count": 6889, - "utc_offset": "", - "time_zone": "", - "geo_enabled": false, - "verified": true, - "statuses_count": 42022, - "lang": "", - "status": { - "created_at": "Wed Aug 24 12:04:38 +0000 2022", - "id": 1562410550721916929, - "id_str": "1562410550721916929", - "text": "Whether you have a uterus or not — you should know this essential information: https://t.co/7FLTIrqjgD", - "truncated": false, - "entities": { - "hashtags": [], - "symbols": [], - "user_mentions": [], - "urls": [ - { - "url": "https://t.co/7FLTIrqjgD", - "expanded_url": "http://t.ted.com/76DGPfW", - "display_url": "t.ted.com/76DGPfW", - "indices": [ - 79, - 102 - ] - } - ] - }, - 
"source": "SocialFlow", - "in_reply_to_status_id": "", - "in_reply_to_status_id_str": "", - "in_reply_to_user_id": "", - "in_reply_to_user_id_str": "", - "in_reply_to_screen_name": "", - "geo": "", - "coordinates": "", - "place": "", - "contributors": "", - "is_quote_status": false, - "retweet_count": 26, - "favorite_count": 120, - "favorited": false, - "retweeted": false, - "possibly_sensitive": false, - "lang": "en" - }, - "contributors_enabled": false, - "is_translator": false, - "is_translation_enabled": false, - "profile_background_color": "000000", - "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png", - "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", - "profile_background_tile": false, - "profile_image_url": "http://pbs.twimg.com/profile_images/877631054525472768/Xp5FAPD5_normal.jpg", - "profile_image_url_https": "https://pbs.twimg.com/profile_images/877631054525472768/Xp5FAPD5_normal.jpg", - "profile_banner_url": "https://pbs.twimg.com/profile_banners/15492359/1657572511", - "profile_link_color": "FF2B06", - "profile_sidebar_border_color": "FFFFFF", - "profile_sidebar_fill_color": "E0E3DE", - "profile_text_color": "333333", - "profile_use_background_image": true, - "has_extended_profile": false, - "default_profile": false, - "default_profile_image": false, - "following": false, - "follow_request_sent": false, - "notifications": false, - "translator_type": "none", - "withheld_in_countries": [] -} diff --git a/test/integration/parsers/page_item_test.rb b/test/integration/parsers/page_item_test.rb new file mode 100644 index 00000000..d4b8c0d2 --- /dev/null +++ b/test/integration/parsers/page_item_test.rb @@ -0,0 +1,129 @@ +require 'test_helper' + +class PageItemIntegrationTest < ActiveSupport::TestCase + test "should parse a given site" do + m = create_media url: 'https://noticias.uol.com.br/' + data = m.as_json + assert_equal 'item', data['type'] + assert_equal 'page', data['provider'] 
+ assert_match /Acompanhe as últimas notícias do Brasil e do mundo/, data['title'] + assert_not_nil data['description'] + assert_not_nil data['published_at'] + assert_equal '', data['username'] + assert_equal 'https://noticias.uol.com.br', data['author_url'] + assert_equal 'UOLNoticias @UOL', data['author_name'] + assert_not_nil data['picture'] + assert_nil data['error'] + end + + test "should parse arabic url page" do + url = 'http://www.youm7.com/story/2016/7/6/بالصور-مياه-الشرب-بالإسماعيلية-تواصل-عملها-لحل-مشكلة-طفح-الصرف/2790125' + id = Media.get_id url + m = create_media url: url + data = m.as_json + assert !data['title'].blank? + assert_not_nil data['published_at'] + assert_equal '', data['username'] + end + + test "should parse url with arabic or already encoded chars" do + urls = [ + 'https://www.aljazeera.net/news/2023/2/9/الشرطة-السويدية-ترفض-منح-إذن-لحرق', + 'https://www.aljazeera.net/news/2023/2/9/%D8%A7%D9%84%D8%B4%D8%B1%D8%B7%D8%A9-%D8%A7%D9%84%D8%B3%D9%88%D9%8A%D8%AF%D9%8A%D8%A9-%D8%AA%D8%B1%D9%81%D8%B6-%D9%85%D9%86%D8%AD-%D8%A5%D8%B0%D9%86-%D9%84%D8%AD%D8%B1%D9%82' + ] + urls.each do |url| + m = create_media url: url + data = m.as_json + assert_equal 'الشرطة السويدية ترفض منح إذن جديد لحرق المصحف الشريف أمام السفارة التركية.. فما السبب؟', data['title'] + assert_equal 'رفضت الشرطة السويدية منح إذن لحرق المصحف الشريف أمام السفارة التركية، قائلة إن ذلك من شأنه “إثارة اضطرابات خطيرة للأمن القومي”.', data['description'] + assert_equal '', data['published_at'] + assert_equal '', data['username'] + assert_match /^https?:\/\/www\.aljazeera\.net$/, data['author_url'] + assert_nil data['error'] + assert_not_nil data['picture'] + end + end + + test "should store metatags in an Array" do + m = create_media url: 'https://www.nytimes.com/2017/06/14/us/politics/mueller-trump-special-counsel-investigation.html' + data = m.as_json + assert data['raw']['metatags'].is_a? Array + assert !data['raw']['metatags'].empty? 
+ end + + test "should handle exception when raises some error when getting oembed data" do + url = 'https://www.hongkongfp.com/2017/03/01/hearing-begins-in-govt-legal-challenge-against-4-rebel-hong-kong-lawmakers/' + m = create_media url: url + OembedItem.any_instance.stubs(:get_oembed_data_from_url).raises(StandardError) + data = m.as_json + assert_equal 'item', data['type'] + assert_equal 'page', data['provider'] + assert_match(/Hong Kong Free Press/, data['title']) + assert_match(/Hong Kong/, data['description']) + assert_not_nil data['published_at'] + assert_match /https:\/\/.+AFP/, data['author_url'] + assert_not_nil data['picture'] + assert_match(/StandardError/, data['error']['message']) + end + + test "should parse pages when the scheme is missing on oembed url" do + url = 'https://www.hongkongfp.com/2017/03/01/hearing-begins-in-govt-legal-challenge-against-4-rebel-hong-kong-lawmakers/' + m = create_media url: url + Parser::PageItem.any_instance.stubs(:oembed_url).returns('//www.hongkongfp.com/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fwww.hongkongfp.com%2F2017%2F03%2F01%2Fhearing-begins-in-govt-legal-challenge-against-4-rebel-hong-kong-lawmakers%2F') + data = m.as_json + assert_equal 'item', data['type'] + assert_equal 'page', data['provider'] + assert_match(/Hong Kong Free Press/, data['title']) + assert_match(/Hong Kong/, data['description']) + assert_not_nil data['published_at'] + assert_match /https:\/\/.+AFP/, data['author_url'] + assert_not_nil data['picture'] + assert_nil data['error'] + end + + test "should parse url scheme http" do + url = 'http://www.theatlantic.com/magazine/archive/2016/11/war-goes-viral/501125/' + m = create_media url: url + data = m.as_json + assert_match 'War Goes Viral', data['title'] + assert_match 'How social media is being weaponized across the world', data['description'] + assert !data['published_at'].blank? 
+ assert_match /Brooking.+Singer/, data['username'] + assert_match /https?:\/\/www.theatlantic.com/, data['author_url'] + assert_not_nil data['picture'] + end + + test "should parse url scheme https" do + url = 'https://www.theguardian.com/politics/2016/oct/19/larry-sanders-on-brother-bernie-and-why-tony-blair-was-destructive' + m = create_media url: url + data = m.as_json + assert_match 'Larry Sanders on brother Bernie and why Tony Blair was ‘destructive’', data['title'] + assert_match /The Green party candidate, who is fighting the byelection in David Cameron’s old seat/, data['description'] + assert_match /2016-10/, data['published_at'] + assert_match '@zoesqwilliams', data['username'] + assert_match 'https://twitter.com/zoesqwilliams', data['author_url'] + assert !data['picture'].blank? + end + + test "should parse urls without utf encoding" do + urls = [ + 'https://www.yallakora.com/epl/2545/News/350853/مصدر-ليلا-كورة-ليفربول-حذر-صلاح-وزملاءه-من-جماهير-فيديو-السيارة', + 'https://www.yallakora.com/epl/2545/News/350853/%D9%85%D8%B5%D8%AF%D8%B1-%D9%84%D9%8A%D9%84%D8%A7-%D9%83%D9%88%D8%B1%D8%A9-%D9%84%D9%8A%D9%81%D8%B1%D8%A8%D9%88%D9%84-%D8%AD%D8%B0%D8%B1-%D8%B5%D9%84%D8%A7%D8%AD-%D9%88%D8%B2%D9%85%D9%84%D8%A7%D8%A1%D9%87-%D9%85%D9%86-%D8%AC%D9%85%D8%A7%D9%87%D9%8A%D8%B1-%D9%81%D9%8A%D8%AF%D9%8A%D9%88-%D8%A7%D9%84%D8%B3%D9%8A%D8%A7%D8%B1%D8%A9', + 'https://www.yallakora.com//News/350853/%25D9%2585%25D8%25B5%25D8%25AF%25D8%25B1-%25D9%2584%25D9%258A%25D9%2584%25D8%25A7-%25D9%2583%25D9%2588%25D8%25B1%25D8%25A9-%25D9%2584%25D9%258A%25D9%2581%25D8%25B1%25D8%25A8%25D9%2588%25D9%2584-%25D8%25AD%25D8%25B0%25D8%25B1-%25D8%25B5%25D9%2584%25D8%25A7%25D8%25AD-%25D9%2588%25D8%25B2%25D9%2585%25D9%2584%25D8%25A7%25D8%25A1%25D9%2587-%25D9%2585%25D9%2586-%25D8%25AC%25D9%2585%25D8%25A7%25D9%2587%25D9%258A%25D8%25B1-%25D9%2581%25D9%258A%25D8%25AF%25D9%258A%25D9%2588-%25D8%25A7%25D9%2584%25D8%25B3%25D9%258A%25D8%25A7%25D8%25B1%25D8%25A9-' + ] + urls.each do |url| + m = create_media 
url: url + data = m.as_json + assert data['error'].nil? + end + end + + test "should use original url when redirected page requires cookie" do + RequestHelper.stubs(:get_html).returns(Nokogiri::HTML("")) + url = 'https://doi.org/10.1080/10584609.2019.1619639' + m = create_media url: url + data = m.as_json + assert_equal url, data['url'] + assert_nil data['error'] + end +end diff --git a/test/integration/parsers/twitter_item_test.rb b/test/integration/parsers/twitter_item_test.rb new file mode 100644 index 00000000..96a016e6 --- /dev/null +++ b/test/integration/parsers/twitter_item_test.rb @@ -0,0 +1,25 @@ +require 'test_helper' + +class TwitterItemIntegrationTest < ActiveSupport::TestCase + test "should parse tweet from a successful link" do + m = create_media url: 'https://twitter.com/caiosba/status/742779467521773568' + data = m.as_json + assert_match 'I\'ll be talking in @rubyconfbr this year! More details soon...', data['title'] + assert_match 'Caio Almeida', data['author_name'] + assert_match '@caiosba', data['username'] + assert_match '', data['picture'] + assert_not_nil data['author_picture'] + end + + test "should return data even if the twitter item does not exist" do + m = create_media url: 'https://twitter.com/caiosba/status/1111111111111111111' + data = m.as_json + assert_match 'https://twitter.com/caiosba/status/1111111111111111111', data['title'] + assert_match 'caiosba', data['author_name'] + assert_match '@caiosba', data['username'] + assert_not_nil data['author_picture'] + assert_match /Could not find/, data['error'][0]['detail'] + assert_match /Not Found Error/, data['error'][0]['title'] + end +end + diff --git a/test/integration/parsers/twitter_profile_test.rb b/test/integration/parsers/twitter_profile_test.rb new file mode 100644 index 00000000..ae173034 --- /dev/null +++ b/test/integration/parsers/twitter_profile_test.rb @@ -0,0 +1,29 @@ +require 'test_helper' + +class TwitterProfileIntegrationTest < ActiveSupport::TestCase + test "should
parse shortened URL" do + m = create_media url: 'http://bit.ly/23qFxCn' + data = m.as_json + assert_equal 'https://twitter.com/caiosba', data['url'] + assert_not_nil data['title'] + assert_match '@caiosba', data['username'] + assert_equal 'twitter', data['provider'] + assert_not_nil data['description'] + assert_not_nil data['picture'] + assert_not_nil data['published_at'] + end + + test "should return data even if the twitter profile does not exist" do + m = create_media url: 'https://twitter.com/dlihfbfyhugsrb' + data = m.as_json + assert_equal 'https://twitter.com/dlihfbfyhugsrb', data['url'] + assert_equal 'dlihfbfyhugsrb', data['title'] + assert_match '@dlihfbfyhugsrb', data['username'] + assert_equal 'twitter', data['provider'] + assert_not_nil data['description'] + assert_not_nil data['picture'] + assert_not_nil data['published_at'] + assert_match /Could not find user/, data['error'][0]['detail'] + assert_match /Not Found Error/, data['error'][0]['title'] + end +end diff --git a/test/models/parser/page_item_test.rb b/test/models/parser/page_item_test.rb index eb90e72c..e8e5e10e 100644 --- a/test/models/parser/page_item_test.rb +++ b/test/models/parser/page_item_test.rb @@ -1,148 +1,10 @@ require 'test_helper' -class PageItemIntegrationTest < ActiveSupport::TestCase - test "should parse a given site" do - m = create_media url: 'https://noticias.uol.com.br/' - data = m.as_json - assert_equal 'item', data['type'] - assert_equal 'page', data['provider'] - assert_match /Acompanhe as últimas notícias do Brasil e do mundo/, data['title'] - assert_not_nil data['description'] - assert_not_nil data['published_at'] - assert_equal '', data['username'] - assert_equal 'https://noticias.uol.com.br', data['author_url'] - assert_equal 'UOLNoticias @UOL', data['author_name'] - assert_not_nil data['picture'] - assert_nil data['error'] - end - - test "should parse arabic url page" do - url =
'http://www.youm7.com/story/2016/7/6/بالصور-مياه-الشرب-بالإسماعيلية-تواصل-عملها-لحل-مشكلة-طفح-الصرف/2790125' - id = Media.get_id url - m = create_media url: url - data = m.as_json - assert !data['title'].blank? - assert_not_nil data['published_at'] - assert_equal '', data['username'] - end - - test "should parse url with arabic or already encoded chars" do - urls = [ - 'https://www.aljazeera.net/news/2023/2/9/الشرطة-السويدية-ترفض-منح-إذن-لحرق', - 'https://www.aljazeera.net/news/2023/2/9/%D8%A7%D9%84%D8%B4%D8%B1%D8%B7%D8%A9-%D8%A7%D9%84%D8%B3%D9%88%D9%8A%D8%AF%D9%8A%D8%A9-%D8%AA%D8%B1%D9%81%D8%B6-%D9%85%D9%86%D8%AD-%D8%A5%D8%B0%D9%86-%D9%84%D8%AD%D8%B1%D9%82' - ] - urls.each do |url| - m = create_media url: url - data = m.as_json - assert_equal 'الشرطة السويدية ترفض منح إذن جديد لحرق المصحف الشريف أمام السفارة التركية.. فما السبب؟', data['title'] - assert_equal 'رفضت الشرطة السويدية منح إذن لحرق المصحف الشريف أمام السفارة التركية، قائلة إن ذلك من شأنه “إثارة اضطرابات خطيرة للأمن القومي”.', data['description'] - assert_equal '', data['published_at'] - assert_equal '', data['username'] - assert_match /^https?:\/\/www\.aljazeera\.net$/, data['author_url'] - assert_nil data['error'] - assert_not_nil data['picture'] - end - end - - test "should store metatags in an Array" do - m = create_media url: 'https://www.nytimes.com/2017/06/14/us/politics/mueller-trump-special-counsel-investigation.html' - data = m.as_json - assert data['raw']['metatags'].is_a? Array - assert !data['raw']['metatags'].empty? 
- end - - test "should handle exception when raises some error when getting oembed data" do - url = 'https://www.hongkongfp.com/2017/03/01/hearing-begins-in-govt-legal-challenge-against-4-rebel-hong-kong-lawmakers/' - m = create_media url: url - OembedItem.any_instance.stubs(:get_oembed_data_from_url).raises(StandardError) - data = m.as_json - assert_equal 'item', data['type'] - assert_equal 'page', data['provider'] - assert_match(/Hong Kong Free Press/, data['title']) - assert_match(/Hong Kong/, data['description']) - assert_not_nil data['published_at'] - assert_match /https:\/\/.+AFP/, data['author_url'] - assert_not_nil data['picture'] - assert_match(/StandardError/, data['error']['message']) - end - - test "should parse pages when the scheme is missing on oembed url" do - url = 'https://www.hongkongfp.com/2017/03/01/hearing-begins-in-govt-legal-challenge-against-4-rebel-hong-kong-lawmakers/' - m = create_media url: url - Parser::PageItem.any_instance.stubs(:oembed_url).returns('//www.hongkongfp.com/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fwww.hongkongfp.com%2F2017%2F03%2F01%2Fhearing-begins-in-govt-legal-challenge-against-4-rebel-hong-kong-lawmakers%2F') - data = m.as_json - assert_equal 'item', data['type'] - assert_equal 'page', data['provider'] - assert_match(/Hong Kong Free Press/, data['title']) - assert_match(/Hong Kong/, data['description']) - assert_not_nil data['published_at'] - assert_match /https:\/\/.+AFP/, data['author_url'] - assert_not_nil data['picture'] - assert_nil data['error'] - end - - test "should parse url scheme http" do - url = 'http://www.theatlantic.com/magazine/archive/2016/11/war-goes-viral/501125/' - m = create_media url: url - data = m.as_json - assert_match 'War Goes Viral', data['title'] - assert_match 'How social media is being weaponized across the world', data['description'] - assert !data['published_at'].blank? 
- assert_match /Brooking.+Singer/, data['username'] - assert_match /https?:\/\/www.theatlantic.com/, data['author_url'] - assert_not_nil data['picture'] - end - - test "should parse url scheme https" do - skip("twitter api key is not currently working") - url = 'https://www.theguardian.com/politics/2016/oct/19/larry-sanders-on-brother-bernie-and-why-tony-blair-was-destructive' - m = create_media url: url - data = m.as_json - assert_match 'Larry Sanders on brother Bernie and why Tony Blair was ‘destructive’', data['title'] - assert_match /The Green party candidate, who is fighting the byelection in David Cameron’s old seat/, data['description'] - assert_match /2016-10/, data['published_at'] - assert_match '@zoesqwilliams', data['username'] - assert_match 'https://twitter.com/zoesqwilliams', data['author_url'] - assert !data['picture'].blank? - end - - test "should parse urls without utf encoding" do - urls = [ - 'https://www.yallakora.com/epl/2545/News/350853/مصدر-ليلا-كورة-ليفربول-حذر-صلاح-وزملاءه-من-جماهير-فيديو-السيارة', - 'https://www.yallakora.com/epl/2545/News/350853/%D9%85%D8%B5%D8%AF%D8%B1-%D9%84%D9%8A%D9%84%D8%A7-%D9%83%D9%88%D8%B1%D8%A9-%D9%84%D9%8A%D9%81%D8%B1%D8%A8%D9%88%D9%84-%D8%AD%D8%B0%D8%B1-%D8%B5%D9%84%D8%A7%D8%AD-%D9%88%D8%B2%D9%85%D9%84%D8%A7%D8%A1%D9%87-%D9%85%D9%86-%D8%AC%D9%85%D8%A7%D9%87%D9%8A%D8%B1-%D9%81%D9%8A%D8%AF%D9%8A%D9%88-%D8%A7%D9%84%D8%B3%D9%8A%D8%A7%D8%B1%D8%A9', - 
'https://www.yallakora.com//News/350853/%25D9%2585%25D8%25B5%25D8%25AF%25D8%25B1-%25D9%2584%25D9%258A%25D9%2584%25D8%25A7-%25D9%2583%25D9%2588%25D8%25B1%25D8%25A9-%25D9%2584%25D9%258A%25D9%2581%25D8%25B1%25D8%25A8%25D9%2588%25D9%2584-%25D8%25AD%25D8%25B0%25D8%25B1-%25D8%25B5%25D9%2584%25D8%25A7%25D8%25AD-%25D9%2588%25D8%25B2%25D9%2585%25D9%2584%25D8%25A7%25D8%25A1%25D9%2587-%25D9%2585%25D9%2586-%25D8%25AC%25D9%2585%25D8%25A7%25D9%2587%25D9%258A%25D8%25B1-%25D9%2581%25D9%258A%25D8%25AF%25D9%258A%25D9%2588-%25D8%25A7%25D9%2584%25D8%25B3%25D9%258A%25D8%25A7%25D8%25B1%25D8%25A9-' - ] - urls.each do |url| - m = create_media url: url - data = m.as_json - assert data['error'].nil? - end - end - - test "should use original url when redirected page requires cookie" do - RequestHelper.stubs(:get_html).returns(Nokogiri::HTML("")) - url = 'https://doi.org/10.1080/10584609.2019.1619639' - m = create_media url: url - data = m.as_json - assert_equal url, data['url'] - assert_nil data['error'] - end - - test "should handle error when cannot get twitter url" do - Parser::PageItem.stubs(:twitter_client).raises(Twitter::Error::Forbidden) - m = create_media url: 'http://example.com' - data = m.as_json - assert data['error'].nil? - Parser::PageItem.unstub(:twitter_client) - end -end - class PageItemUnitTest < ActiveSupport::TestCase def setup isolated_setup WebMock.stub_request(:post, /safebrowsing.googleapis.com/).to_return(status: 200, body: { matches: [] }.to_json ) OembedItem.any_instance.stubs(:get_data).returns({}) - Twitter::REST::Client.any_instance.stubs(:user) end def teardown @@ -163,13 +25,6 @@ def throwaway_url 'https://example.com/throwaway' end - def fake_twitter_user - return @fake_twitter_user unless @fake_twitter_user.blank? 
- # https://github.com/sferik/twitter/blob/master/lib/twitter/user.rb - api_response = response_fixture_from_file('twitter-profile-response.json', parse_as: :json) - @fake_twitter_user = Twitter::User.new(api_response.with_indifferent_access) - end - test "returns provider and type" do assert_equal Parser::PageItem.type, 'page_item' end @@ -396,13 +251,11 @@ def fake_twitter_user assert_equal 'https://piglet.com/image.png', data['picture'] end - test "sets author name as author_name, username, and then title" do - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - + test "sets author name as author_name, username, and then title from Twitter metadata" do doc = Nokogiri::HTML(<<~HTML) - ' - ' - ' + ' + ' + ' HTML data = Parser::PageItem.new('https://example.com').parse_data(doc, throwaway_url) assert_equal "Piglet McDog", data['author_name'] @@ -463,8 +316,6 @@ def fake_twitter_user with(body: /example.com\/unsafeurl/). to_return(status: 200, body: { matches: ['fake match'] }.to_json ) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - # author_url doc = Nokogiri::HTML(<<~HTML) @@ -491,21 +342,26 @@ def fake_twitter_user end test "uses twitter URL from twitter metadata for author_url (and not username) if valid" do - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - doc = Nokogiri::HTML(<<~HTML) HTML data = Parser::PageItem.new('http://example.com').parse_data(doc, throwaway_url) - assert_equal 'https://twitter.com/TEDTalks', data['author_url'] + + assert_equal 'https://twitter.com/fakeaccount', data['author_url'] assert_equal '@fakeaccount', data['username'] end + + test "shouldn't error when cannot get twitter author url" do + Parser::PageItem.stubs(:twitter_author_url).returns(nil) + + data = Parser::PageItem.new('https://random-page.com/page-item').parse_data(empty_doc) + + assert data['error'].nil? 
+ end test "does not set author_url from twitter metadata if a default username, instead defaults to top URL" do - api_response = api_response = response_fixture_from_file('twitter-profile-response.json', parse_as: :json) - fake_twitter_user = Twitter::User.new(api_response.with_indifferent_access) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) + api_response = response_fixture_from_file('twitter-profile-response-success.json', parse_as: :json) doc = Nokogiri::HTML(<<~HTML) diff --git a/test/models/parser/twitter_item_test.rb b/test/models/parser/twitter_item_test.rb index 1d362b3e..57e62f4f 100644 --- a/test/models/parser/twitter_item_test.rb +++ b/test/models/parser/twitter_item_test.rb @@ -1,66 +1,5 @@ require 'test_helper' -class TwitterItemIntegrationTest < ActiveSupport::TestCase - test "should parse tweet" do - skip("twitter api key is not currently working") - m = create_media url: 'https://twitter.com/caiosba/status/742779467521773568' - data = m.as_json - assert_match 'I\'ll be talking in @rubyconfbr this year! More details soon...', data['title'] - assert_match 'Caio Almeida', data['author_name'] - assert_match '@caiosba', data['username'] - assert_nil data['picture'] - assert_not_nil data['author_picture'] - end - - test "should parse valid link with spaces" do - skip("twitter api key is not currently working") - m = create_media url: ' https://twitter.com/caiosba/status/742779467521773568 ' - data = m.as_json - assert_match 'I\'ll be talking in @rubyconfbr this year! 
More details soon...', data['title'] - assert_match 'Caio Almeida', data['author_name'] - assert_match '@caiosba', data['username'] - assert_nil data['picture'] - assert_not_nil data['author_picture'] - end - - test "should fill in html when html parsing fails but API works" do - skip("twitter api key is not currently working") - url = 'https://twitter.com/codinghorror/status/1276934067015974912' - OpenURI.stubs(:open_uri).raises(OpenURI::HTTPError.new('','429 Too Many Requests')) - m = create_media url: url - data = m.as_json - assert_match /twitter-tweet.*#{url}/, data[:html] - end - - test "should not parse a twitter post when passing the twitter api key or subkey missing" do - key = create_api_key application_settings: { config: { twitter_consumer_key: 'consumer_key', twitter_consumer_secret: '' } } - m = create_media url: 'https://twitter.com/cal_fire/status/919029734847025152', key: key - assert_equal 'consumer_key', PenderConfig.get(:twitter_consumer_key) - assert_equal '', PenderConfig.get(:twitter_consumer_secret) - data = m.as_json - assert_equal m.url, data['title'] - assert_match "Twitter::Error::Unauthorized", data['raw']['api']['error']['message'] - PenderConfig.current = nil - - key = create_api_key application_settings: { config: { twitter_consumer_key: '' } } - m = create_media url: 'https://twitter.com/cal_fire/status/919029734847025152' , key: key - assert_equal '', PenderConfig.get(:twitter_consumer_key) - data = m.as_json - assert_equal m.url, data['title'] - assert_match "Twitter::Error::BadRequest", data['raw']['api']['error']['message'] - end - - test "should store oembed data of a twitter profile" do - skip("twitter api key is not currently working") - m = create_media url: 'https://twitter.com/meedan' - data = m.as_json - - assert data['raw']['oembed'].is_a? 
Hash - assert_equal "https:\/\/twitter.com", data['raw']['oembed']['provider_url'] - assert_equal "Twitter", data['raw']['oembed']['provider_name'] - end -end - class TwitterItemUnitTest < ActiveSupport::TestCase def setup isolated_setup @@ -70,22 +9,32 @@ def teardown isolated_teardown end - def fake_twitter_user - return @fake_twitter_user unless @fake_twitter_user.blank? - # https://github.com/sferik/twitter/blob/master/lib/twitter/user.rb - api_response = response_fixture_from_file('twitter-profile-response.json', parse_as: :json) - @fake_twitter_user = Twitter::User.new(api_response.with_indifferent_access) + def empty_doc + Nokogiri::HTML('') end - def fake_tweet - return @fake_tweet unless @fake_tweet.blank? - # https://github.com/sferik/twitter/blob/master/lib/twitter/tweet.rb - api_response = response_fixture_from_file('twitter-item-response.json', parse_as: :json) - @fake_tweet = Twitter::Tweet.new(api_response.with_indifferent_access) + def query + params = { + "ids": "1111111111111111111", + "tweet.fields": "author_id,created_at,text", + "expansions": "author_id,attachments.media_keys", + "user.fields": "profile_image_url,username,url", + "media.fields": "url", + } + Rack::Utils.build_query(params) end - def empty_doc - Nokogiri::HTML('') + def twitter_item_response_success + JSON.parse(response_fixture_from_file('twitter-item-response-success.json')) + end + + def twitter_item_response_error + JSON.parse(response_fixture_from_file('twitter-item-response-error.json')) + end + + def stub_tweet_lookup + Parser::TwitterItem.any_instance.stubs(:tweet_lookup) + .with('1111111111111111111') end test "returns provider and type" do @@ -122,161 +71,123 @@ def empty_doc assert_equal true, match_seven.is_a?(Parser::TwitterItem) end - test "assigns values to hash from the API response" do - Twitter::REST::Client.any_instance.stubs(:status).returns(fake_tweet) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - - data = 
Parser::TwitterItem.new('https://twitter.com/fakeaccount/status/123456789').parse_data(empty_doc) - - assert_equal '123456789', data['external_id'] - assert_equal '@fakeaccount', data['username'] - assert_match /I'll be talking in @rubyconfbr this year!/, data['title'] - assert_match /I'll be talking in @rubyconfbr this year!/, data['description'] - assert_nil data['picture'] - assert_match /pbs.twimg.com\/profile_images\/1217299193217388544\/znpkNtDr.jpg/, data['author_picture'] - assert_match //, data['html'] - assert_match 'Caio Almeida', data['author_name'] - assert_match /twitter.com\/TEDTalks/, data['author_url'] - assert_not_nil data['published_at'] - - assert_nil data['error'] - end - - test "should store data of post returned by twitter API" do - Twitter::REST::Client.any_instance.stubs(:status).returns(fake_tweet) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - - data = Parser::TwitterItem.new('https://twitter.com/fakeaccount/status/123456789').parse_data(empty_doc) - - assert data['raw']['api'].is_a? Hash - assert !data['raw']['api'].empty? - end + test "it makes a get request to the tweet lookup endpoint successfully" do + stub_configs({'twitter_bearer_token' => 'test' }) + + WebMock.stub_request(:get, "https://api.twitter.com/2/tweets") + .with(query: query) + .with(headers: { "Authorization": "Bearer test" }) + .to_return(status: 200, body: response_fixture_from_file('twitter-item-response-success.json')) - # I'm not confident this is testing anything about HTML decoding as written - test "should decode html entities" do - tweet = Twitter::Tweet.new( - id: "123", - text: " [update] between Calistoga and Santa Rosa (Napa & Sonoma County) is now 35,270 acres and 44% contained. 
" - ) - Twitter::REST::Client.any_instance.stubs(:status).returns(tweet) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - - data = Parser::TwitterItem.new('https://twitter.com/fakeaccount/status/123456789').parse_data(empty_doc) - assert_no_match /&/, data['title'] + data = Parser::TwitterItem.new('https://m.twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) + + assert_equal '1111111111111111111', data['external_id'] + assert_equal '@fake_user', data['username'] + assert_not_nil data['picture'] end - test "should throw Pender::Exception::ApiLimitReached when Twitter::Error::TooManyRequests is thrown when parsing tweet" do - Twitter::REST::Client.any_instance.stubs(:status).raises(Twitter::Error::TooManyRequests) - - assert_raises Pender::Exception::ApiLimitReached do - Parser::TwitterItem.new('https://twitter.com/fake-account/status/123456789').parse_data(empty_doc) - end - end + test "it makes a get request to the tweet lookup endpoint, endpoint and notifies sentry when 404 status is returned" do + stub_configs({'twitter_bearer_token' => 'test' }) - test "logs error resulting from non-ratelimit tweet lookup, and return default values with html blank" do - Twitter::REST::Client.any_instance.stubs(:status).raises(Twitter::Error::NotFound) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) + WebMock.stub_request(:get, "https://api.twitter.com/2/tweets") + .with(query: query) + .with(headers: { "Authorization": "Bearer test" }) + .to_return(status: 404, body: response_fixture_from_file('twitter-item-response-error.json')) - data = {} sentry_call_count = 0 arguments_checker = Proc.new do |e| sentry_call_count += 1 - assert_equal Twitter::Error::NotFound, e.class end - + PenderSentry.stub(:notify, arguments_checker) do - data = Parser::TwitterItem.new('https://twitter.com/fake-account/status/123456789').parse_data(empty_doc) + data = 
Parser::TwitterItem.new('https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) assert_equal 1, sentry_call_count - end - assert_match /Twitter::Error::NotFound/, data['error']['message'] - assert_equal "123456789", data['external_id'] - assert_equal "@fake-account", data['username'] - assert data['html'].empty? + assert_not_nil data['error'] + assert_match /404/, data['error'][0]['title'] + assert_match /Not Found Error/, data['error'][0]['detail'] + end end + + test "it makes a get request to the tweet lookup endpoint, notifies sentry when timeout occurs" do + stub_configs({'twitter_bearer_token' => 'test' }) - # This swallows rate limiting errors, which we're surfacing in a different - # exception catching block in the same class. It also doesn't surface errors. - # We may want to reconsider both of these things for consistency. - test "logs error resulting from looking up user information, and returns tweet info" do - Twitter::REST::Client.any_instance.stubs(:status).returns(fake_tweet) - Twitter::REST::Client.any_instance.stubs(:user).raises(Twitter::Error) + WebMock.stub_request(:get, "https://api.twitter.com/2/tweets") + .with(query: query) + .with(headers: { "Authorization": "Bearer test" }) + .to_raise(Errno::EHOSTUNREACH) - data = {} sentry_call_count = 0 arguments_checker = Proc.new do |e| sentry_call_count += 1 - assert_equal Twitter::Error, e.class end - + PenderSentry.stub(:notify, arguments_checker) do - data = Parser::TwitterItem.new('https://twitter.com/fakeaccount/status/123456789').parse_data(empty_doc) + data = Parser::TwitterItem.new('https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) assert_equal 1, sentry_call_count - end - assert_nil data['error'] - assert_equal "123456789", data['external_id'] - assert_equal "@fakeaccount", data['username'] - assert_match /I'll be talking in @rubyconfbr this year!/, data['title'] + assert_not_nil data['error'] + assert_match /No route 
to host/, data['error'][0]['title'] + assert_nil data['error'][0]['detail'] + end end - # This is current behavior, but I wonder if we might want something like https://twitter.com/fakeaccount - test "falls back to top_url when user information can't be retrieved" do - Twitter::REST::Client.any_instance.stubs(:status).returns(fake_tweet) - Twitter::REST::Client.any_instance.stubs(:user).raises(Twitter::Error) + test "sets the author_url to be the twitter url even if an error is returned" do + stub_tweet_lookup.returns(twitter_item_response_error) - data = Parser::TwitterItem.new('https://twitter.com/fakeaccount/status/123456789').parse_data(empty_doc) - assert_nil data['error'] - assert_equal 'https://twitter.com', data['author_url'] + data = Parser::TwitterItem.new('https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) + + assert_not_nil data['error'] + assert_equal 'https://twitter.com/fake_user', data['author_url'] end + test "should store data of post returned by twitter API" do + stub_tweet_lookup.returns(twitter_item_response_success) + + data = Parser::TwitterItem.new('https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) + + assert data['raw']['api'].is_a? Hash + assert !data['raw']['api'].empty? 
+ end + test "should remove line breaks from Twitter item title" do - tweet = Twitter::Tweet.new( - id: '123', - text: "LA Times- USC Dornsife Sunday Poll: \n Donald Trump Retains 2 Point \n Lead Over Hillary" - ) - Twitter::REST::Client.any_instance.stubs(:status).returns(tweet) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - - data = Parser::TwitterItem.new('https://twitter.com/fake-account/status/123456789').parse_data(empty_doc) - assert_match 'LA Times- USC Dornsife Sunday Poll: Donald Trump Retains 2 Point Lead Over Hillary', data['title'] + stub_tweet_lookup.returns(twitter_item_response_success) + + data = Parser::TwitterItem.new('https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) + + assert_match 'Youths! Webb observed galaxy cluster El Gordo', data['title'] end test "should parse tweet url with special chars, and strip them" do - Twitter::REST::Client.any_instance.stubs(:status).returns(fake_tweet) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) + stub_tweet_lookup.returns(twitter_item_response_success) - parser = Parser::TwitterItem.new('https://twitter.com/#!/salmaeldaly/status/45532711472992256') - data = parser.parse_data(empty_doc) + parser = Parser::TwitterItem.new('https://twitter.com/#!/fake_user/status/1111111111111111111') + parser.parse_data(empty_doc) - assert_match 'https://twitter.com/salmaeldaly/status/45532711472992256', parser.url + assert_match 'https://twitter.com/fake_user/status/1111111111111111111', parser.url - parser = Parser::TwitterItem.new('https://twitter.com/%23!/salmaeldaly/status/45532711472992256') - data = parser.parse_data(empty_doc) + parser = Parser::TwitterItem.new('https://twitter.com/%23!/fake_user/status/1111111111111111111') + parser.parse_data(empty_doc) - assert_match 'https://twitter.com/salmaeldaly/status/45532711472992256', parser.url - end - - # I'm not confident this is testing anything about truncation as written - 
test "should get all information of a truncated tweet" do - tweet = Twitter::Tweet.new( - id: "123", - full_text: "Anti immigrant graffiti in a portajon on a residential construction site in Mtn Brook, AL. Job has about 50% Latino workers. https://t.co/bS5vI4Jq7I", - truncated: true, - entities: { - media: [ - { media_url_https: "https://pbs.twimg.com/media/C7dYir1VMAAi46b.jpg" } - ] - } - ) - Twitter::REST::Client.any_instance.stubs(:status).returns(tweet) - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - - data = Parser::TwitterItem.new('https://twitter.com/fake-account/status/123456789').parse_data(nil) - - assert_equal 'https://pbs.twimg.com/media/C7dYir1VMAAi46b.jpg', data['picture'] + assert_match 'https://twitter.com/fake_user/status/1111111111111111111', parser.url end test "#oembed_url returns URL with the instance URL" do oembed_url = Parser::TwitterItem.new('https://twitter.com/fake-account/status/1234').oembed_url assert_equal 'https://publish.twitter.com/oembed?url=https://twitter.com/fake-account/status/1234', oembed_url end + + test "should parse valid link with spaces" do + stub_tweet_lookup.returns(twitter_item_response_success) + + data = Parser::TwitterItem.new(' https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) + + assert_match 'Youths! 
Webb observed galaxy cluster El Gordo', data['title'] + end + + test "should fill in html when html parsing fails but API works" do + stub_tweet_lookup.returns(twitter_item_response_success) + + data = Parser::TwitterItem.new('https://twitter.com/fake_user/status/1111111111111111111').parse_data(empty_doc) + + assert_match " 'test' }) + + WebMock.stub_request(:get, "https://api.twitter.com/2/users/by") + .with(query: query) + .with(headers: { "Authorization": "Bearer test" }) + .to_return(status: 200, body: response_fixture_from_file('twitter-profile-response-success.json')) + + data = Parser::TwitterProfile.new('https://m.twitter.com/fake_user').parse_data(empty_doc) + + assert_equal '2009-04-07T15:40:56.000Z', data['published_at'] + assert_equal '@fake_user', data['username'] + end + + test "it makes a get request to the user lookup by username endpoint and notifies sentry when 404 status is returned" do + stub_configs({'twitter_bearer_token' => 'test' }) + + WebMock.stub_request(:get, "https://api.twitter.com/2/users/by") + .with(query: query) + .with(headers: { "Authorization": "Bearer test" }) + .to_return(status: 404, body: response_fixture_from_file('twitter-profile-response-error.json')) - data = {} sentry_call_count = 0 arguments_checker = Proc.new do |e| sentry_call_count += 1 - assert_equal Twitter::Error, e.class end - + PenderSentry.stub(:notify, arguments_checker) do - data = Parser::TwitterProfile.new('https://www.twitter.com/fakeaccount').parse_data(nil) + data = Parser::TwitterProfile.new('https://twitter.com/fake_user').parse_data(empty_doc) assert_equal 1, sentry_call_count + assert_not_nil data['error'] + assert_match /404/, data['error'][0]['title'] + assert_match /Not Found Error/, data['error'][0]['detail'] end - assert_match /Twitter::Error/, data['error']['message'] - assert_equal "fakeaccount", data['title'] - assert_equal "https://twitter.com/fakeaccount", data['url'] - assert_equal "fakeaccount", data['external_id'] - assert_equal 
"@fakeaccount", data['username'] - assert_equal "fakeaccount", data['author_name'] end - test "assigns values to hash from the API response" do - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) + test "it makes a get request to the user lookup by username endpoint, notifies sentry when timeout occurs" do + stub_configs({'twitter_bearer_token' => 'test' }) - data = Parser::TwitterProfile.new('https://www.twitter.com/fakeaccount').parse_data(empty_doc) + WebMock.stub_request(:get, "https://api.twitter.com/2/users/by") + .with(query: query) + .with(headers: { "Authorization": "Bearer test" }) + .to_raise(Errno::EHOSTUNREACH) - assert_equal 'fakeaccount', data['external_id'] - assert_equal '@fakeaccount', data['username'] - assert_match /TED is a nonprofit devoted to spreading ideas/, data['description'] - assert_match 'TED Talks', data['title'] - assert_match 'TED Talks', data['author_name'] + sentry_call_count = 0 + arguments_checker = Proc.new do |e| + sentry_call_count += 1 + end + + PenderSentry.stub(:notify, arguments_checker) do + data = Parser::TwitterProfile.new('https://twitter.com/fake_user').parse_data(empty_doc) + assert_equal 1, sentry_call_count + assert_not_nil data['error'] + assert_match /No route to host/, data['error'][0]['title'] + assert_nil data['error'][0]['detail'] + end + end - assert_match 'https://twitter.com/fakeaccount', data['url'] - assert_match /pbs.twimg.com\/profile_images\/877631054525472768\/Xp5FAPD5.jpg/, data['picture'] - assert_match /pbs.twimg.com\/profile_images\/877631054525472768\/Xp5FAPD5.jpg/, data['author_picture'] - assert_not_nil data['published_at'] + test "returns data even if an error is returned" do + stub_profile_lookup.returns(twitter_profile_response_error) - assert_nil data['error'] + data = Parser::TwitterProfile.new('https://twitter.com/fake_user').parse_data(empty_doc) + + assert_not_nil data['error'] + assert_equal 'fake_user', data['external_id'] + assert_equal 
'https://twitter.com/fake_user', data['url'] end + test "assigns values to hash from the API response" do + stub_profile_lookup.returns(twitter_profile_response_success) + + data = Parser::TwitterProfile.new('https://www.twitter.com/fake_user').parse_data(empty_doc) + + assert_equal 'fake_user', data['external_id'] + assert_equal '@fake_user', data['username'] + assert_match /The world's most powerful space telescope/, data['description'] + assert_match 'fake_user', data['title'] + assert_match 'Fake User', data['author_name'] + assert_match 'https://twitter.com/fake_user', data['url'] + assert_match /pbs.twimg.com\/profile_images\/685182791496134658\/Wmyak8D6.jpg/, data['picture'] + assert_match /pbs.twimg.com\/profile_images\/685182791496134658\/Wmyak8D6.jpg/, data['author_picture'] + assert_not_nil data['published_at'] + assert_nil data['error'] + end + test "should store raw data of profile returned by Twitter API" do - Twitter::REST::Client.any_instance.stubs(:user).returns(fake_twitter_user) - - data = Parser::TwitterProfile.new('https://www.twitter.com/fakeaccount').parse_data(empty_doc) - + stub_profile_lookup.returns(twitter_profile_response_success) + + data = Parser::TwitterProfile.new('https://www.twitter.com/fake_user').parse_data(empty_doc) + assert_not_nil data['raw']['api'] assert !data['raw']['api'].empty? + end + + test "should remove line breaks from Twitter profile description" do + stub_profile_lookup.returns(twitter_profile_response_success) + + data = Parser::TwitterProfile.new('https://twitter.com/fake_user').parse_data(empty_doc) + + assert_match "Launched: Dec. 25, 2021. First images revealed: July 12, 2022. 
Verification: https://t.co/ChOEslj1j5", data['description'] end - test "should throw Pender::Exception::ApiLimitReached when Twitter::Error::TooManyRequests is thrown" do - Twitter::REST::Client.any_instance.stubs(:user).raises(Twitter::Error::TooManyRequests) + test "should normalize profile urls from twitter subdomains" do + stub_profile_lookup.returns(twitter_profile_response_success) - assert_raises Pender::Exception::ApiLimitReached do - Parser::TwitterProfile.new('https://twitter.com/fake-account').parse_data(empty_doc) - end + parser = Parser::TwitterProfile.new('https://0.twitter.com/fake_user') + parser.parse_data(empty_doc) + + assert_match 'https://twitter.com/fake_user', parser.url + + parser = Parser::TwitterProfile.new('https://m.twitter.com/fake_user') + parser.parse_data(empty_doc) + + assert_match 'https://twitter.com/fake_user', parser.url + + parser = Parser::TwitterProfile.new('https://mobile.twitter.com/fake_user') + parser.parse_data(empty_doc) + + assert_match 'https://twitter.com/fake_user', parser.url + end + + test "should parse valid link with spaces" do + stub_profile_lookup.returns(twitter_profile_response_success) + + data = Parser::TwitterProfile.new(' https://twitter.com/fake_user').parse_data(empty_doc) + + assert_match "Launched: Dec. 25, 2021. First images revealed: July 12, 2022. 
Verification: https://t.co/ChOEslj1j5", data['description'] end test "#oembed_url returns URL with the instance URL" do diff --git a/test/workers/archiver_worker_test.rb b/test/workers/archiver_worker_test.rb index 1943fc05..bc1afc85 100644 --- a/test/workers/archiver_worker_test.rb +++ b/test/workers/archiver_worker_test.rb @@ -3,9 +3,8 @@ class ArchiverWorkerTest < ActiveSupport::TestCase test "should update cache when video archiving fails the max retries" do - skip("twitter api key is not currently working") Metrics.stubs(:schedule_fetching_metrics_from_facebook) - url = 'https://twitter.com/meedan/status/1202732707597307905' + url = 'https://meedan.com/post/annual-report-2022' m = create_media url: url data = m.as_json assert_nil data.dig('archives', 'video_archiver') @@ -18,7 +17,6 @@ class ArchiverWorkerTest < ActiveSupport::TestCase end test "should update cache when Archive.org fails the max retries" do - skip("twitter api key is not currently working") Media.any_instance.unstub(:archive_to_archive_org) WebMock.enable! allowed_sites = lambda{ |uri| uri.host != 'web.archive.org' } @@ -29,7 +27,7 @@ class ArchiverWorkerTest < ActiveSupport::TestCase WebMock.stub_request(:post, /example.com\/webhook/).to_return(status: 200, body: '') a = create_api_key application_settings: { 'webhook_url': 'https://example.com/webhook.php', 'webhook_token': 'test' } - url = 'https://twitter.com/marcouza/status/875424957613920256' + url = 'https://meedan.com/post/annual-report-2022' m = create_media url: url, key: a assert_raises Pender::Exception::RetryLater do data = m.as_json(archivers: 'archive_org') @@ -45,7 +43,6 @@ class ArchiverWorkerTest < ActiveSupport::TestCase end test "should update cache when Archive.org raises since first attempt" do - skip("twitter api key is not currently working") Media.any_instance.unstub(:archive_to_archive_org) WebMock.enable! 
allowed_sites = lambda{ |uri| uri.host != 'web.archive.org' } @@ -54,7 +51,7 @@ class ArchiverWorkerTest < ActiveSupport::TestCase WebMock.stub_request(:post, /example.com\/webhook/).to_return(status: 200, body: '') a = create_api_key application_settings: { 'webhook_url': 'https://example.com/webhook.php', 'webhook_token': 'test' } - url = 'https://twitter.com/marcouza/status/875424957613920256' + url = 'https://meedan.com/post/annual-report-2022' m = create_media url: url, key: a assert_raises Pender::Exception::RetryLater do