update twitter item tests (and postgres update)
- We don't need that many integration tests, so we are moving some of them
back to test/models/parser/twitter_item_test.rb and updating them to stub
responses instead of making live requests.

- I added tests to check the basic request functionality that we now have,
and removed "assigns values to hash from the API response", since that is already
covered by "it makes a get request to the tweet lookup endpoint successfully".

- "should decode html entities" was removed because decoding happens
inside Media and is not done by the individual parser, which means
the test actually fails (as it should).

- fake_tweet and fake_twitter_user were removed, since they used
methods from the old Twitter gem. Now we stub a response from
our new method: tweet_lookup.
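An illustrative sketch of the new approach: a canned v2-style lookup payload stands in for the removed fake_tweet/fake_twitter_user helpers. The hash shape below mirrors the success fixture; in the actual tests this hash would be returned by stubbing tweet_lookup (e.g. with Mocha's `.stubs`).

```ruby
# Hypothetical canned payload, shaped like the v2 tweet lookup response.
fake_lookup = {
  'data' => [
    { 'id' => '1111111111111111111', 'text' => 'A fake tweet' }
  ],
  'includes' => {
    'users' => [{ 'username' => 'fake_user', 'name' => 'Fake User' }]
  }
}

# The parser reads title and author from these paths:
title  = fake_lookup['data'][0]['text']
author = fake_lookup['includes']['users'][0]['name']
```

In the real test this would be wired up with something like `Parser::TwitterItem.any_instance.stubs(:tweet_lookup).returns(fake_lookup)`, so no live request is made.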

- Added .squish to parsed_data['raw']['api']['data'][0]['text'] to clean up
line breaks from the title and description. Our test was failing because they
were not being removed. Also, since the title and description are the same, I
set the description to the title instead of parsing twice.
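What .squish does to the tweet text: it strips leading/trailing whitespace and collapses internal runs (including line breaks) into single spaces. A plain-Ruby equivalent is shown here for illustration; in the app it is ActiveSupport's String#squish.

```ruby
# Plain-Ruby equivalent of ActiveSupport's String#squish, for illustration.
raw_text = "Fake tweet\n\nwith   line breaks"
title = raw_text.strip.gsub(/\s+/, ' ')
description = title # same text, so no second parse
```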

- Separated the stub from the response, so we can also have a failed response.
Changed the response fixture to a success one and added an error one.
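The idea can be sketched as follows: the stub setup stays the same and only the returned body changes, so one test file can exercise both outcomes. The bodies below are trimmed-down stand-ins for the success and error fixture files.

```ruby
require 'json'

# Trimmed stand-ins for the two fixture files.
SUCCESS_BODY = '{"data":[{"id":"1111111111111111111","text":"A fake tweet"}]}'
ERROR_BODY   = '{"errors":[{"title":"Not Found Error","parameter":"ids"}]}'

# A v2 lookup response signals failure via a top-level "errors" key.
def lookup_outcome(body)
  JSON.parse(body).key?('errors') ? :error : :success
end
```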

- Changed the id and user to make it clear that they are fake and being stubbed.

- Removed the test for truncated text: that behavior is no longer present
in the v2 API. Only retweets might be truncated (we don't fetch those),
and the way to handle them is different; the endpoint does not take truncated
as a query param.

- @url.gsub!(/\s/, '') -> removes whitespace from the url.
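The effect of the new whitespace strip: URLs pasted with stray spaces or newlines still match the tweet URL pattern afterwards (the URL below is a fake one for illustration).

```ruby
# A URL pasted with surrounding whitespace, as in the old
# "should parse valid link with spaces" integration test.
url = " https://twitter.com/fake_user/status/1111111111111111111 \n"
url = url.gsub(/\s/, '')
```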

- raise ApiError.new("#{e.class}: #{e.message}") -> I can get the response
code and body from the response object, but I get an error when I try the
same on the exception.
- upgrade postgres image to 13 (#373)

I had upgraded to postgres:12-bullseye because of an issue we had when
building on Travis, which seems related to a change the maintainers of the
postgres Docker images made to the underlying OS image layer
(previous: Debian 11 (bullseye); new: Debian 12 (bookworm)). The workaround
seems to be using the -bullseye tag. More on this here:
https://stackoverflow.com/questions/76555305/postgres-container-failed-to-start-with-initdb-error-popen-failure-cannot-allo/76591040#76591040

Now that we are updating to 13, I checked which image Devin used in Alegre
and am using the same one here.
vasconsaurus committed Aug 8, 2023
1 parent 54a5b73 commit 3d62936
Showing 7 changed files with 123 additions and 259 deletions.
2 changes: 1 addition & 1 deletion app/models/concerns/provider_twitter.rb
@@ -62,8 +62,8 @@ def get(path, params)
       raise ApiResponseCodeError.new("#{response.class}: #{response.code} #{response.message} - #{response.body}") unless response.code.to_i < 400
       JSON.parse(response.body)
     rescue StandardError => e
-      raise ApiError.new("#{e.class}: #{e.code} #{e.message} - #{e.body}")
+      PenderSentry.notify(e, url: url)
+      raise ApiError.new("#{e.class}: #{e.message}")
     end
   end

14 changes: 8 additions & 6 deletions app/models/parser/twitter_item.rb
@@ -19,8 +19,10 @@ def patterns
   # Main function for class
   def parse_data_for_parser(_doc, _original_url, _jsonld_array)
     @url.gsub!(/(%23|#)!\//, '')
+    @url.gsub!(/\s/, '')
     @url = replace_subdomain_pattern(url)
     parts = url.match(TWITTER_ITEM_URL)
 
     user, id = parts['user'], parts['id']
 
     @parsed_data['raw']['api'] = {}
@@ -36,16 +38,16 @@ def parse_data_for_parser(_doc, _original_url, _jsonld_array)
       published_at = ''
       html = ''
       author_name = user
-      author_url = get_author_url(url, user) || RequestHelper.top_url(url)
+      author_url = get_author_url(user)
     elsif @parsed_data[:error].nil?
-      title = parsed_data['raw']['api']['data'][0]['text']
-      description = parsed_data['raw']['api']['data'][0]['text']
+      title = parsed_data['raw']['api']['data'][0]['text'].squish
+      description = title
       picture = get_twitter_item_picture(parsed_data)
       author_picture = parsed_data['raw']['api']['includes']['users'][0]['profile_image_url'].gsub('_normal', '')
       published_at = parsed_data['raw']['api']['data'][0]['created_at']
       html = html_for_twitter_item(url)
       author_name = parsed_data['raw']['api']['includes']['users'][0]['name']
-      author_url = get_author_url(url, user) || parsed_data['raw']['api']['includes']['users'][0]['url'] || RequestHelper.top_url(url)
+      author_url = get_author_url(user) || parsed_data['raw']['api']['includes']['users'][0]['url'] || RequestHelper.top_url(url)
     end
 
     @parsed_data.merge!({
@@ -63,8 +65,8 @@ def parse_data_for_parser(_doc, _original_url, _jsonld_array)
     parsed_data
   end
 
-  def get_author_url(url, user)
-    URI(url).host + '/' + user
+  def get_author_url(user)
+    'https://twitter.com/' + user
   end
 
   def get_twitter_item_picture(parsed_data)

def get_twitter_item_picture(parsed_data)
4 changes: 2 additions & 2 deletions docker-compose.yml
@@ -2,7 +2,7 @@ version: "2.2"
 volumes:
   redis:
   minio:
-  postgres12:
+  postgres13:
 services:
   redis:
     image: redis:5
@@ -21,7 +21,7 @@ services:
       MINIO_ACCESS_KEY: AKIAIOSFODNN7EXAMPLE
       MINIO_SECRET_KEY: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
   postgres:
-    image: postgres:12-bullseye
+    image: postgres:13-buster
     ports:
       - "5432:5432"
     environment:
13 changes: 13 additions & 0 deletions test/data/twitter-item-response-error.json
@@ -0,0 +1,13 @@
+{
+  "errors": [
+    {
+      "value": "1111111111111111111",
+      "detail": "Could not find tweet with ids: [1111111111111111111].",
+      "title": "Not Found Error",
+      "resource_type": "tweet",
+      "parameter": "ids",
+      "resource_id": "1111111111111111111",
+      "type": "https://api.twitter.com/2/problems/resource-not-found"
+    }
+  ]
+}
@@ -1,9 +1,9 @@
 {
   "data": [
     {
-      "id": "1686748612506632192",
+      "id": "1111111111111111111",
       "edit_history_tweet_ids": [
-        "1686748612506632192"
+        "1111111111111111111"
       ],
       "attachments": {
         "media_keys": [
@@ -25,7 +25,7 @@
       ],
       "users": [
         {
-          "username": "NASAWebb",
+          "username": "fake_user",
           "url": "https://t.co/ZpTf8zeokA",
           "name": "NASA Webb Telescope",
           "profile_image_url": "https://pbs.twimg.com/profile_images/685182791496134658/Wmyak8D6_normal.jpg",
40 changes: 0 additions & 40 deletions test/integration/parsers/twitter_item_test.rb
@@ -11,45 +11,5 @@ class TwitterItemIntegrationTest < ActiveSupport::TestCase
     assert_nil data['picture']
     assert_not_nil data['author_picture']
   end
-
-  test "should parse valid link with spaces" do
-    # skip("twitter api key is not currently working")
-    m = create_media url: ' https://twitter.com/caiosba/status/742779467521773568 '
-    data = m.as_json
-    assert_match 'I\'ll be talking in @rubyconfbr this year! More details soon...', data['title']
-    assert_match 'Caio Almeida', data['author_name']
-    assert_match '@caiosba', data['username']
-    assert_nil data['picture']
-    assert_not_nil data['author_picture']
-  end
-
-  test "should fill in html when html parsing fails but API works" do
-    # skip("twitter api key is not currently working")
-    url = 'https://twitter.com/codinghorror/status/1276934067015974912'
-    OpenURI.stubs(:open_uri).raises(OpenURI::HTTPError.new('','429 Too Many Requests'))
-    m = create_media url: url
-    data = m.as_json
-    assert_match /twitter-tweet.*#{url}/, data[:html]
-  end
-
-  test "should not parse a twitter post when passing the twitter api bearer token is missing" do
-    # skip("this might be broke befcause of twitter api changes - needs fixing")
-    key = create_api_key application_settings: { config: { twitter_bearer_token: '' } }
-    m = create_media url: 'https://twitter.com/cal_fire/status/919029734847025152', key: key
-    assert_equal '', PenderConfig.get(:twitter_bearer_token)
-    data = m.as_json
-    assert_equal m.url, data['title']
-    assert_match "401 Unauthorized", data['error']['message']
-  end
-
-  test "should store oembed data of a twitter profile" do
-    # skip("twitter api key is not currently working")
-    m = create_media url: 'https://twitter.com/meedan'
-    data = m.as_json
-
-    assert data['raw']['oembed'].is_a? Hash
-    assert_equal "https:\/\/twitter.com", data['raw']['oembed']['provider_url']
-    assert_equal "Twitter", data['raw']['oembed']['provider_name']
-  end
 end
