Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NoMethodError in tiktok parser (sentry / audit hour) #367

Merged
merged 11 commits into from
Jul 24, 2023

Conversation

vasconsaurus
Copy link
Contributor

@vasconsaurus vasconsaurus commented Jul 17, 2023

Description

When looking into this issue on Sentry, we found that we had two issues here. On some events the page didn't exist, on others it wasn't either a profile nor an item, but a tags result page.

For the page that does not exist: adding a safe operator inside the parser seems to be enough. Now instead of getting the NoMethodError we get URL Not Found, which I think is a better error here.

For the tag page: I added some logic to deal with that in tiktok_item, and added an @ to the regex inside profile, to make sure this goes through the item parser.

References: cv2-3124

How has this been tested?

Manually and through tiktok's tests. I also updated the test ("matches known URL patterns, and returns instance on success") to test for the new tag pattern.

@vasconsaurus vasconsaurus force-pushed the fix-3124-tiktok_parser-undefined_method branch from 3a12723 to 675c2e6 Compare July 17, 2023 16:47
Copy link
Contributor

@hartsick hartsick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I left one small comment just for clarity.

I also have one small question - is there any unique or additional parsing we need for the tags page to surface relevant information for tags (vs items), or is the parsing work we already do for Tiktok items enough?

match = url.match(TIKTOK_ITEM_URL)
username = match['username']
external_id = match['id']
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I think an elsif might be a little clearer here, if we expect it to be one or the other

@vasconsaurus
Copy link
Contributor Author

We could return the tag as tag, instead of only in title, or maybe a boolean saying it's a tag page? Username, author and id are empty, since there are none. But I don't think there is a lot of useful information in the tags results page 🤔 . What do you think? It returns something like this:

   "api"=>
    {"version"=>"1.0",
     "type"=>"rich",
     "title"=>"#ball",
     "author_url"=>"",
     "author_name"=>"",
     "width"=>"100%",
     "height"=>"100%",
     "html"=>
      "<blockquote class=\"tiktok-embed\" cite=\"https://www.tiktok.com/tag/ball\" data-tag-id=\"ball\" data-embed-from=\"oembed\" data-embed-type=\"tag\" style=\"max-width: 780px; min-width: 288px;\" > <section> <a target=\"_blank\" href=\"https://www.tiktok.com/tag/ball?refer=hashtag_embed\">#ball</a> </section> </blockquote> <script async src=\"https://www.tiktok.com/embed.js\"></script>",
     "provider_url"=>"https://www.tiktok.com",
     "provider_name"=>"TikTok",
     "embed_product_id"=>"ball",
     "embed_type"=>"hashtag"}},
 "description"=>"#ball",
 "username"=>"",
 "external_id"=>"",
 "title"=>"#ball",
 "picture"=>nil,
 "author_url"=>"",
 "html"=>
  "<blockquote class=\"tiktok-embed\" cite=\"https://www.tiktok.com/tag/ball\" data-tag-id=\"ball\" data-embed-from=\"oembed\" data-embed-type=\"tag\" style=\"max-width: 780px; min-width: 288px;\" > <section> <a target=\"_blank\" href=\"https://www.tiktok.com/tag/ball?refer=hashtag_embed\">#ball</a> </section> </blockquote> <script async src=\"https://www.tiktok.com/embed.js\"></script>",
 "author_name"=>"",
 "tag"=>"ball"}

@vasconsaurus vasconsaurus marked this pull request as ready for review July 19, 2023 11:30
@vasconsaurus vasconsaurus changed the title (wip) NoMethodError in tiktok parser (sentry / audit hour) NoMethodError in tiktok parser (sentry / audit hour) Jul 19, 2023
username: match['username'],
external_id: match['id'],
username: username,
external_id: external_id,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

username , external_id or tag will be undefined in case there is no match for TIKTOK_ITEM_URL or TIKTOK_ITEM_URL so I suggest to define username, external_id and tag with empty strings outside the if..else block (i.e. before line 26) and fill the right values inside if ... elseif ... end block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an 'else' in case there is no match and tested with an url. What do you think?

@caiosba caiosba removed their request for review July 19, 2023 11:44
if there isn't a match to TIKTOK_TAG_URL or TIKTOK_ITEM_URL username,
external_id and tag are still defined as empty strings
@hartsick
Copy link
Contributor

hartsick commented Jul 19, 2023

@vasconsaurus thanks for providing that response! super helpful

We could return the tag as tag, instead of only in title, or maybe a boolean saying it's a tag page? Username, author and id are empty, since there are none. But I don't think there is a lot of useful information in the tags results page 🤔 . What do you think? It returns something like this:

   "api"=>
    {"version"=>"1.0",
     "type"=>"rich",
     "title"=>"#ball",
     "author_url"=>"",
     "author_name"=>"",
     "width"=>"100%",
     "height"=>"100%",
     "html"=>
      "<blockquote class=\"tiktok-embed\" cite=\"https://www.tiktok.com/tag/ball\" data-tag-id=\"ball\" data-embed-from=\"oembed\" data-embed-type=\"tag\" style=\"max-width: 780px; min-width: 288px;\" > <section> <a target=\"_blank\" href=\"https://www.tiktok.com/tag/ball?refer=hashtag_embed\">#ball</a> </section> </blockquote> <script async src=\"https://www.tiktok.com/embed.js\"></script>",
     "provider_url"=>"https://www.tiktok.com",
     "provider_name"=>"TikTok",
     "embed_product_id"=>"ball",
     "embed_type"=>"hashtag"}},
 "description"=>"#ball",
 "username"=>"",
 "external_id"=>"",
 "title"=>"#ball",
 "picture"=>nil,
 "author_url"=>"",
 "html"=>
  "<blockquote class=\"tiktok-embed\" cite=\"https://www.tiktok.com/tag/ball\" data-tag-id=\"ball\" data-embed-from=\"oembed\" data-embed-type=\"tag\" style=\"max-width: 780px; min-width: 288px;\" > <section> <a target=\"_blank\" href=\"https://www.tiktok.com/tag/ball?refer=hashtag_embed\">#ball</a> </section> </blockquote> <script async src=\"https://www.tiktok.com/embed.js\"></script>",
 "author_name"=>"",
 "tag"=>"ball"}

I think that we shouldn't return tag as a new top-level key from the API, since it would be specific to TikTok items and the frontend won't do anything with it by default. our top-level keys should be consistent for all parsers in the API response, so the front end doesn't have to do anything special to display items from different parsers.

I think we can work in the keys we have, though! right now the title is a bit confusing since it's just #ball, so my suggestion is if we see that we've gotten a tag page let's modify the title to be Tag <original-title> (in this case it would be Tag #ball) or something similar. what do you think?

@hartsick hartsick self-requested a review July 19, 2023 19:40
test/models/parser/tiktok_item_test.rb Dismissed Show dismissed Hide dismissed
@vasconsaurus
Copy link
Contributor Author

@hartsick @melsawy please take a look this one when you have the time ☺️

Copy link
Contributor

@hartsick hartsick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add one test to reflect the new title setting behavior. otherwise looks good!

elsif url.match(TIKTOK_TAG_URL)
username = ''
external_id = ''
title = "Tag: " + url.match(TIKTOK_TAG_URL)['tag']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a test here? seems like good behavior to capture. think we already catch the other cases through our existing tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartsick I tried adding two tests one to test if the 'Tag: ' is added, and another to test if external_id and username are empty strings for unmatched URL pattern.

The one for the tag seems to work, but the other one doesn't.

At first I tried writing something like this

test "sets 'external_id' and 'username' as empty string for unmatched URL pattern" do
    WebMock.stub_request(:get, 'https://www.tiktok.com/search?q=abcdef').to_return(status: 200, body: response_fixture_from_file('tiktok-item-page.html'))
    
    data = Parser::TiktokItem.new('https://www.tiktok.com/search?q=abcdef').parse_data(doc)

    assert_equal '', data['title']
    assert_equal '', data['username']
end

I got an error that told me to use the oembed. I changed it, and it passed.
But it failed on Travis, so I ran it again locally, and it's currently failing as well.
I don't know why it passed the first couple tries.

What is the correct way to write this? Should we be testing for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just figured out the issue, for some reason I replaced external_id with title 🤦‍♀️ It should work now.

@vasconsaurus vasconsaurus force-pushed the fix-3124-tiktok_parser-undefined_method branch from ee48307 to a32bb63 Compare July 24, 2023 16:57
Copy link
Contributor

@hartsick hartsick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

faaaaantastic 🎉

@codeclimate
Copy link

codeclimate bot commented Jul 24, 2023

Code Climate has analyzed commit a32bb63 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 98.0% (0.0% change).

View more on Code Climate.

@vasconsaurus vasconsaurus dismissed melsawy’s stale review July 24, 2023 18:47

The changes have been made

@vasconsaurus vasconsaurus merged commit 0d6eda6 into develop Jul 24, 2023
4 checks passed
@vasconsaurus vasconsaurus deleted the fix-3124-tiktok_parser-undefined_method branch July 24, 2023 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants