-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parser for twitter search URLs #420
Add parser for twitter search URLs #420
Conversation
8d72005
to
77334de
Compare
Add parser for twitter search URLs. Currently, these URLs are failing and throwing errors. Related to: CV2-3708
77334de
to
bd72351
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work on your first feature PR, @jayjay-w ! :)
I tested locally and works great! I made some suggestions and a couple of requests :)
class TwitterSearchItem < Base | ||
include ProviderTwitter | ||
|
||
TWITTER_SEARCH_ITEM_URL = /https:\/\/twitter\.com\/search[^\s"]*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Request: I suggest we make this regular expression broader:
- We should always expect at least the
q
parameter - We should support HTTP
- We need to be sure it starts with http
- We should support other subdomains (www, mobile, etc.)
That said, I suggest to use the host part we have here: https://github.com/meedan/pender/blob/develop/app/models/parser/twitter_item.rb#L5
Suggestion: We could extract the host part of that regular expression into a TWITTER_HOST_URL
constant defined in ProviderTwitter
and reused by the other Twitter modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, these were great suggestions. I have implemented everything except the last suggestion, which I will move later.
title: search_term, | ||
search_term: search_term, | ||
src: src, | ||
description: "Twitter search for #{search_term}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Request: Testing locally, I got two items with different URLs to match because of this description, so I suggest we keep it simple and just return the URL (unescaped, for better readability) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done as well. The description is now just the decoded URL.
# Main function for class | ||
def parse_data_for_parser(_doc, _original_url, _jsonld_array) | ||
handle_exceptions(StandardError) do | ||
params = URI.parse(url).query.split('&').map { |pair| pair.split('=') }.to_h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add reviewer feedback: - Escape URLs sent to parser - Handle different domains (m.twitter.com and mobile.twitter.com) - Move src to api
Code Climate has analyzed commit 4b1e2a9 and detected 0 issues on this pull request. The test coverage on the diff in this pull request is 100.0% (50% is the threshold). This pull request will bring the total coverage in the repository to 98.6% (0.0% change). View more on Code Climate. |
@caiosba this is ready for another review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, @jayjay-w ! Thanks for addressing the comments from the previous review. I also tested it locally and it works well.
Thank you! |
Suspect IssuesThis pull request was deployed and Sentry observed the following issues:
Did you find this useful? React with a 👍 or 👎 |
Description
Add parser for twitter search URLs. Currently, these URLs are failing and throwing errors.
Related to: CV2-3708
How has this been tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can verify the changes. Please describe whether or not you implemented automated tests.
Things to pay attention to during code review
Please describe parts of the change that require extra attention during code review, for example:
Checklist