-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-5825] Add URL_ENCODE and URL_DECODE function (enabled in Spark library) #3318
Conversation
Kudos, SonarCloud Quality Gate passed! |
@NobiGo would you like to help review if you have time? Thanks very much. |
a1eda17
to
39dee84
Compare
import static org.apache.calcite.util.Static.RESOURCE; | ||
|
||
/** | ||
* A collection of functions used in Url processing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is UrlFunctions only available in Spark?
If so, I think the comments could be more detailed in the description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an util class for all the general Url processing funcitons, not only for Spark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it
private static final Charset UTF_8 = StandardCharsets.UTF_8; | ||
|
||
/** The "URL_DECODE(string)" function. */ | ||
public static String urlDecode(String value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the value is null or empty string "",What is the correct behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. The correct behavior is to return empty string. I will add two unit tests to cover it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For null check, the ARG0
of NullPolicy
will do null check in code generation phase, so I think there is no need to implement null check in each function implementation additionally.
String url; | ||
try { | ||
url = URLDecoder.decode(value, UTF_8.name()); | ||
} catch (UnsupportedEncodingException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides UnsupportedEncodingException, are there any other exceptions to handle?
Like some RuntimeException,such as NullPointerException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for remind. RuntimeException
does not need to explicitly throw, so easy to forget to handle it. I check the RuntimeException
in URLDecoder.decode
, there will be IllegalArgumentException
when encounter the unencoded %
.
The next I will do are two things:
- Keep current exception handling way. Since
IllegalArgumentException
exception messages are very diverse, I think that not handling with it explicitly here is reasonable, and just throwsIllegalArgumentException
when it encountered. - Add exception test cases to cover
IllegalArgumentException
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test cases to cover IllegalArgumentException
is added.
} | ||
|
||
/** The "URL_ENCODE(string)" function. */ | ||
public static String urlEncode(String url) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be RuntimeException
, including NullPointerException
/IllegalCharsetNameException
/UnsupportedCharsetException
, since we use UTF-8 charset, and this charset always exists. So these RuntimeException
will never appear. I think we can ignore them.
@herunkang2018 ,thank you for opening this PR, I left a couple of comments,overall looks good. |
@LakeShen Thanks a lot for the review. All comments resolved, please help to review again if you have time. |
@herunkang2018 LGTM. And need the second people to review it:) |
site/_docs/reference.md
Outdated
@@ -2854,6 +2854,8 @@ BigQuery's type system uses confusingly different names for types and functions: | |||
| b | UNIX_MILLIS(timestamp) | Returns the number of milliseconds since 1970-01-01 00:00:00 | |||
| b | UNIX_SECONDS(timestamp) | Returns the number of seconds since 1970-01-01 00:00:00 | |||
| b | UNIX_DATE(date) | Returns the number of days since 1970-01-01 | |||
| s | URL_DECODE(string) | Decodes a *string* in 'application/x-www-form-urlencoded' format using a specific encoding scheme |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should specify what happens if the function fails to encode/decode. Is the result NULL or a runtime error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for remind, I will check it soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will return original string when decoded error. I have fixed this and added the behaviour to doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good, would you please squash your commits?
7fbea22
to
d044329
Compare
Kudos, SonarCloud Quality Gate passed! |
@JiajunBernoulli Sure, the commit is squashed. |
No description provided.