-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Setup for Wrangler SQL Execution Support #690
base: develop
Are you sure you want to change the base?
Conversation
@@ -126,4 +129,11 @@ default List<EntityCountMetric> getCountMetrics() { | |||
// no op | |||
return null; | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add a description for this method similar to getCountMetrics()
wrangler-api/src/main/java/io/cdap/wrangler/api/RelationalDirective.java
Outdated
Show resolved
Hide resolved
5733e64
to
f1d9001
Compare
9889abb
to
9452e30
Compare
9452e30
to
c177602
Compare
@Override | ||
default Relation transform(RelationalTranformContext relationalTranformContext, | ||
Relation relation) { | ||
// no-op |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking just at this class, it seems like every existing directive would become a no-op in SQL, which would be incorrect. It only makes sense when seeing the isSQLSupported() method in the RelationalTransform interface, and that it defaults to false.
I think it would be cleaner if the Directive interface did not extend the RelationalTransform interface, with only directives that do support SQL (like Drop) implementing it. Then the isSQLSupported() method is not needed, Wrangler just checks if the directive is an instance of RelationTransform. This way it is impossible to implement a directive where the transform() and isSQLSupported() methods don't match up.
@Description("Toggle to configure execution language between JEXL and SQL") | ||
@Macro | ||
@Nullable | ||
private String sqlExecution; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a newline after this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to name this something like 'executionMode' and have an internal Enum for it. This is more extensible than a boolean toggle for SQL or no SQL, as it allows us to add new modes in the future if needed. (As a side note, when a property is meant to be a toggle, it should be of Boolean type, not String type).
For example, we may eventually want to add an 'auto' mode that lets the engine choose, or maybe some BigQuery specific SQL mode or something like that.
@@ -646,6 +717,11 @@ public static class Config extends PluginConfig { | |||
static final String NAME_SCHEMA = "schema"; | |||
static final String NAME_ON_ERROR = "on-error"; | |||
|
|||
@Name(NAME_SQL_EXECUTION) | |||
@Description("Toggle to configure execution language between JEXL and SQL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JEXL is a technical term that many users won't understand. It's also not technically correct, as many directives are direct java implementations that don't use jexl scripting. Not sure what would be a better term though. Native? Default?
This PR is responsible for the following:
This includes implementation of sql support of
drop
directive as an example. Once this is merged, the SQL support for other directives can be extended in a similar way