-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v1] Add simplified AST classes and visitors #1579
Conversation
Converting to a draft. From offline discussion, will look into
|
cae268d
to
1a4e2de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I've already brought this up in person, I'm adding some context and links here.
Thinking about the backwards/forwards compatibility, I think it would be a better approach to use integers rather than enums. Some Java libraries have done this, and I've also seen a similar approach from protobuf. Here's an interesting issue that discusses the downsides of using closed enums.
We could potentially have a model that looks something like this:
public interface IdentifierUnqualified {
...
CaseSensitivity getCase();
}
public interface Enum {
int getValue();
}
public interface CaseSensitivity implements Enum {
static const int UNKNOWN = 0;
static const int SENSITIVE = 1;
static const int INSENSITIVE = 2;
}
OR, we could just use the integers directly without the Enum. Though, the enum does provide the possibility for better readability (toString
overriding):
public interface IdentifierUnqualified {
...
/**
* Valid returns are specified as constants in: `CaseSensitivity.CASE_*`
*/
int getCase();
}
public class CaseSensitivity {
static const int CASE_UNKNOWN = 0;
static const int CASE_SENSITIVE = 1;
static const int CASE_INSENSITIVE = 2;
}
This modeling enforces that Java users use the default
branch for switch
statements, protecting us from backward compatibility issues. Also, the UNKNOWN
case helps us in forward compatibility scenarios for deserialization. This is an approach that protobuf takes.
Our internal implementations, following the first model from above, could look like:
// Internal impl file
val id = idUnqualified("some_table", CaseSensitivity.SENSITIVE)
when (id.getCase().getValue()) {
CaseSensitivity.SENSITIVE -> TODO()
CaseSensitivity.INSENSITIVE -> TODO()
else -> error("Unsupported. Or we can also convert to the UNKNOWN variant if need be.")
}
fun idUnqualified(name: String, case: Int): IdentifierUnqualified {
return IdentifierUnqualifiedImpl(name, case)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the dive-deep into enum compatibility. Agree w/ the points that Java's closed enums are an issue and shouldn't be relied upon in our public APIs.
I'm open to using integers in some form. Probably will discuss w/ you offline further on the general API we should use:
- interfaces?
- classes w/ constants?
- do we still need an unknown/other variant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From offline discussion, chose to define an internal Enum
interface within the partiql-ast
package. Enums we which to add will be classes that implement this interface with a default constant of UNKNOWN
set to 0.
There will be a private constructor with public functions to create.
E.g. for Order
:
public class Order implements Enum {
public static final int UNKNOWN = 0;
public static final int ASC = 1;
public static final int DESC = 2;
public static Order UNKNOWN() {
return new Order(UNKNOWN);
}
public static Order ASC() {
return new Order(ASC);
}
public static Order DESC() {
return new Order(DESC);
}
private final int code;
private Order(int code) {
this.code = code;
}
@Override
public int code() {
return code;
}
}
And example client code calling and using this class:
Order order = Order.ASC();
switch (order.code()) {
case Order.ASC:
System.out.println("Ascending order");
break;
case Order.DESC:
System.out.println("Descending order");
break;
default:
System.out.println("Unknown order");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to follow SQL more closely.
/**
* <element reference> ::= <array value expression> <left bracket or trigraph> <numeric value expression> <right bracket or trigraph>
* Example: some_array_ref[2 + 2], some_ref['hello'], some_ref[CAST('' || 'a' AS STRING)]
*/
class ElementReference extends Expr {
@NotNull
Expr root;
@NotNull
Expr element;
}
/**
* <field reference> ::= <value expression primary> <period> <field name>
*/
class FieldReference {
@NotNull
Expr root;
@NotNull
Identifier field;
}
/**
* This is used as part of the projection list in SQL, but it directly uses <qualified asterisk>.
* <all fields reference> ::= <value expression primary> <period> <asterisk>
*/
class AllFieldsReference {
@NotNull
Expr root;
}
/**
* Below is not specified by SQL:1999.
* <all elements reference> ::= <value expression primary> <left bracket or trigraph> <asterisk> <right bracket or trigraph>
*/
class AllElementsReference {
@NotNull
Expr root;
}
Anyway, you could scrap the use of ExprPath
and ExprPathStep
, and builders would help with code-readability.
// SELECT * FROM t WHERE t['a'].b[0] > 2
val query = QueryBuilder()
.selectStar()
.from("t")
.where(ExprBuilder().id("t").elementRef("a").fieldRef("b").elementRef(0).gt(2))
.build()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, I like the existing modeling of path expressions for a couple reasons:
- Path is a commonly used term in the PartiQL spec whereas reference is somewhat overloaded (e.g. variable references)
- Modeling the steps as a list with steps ordered left to right rather than a recursive chain seems a bit more intuitive to me (probably since that's what I've often been working with in this code base).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From offline discussion, will model w/ a root and steps but will model the steps using a linked list rather than a java collection list. Also I'm coming around to the naming you had suggested for the path steps. I made some slight adjustments to be a less verbose (i.e. removing Reference
).
partiql-ast/src/main/java/org/partiql/ast/v1/expr/ExprBetween.java
Outdated
Show resolved
Hide resolved
partiql-ast/src/main/java/org/partiql/ast/v1/type/TypeCustom.java
Outdated
Show resolved
Hide resolved
partiql-ast/src/main/java/org/partiql/ast/v1/type/TypeFloat32.java
Outdated
Show resolved
Hide resolved
partiql-ast/src/main/java/org/partiql/ast/v1/type/TypeFloat64.java
Outdated
Show resolved
Hide resolved
partiql-ast/src/main/java/org/partiql/ast/v1/type/TypeVarchar.java
Outdated
Show resolved
Hide resolved
partiql-ast/src/main/java/org/partiql/ast/v1/type/TypeByteString.java
Outdated
Show resolved
Hide resolved
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class TypeAny extends Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd defer to @jpschorr, but this may be a good opportunity for us to use the TypeDynamic
naming since these changes are breaking anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if TypeAny
/TypeDynamic
was used anywhere. I'll delete this class for now. Can always add back once there's a need.
partiql-ast/src/main/java/org/partiql/ast/v1/expr/ExprOverlay.java
Outdated
Show resolved
Hide resolved
partiql-ast/src/main/java/org/partiql/ast/v1/expr/ExprNullIf.java
Outdated
Show resolved
Hide resolved
* TODO docs, equals, hashcode | ||
*/ | ||
public class ExprIsType extends Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm aware, IS [NOT] <type>
is still in the implementation. Are you suggestion to remove it?
/** | ||
* TODO docs, equals, hashcode | ||
*/ | ||
public class ExprDateDiff extends Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really hope we don't need to add this to the AST. Same goes for DATE_ADD
. I wonder if there's a way we can rewrite this to be +/-
for datetimes and intervals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I don't have much context for DATE_ADD
/DATE_DIFF
being added. It's not in the SQL/PartiQL specs. Probably was added in lieu of interval support in the v0.1 version of PLK.
It's somewhat strange that we chose to model the first arg, DatetimeField
, as a keyword rather than a string, which means we needed to add a special parsing rule into the grammar. I'm not opposed to removing it for v1 but have concerns if some customers are relying on the existing syntax already.
Will discuss offline w/ you and other PLK folks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From offline discussion: we'll remove DATE_ADD
and DATE_DIFF
from the AST with the eventual plan to support the +/-
syntax with intervals.
To support customers dependent on these existing functions and special keyword syntax, we will still support parsing the special syntax but
- convert the datetime keyword (e.g.
month
,day
) into a string or - map to different functions for each variant (e.g.
date_add_month
,date_diff_day
)
DATE_ADD
and DATE_DIFF
will still have support in the plan and evaluator.
partiql-ast/src/main/java/org/partiql/ast/v1/expr/ExprCollection.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this! I've gone through most of the files, though several remain. I can re-review once some modifications have been made.
c15cf37
to
e975882
Compare
67c8172
to
252603a
Compare
partiql-ast/src/main/java/org/partiql/ast/v1/SelectProject.java
Outdated
Show resolved
Hide resolved
CROSS-ENGINE-REPORT ❌
Testing Details
Result Details
Now FAILING Tests ❌The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now IGNORED Tests ❌The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. Now Passing Tests179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-469780F). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-REPORT ✅
Testing DetailsResult Details
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to enforce immutable structures, we could write getters (no setters). Or use final. Or, in the javadocs state that properties should not be modified.
partiql-ast/src/main/java/org/partiql/ast/v1/expr/ExprCall.java
Outdated
Show resolved
Hide resolved
I'll save the discussion for mutable vs immutable fields for another PR/discussion. I like immutable structures for most purposes. However I've dealt with some annoying patterns when I needed to modify parts of the AST which is a common use case for tree rewriting. |
CROSS-ENGINE-REPORT ❌
Testing Details
Result Details
Now FAILING Tests ❌The following 1 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now IGNORED Tests ❌The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. Now Passing Tests179 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-DE2BFAF). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-REPORT ✅
Testing DetailsResult Details
|
Hm not sure why the conformance test commenting is publishing a new comment for every commit rather than editing an existing comment. Deleting the stale comments for now. I'll investigate further if this happens for other PRs. |
Relevant Issues
Description
foo.bar
)v1
codeOther Information
Updated Unreleased Section in CHANGELOG: [NO]
Any backward-incompatible changes? [NO]
Any new external dependencies? [NO]
Do your changes comply with the Contributing Guidelines
and Code Style Guidelines? [YES]
License Information
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.