Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6355] RelToSqlConverter[ORDER BY] generates an incorrect order by when NULLS LAST is used in non-projected field #3754

Merged
merged 1 commit into from
Apr 10, 2024

Conversation

bvolpato
Copy link
Contributor

@bvolpato bvolpato commented Apr 6, 2024

There are some issues in the needNewSubQuery -> hasSortByOrdinal method in SqlImplementor when using RelToSqlConverter specifying NULLS LAST, for some particular queries.

The method hasSortByOrdinal fails to recognize that the order might contain nested calls. For example, we can easily produce SqlSelect -> SqlNodeList order -> SqlBasicCall (nulls last) -> SqlBasicCall (desc) -> SqlNumericLiteral, but it can only match correctly if there's a single SqlBasicCall in the chain.

With that, this query:

select "product_id"
from "product"
where "net_weight" is not null
group by "product_id"
order by MAX("net_weight") desc nulls last 

Resolves to an invalid query. Notice the order by 2, however only one field was projected (product_id).

SELECT "product_id"
FROM "foodmart"."product"
WHERE "net_weight" IS NOT NULL
GROUP BY "product_id"
ORDER BY 2 DESC NULLS LAST 

With my fix, it produces the correct SELECT with subquery:

SELECT "product_id"
FROM (SELECT "product_id", MAX("net_weight") AS "EXPR$1"
FROM "foodmart"."product"
WHERE "net_weight" IS NOT NULL
GROUP BY "product_id"
ORDER BY 2 DESC NULLS LAST) AS "t3"

(I've also provided some context at https://issues.apache.org/jira/browse/CALCITE-6355).

@caicancai
Copy link
Member

@bvolpato Calcite's jira message does not allow the Fix keyword to begin with

@bvolpato bvolpato force-pushed the fix-order-by-desc-nulls-last branch from 94003ba to 683fbb2 Compare April 6, 2024 14:26
@bvolpato bvolpato changed the title [CALCITE-6355] Fix subquery check when using ORDER BY X DESC NULLS LAST [CALCITE-6355] Improve subquery check to consider chained ORDER BY calls Apr 6, 2024
@bvolpato bvolpato force-pushed the fix-order-by-desc-nulls-last branch 3 times, most recently from 6d35537 to 1ec65f4 Compare April 7, 2024 23:47
@bvolpato
Copy link
Contributor Author

bvolpato commented Apr 7, 2024

Attempted to fix the checker errors (added annotations & handled nullness). It hadn't failed ./gradlew build before, needed to explicitly -PenableCheckerframework.

@bvolpato bvolpato force-pushed the fix-order-by-desc-nulls-last branch from 1ec65f4 to 61c1b8b Compare April 8, 2024 01:06
@bvolpato
Copy link
Contributor Author

bvolpato commented Apr 8, 2024

This test is flaky / unrelated to my changes:

            Caused by: com.github.dockerjava.api.exception.DockerClientException: Could not pull image: [DEPRECATION NOTICE] Docker Image Format v1 and Docker Image manifest version 2, schema 1 support is disabled by default and will be removed in an upcoming release. Suggest the author of docker.io/library/redis:2.8.19 to upgrade the image to the OCI Format or Docker Image manifest v2, schema 2. More information at https://docs.docker.com/go/deprecated-image-specs/

                at app//com.github.dockerjava.api.command.PullImageResultCallback.checkDockerClientPullSuccessful(PullImageResultCallback.java:97)

                at app//com.github.dockerjava.api.command.PullImageResultCallback.throwFirstError(PullImageResultCallback.java:112)

                at app//com.github.dockerjava.api.async.ResultCallbackTemplate.awaitCompletion(ResultCallbackTemplate.java:93)

                at app//org.testcontainers.images.TimeLimitedLoggedPullImageResultCallback.awaitCompletion(TimeLimitedLoggedPullImageResultCallback.java:58)

                at app//org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:97)

                ... 46 more



FAILURE  10.5sec,    1 completed,   1 failed,   0 skipped, org.apache.calcite.adapter.redis.RedisAdapterCaseBase

FAILURE  12.6sec,    2 completed,   1 failed,   1 skipped, Gradle Test Run :redis:test

@bvolpato bvolpato force-pushed the fix-order-by-desc-nulls-last branch 13 times, most recently from d047d03 to 28b1ff3 Compare April 8, 2024 18:23
Copy link
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise this looks pretty good.

@@ -2162,6 +2162,48 @@ private SqlDialect nonOrdinalDialect() {
.ok(prestoExpected);
}

/**
* Test case for the base case of
* <a href="https://issues.apache.org/jira/browse/CALCITE-6355">[CALCITE-6355]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far we have required the JIRA case title, the PR case title, and the comments in the code to use the exact same text.
This makes it easy to understand how these are related.
Can you please follow these guidelines?
There are lots of examples.

Copy link
Contributor Author

@bvolpato bvolpato Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed this. I've adjusted the title here -- thank you. Do I have to reword commits? Or the title of the PR will be used when merged / squashed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually after the PR is accepted you will have to squash all the commits to just one, and that one will have to have the exact same message as the JIRA title, and this PR title. Moreover, the JavaDoc for the test cases related to the fix also have to have a relatively rigid format, providing the link to the jira case and the full text in the jira title.

@bvolpato bvolpato changed the title [CALCITE-6355] Improve subquery check to consider chained ORDER BY calls [CALCITE-6355] RelToSqlConverter[ORDER BY] generates an incorrect order by when NULLS LAST is used in non-projected field Apr 9, 2024
Copy link
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me, but I am not very familiar with this part of Calcite.
If anyone else who knows this better could take a look I would feel more confident.
The test shows that this is doing something good, though.

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Apr 10, 2024
@mihaibudiu
Copy link
Contributor

To merge this you will have to squash the commits into a single commit, and the commit message has to be exactly the JIRA title. Take a look at the other merged commits into main for examples.

…er by when NULLS LAST is used in non-projected field
@bvolpato bvolpato force-pushed the fix-order-by-desc-nulls-last branch from 28b1ff3 to be8917d Compare April 10, 2024 18:50
@bvolpato
Copy link
Contributor Author

To merge this you will have to squash the commits into a single commit, and the commit message has to be exactly the JIRA title. Take a look at the other merged commits into main for examples.

Done, thanks for your help @mihaibudiu!

Copy link

sonarcloud bot commented Apr 10, 2024

@mihaibudiu mihaibudiu merged commit 4ce1e16 into apache:main Apr 10, 2024
17 checks passed
@bvolpato bvolpato deleted the fix-order-by-desc-nulls-last branch April 10, 2024 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants