feat: unnesting arbitrary subqueries (likely broken) #180

Merged on Oct 24, 2024 (15 commits)

Conversation

jurplel
Member

@jurplel jurplel commented May 1, 2024

This is somewhere between a proof of concept and a draft; work is still heavily in progress. It already parses and fully unnests a subset of correlated and uncorrelated subqueries, although I am uncertain about correctness.

TODO:

  • Formal testing
  • EXISTS clauses
  • IN clauses
  • ANY/ALL clauses
  • Correctness issue with COUNT(*) (requires adding left outer join to plan)
  • Move some/all of this to rewriting stage to support multiple subqueries/ordering operations
  • “Sideways information passing” (subplans are duplicated now instead of making a DAG)
    • It seems that a DAG representation is only supported by looking for groups that appear the same. It looks to me that the cloned branches generated by this PR are indeed marked with the same group ID. I marked this bullet point as completed with this in mind.
  • Support more pushdowns (e.g. limit, joins)
  • Optimizations from the paper are all missing (Out of scope?)
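
For context on the COUNT(*) item above, here is a minimal sketch of the classic "count bug" that appears when a correlated scalar subquery is unnested into a join. It is not code from this PR: the toy tables (`customers`, `orders`) and all function names are hypothetical, and the joins are simulated over plain vectors rather than optimizer plan nodes. It only illustrates why an inner join loses the rows whose count should be 0, and why a left outer join that maps a missing match to 0 restores them.

```rust
use std::collections::HashMap;

/// Hypothetical toy tables: customers(id) and orders(customer_id).
fn sample_data() -> (Vec<u32>, Vec<u32>) {
    let customers = vec![1, 2, 3];
    let orders = vec![1, 1, 3]; // customer 2 has no orders
    (customers, orders)
}

/// Reference semantics of the correlated query:
/// SELECT id, (SELECT COUNT(*) FROM orders WHERE customer_id = id) FROM customers
fn correlated(customers: &[u32], orders: &[u32]) -> Vec<(u32, u64)> {
    customers
        .iter()
        .map(|&c| (c, orders.iter().filter(|&&o| o == c).count() as u64))
        .collect()
}

/// The decorrelated subquery side: aggregate orders once, grouped by key.
fn counts_by_customer(orders: &[u32]) -> HashMap<u32, u64> {
    let mut counts = HashMap::new();
    for &o in orders {
        *counts.entry(o).or_insert(0) += 1;
    }
    counts
}

/// Naive unnesting with an inner join: customers without orders vanish.
fn unnested_inner(customers: &[u32], orders: &[u32]) -> Vec<(u32, u64)> {
    let counts = counts_by_customer(orders);
    customers
        .iter()
        .filter_map(|&c| counts.get(&c).map(|&n| (c, n)))
        .collect()
}

/// Unnesting with a left outer join: a missing match becomes a count of 0.
fn unnested_left_outer(customers: &[u32], orders: &[u32]) -> Vec<(u32, u64)> {
    let counts = counts_by_customer(orders);
    customers
        .iter()
        .map(|&c| (c, counts.get(&c).copied().unwrap_or(0)))
        .collect()
}

fn main() {
    let (customers, orders) = sample_data();
    println!("correlated:      {:?}", correlated(&customers, &orders));
    println!("inner join:      {:?}", unnested_inner(&customers, &orders));
    println!("left outer join: {:?}", unnested_left_outer(&customers, &orders));
}
```

In this sketch the inner-join variant returns two rows instead of three, which is the kind of mismatch the TODO item refers to; the left-outer-join variant matches the correlated result.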

@jurplel jurplel changed the title Unnesting Arbitrary Queries (Highly WIP) [WIP] Unnesting Arbitrary Subqueries Aug 12, 2024
@skyzh skyzh changed the title [WIP] Unnesting Arbitrary Subqueries feat: unnesting Arbitrary Subqueries (likely broken) Oct 24, 2024
@skyzh skyzh changed the title feat: unnesting Arbitrary Subqueries (likely broken) feat: unnesting arbitrary subqueries (likely broken) Oct 24, 2024
@skyzh skyzh marked this pull request as ready for review October 24, 2024 01:40
@skyzh skyzh self-requested a review October 24, 2024 01:40
Member

@skyzh skyzh left a comment


rubber stamp approval -- let's break things and fix things

@skyzh
Member

skyzh commented Oct 24, 2024

TODO on the tooling side:

  • with_logical is super confusing: it means "with the DataFusion logical optimizer". We should either document it or give it a human-readable name.
  • I think now we need explain_after_heuristic so that we can print out a so-called optimized logical plan (subquery un-nesting happens in the heuristic optimizer).
  • Probably also need to fix the heuristic optimizer things?
  • Also we should consider bumping the compiler version and upgrading to the Rust 2024 edition


Signed-off-by: Alex Chi <[email protected]>
@skyzh skyzh merged commit 5065c42 into main Oct 24, 2024
1 check passed
@skyzh skyzh deleted the bowad/subquery-unnest branch October 24, 2024 02:12
2 participants