Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full Outer Join Does not Properly Display Right-Values on Joined Column Without Match #5184

Open
nbauernfeind opened this issue Feb 22, 2024 · 1 comment · May be fixed by #5190
Open

Full Outer Join Does not Properly Display Right-Values on Joined Column Without Match #5184

nbauernfeind opened this issue Feb 22, 2024 · 1 comment · May be fixed by #5190
Assignees
Labels
bug Something isn't working core Core development tasks March2024 query engine
Milestone

Comments

@nbauernfeind
Copy link
Member

nbauernfeind commented Feb 22, 2024

Right values are not being provided in a full outer join when there is no matching left group.

from deephaven.experimental.outer_joins import full_outer_join, left_outer_join
from deephaven import empty_table

left = empty_table(5).update(["X1 = ii", "Y = Math.sin(X1)"])
right = empty_table(5).update(["X2 = ii * 2", "Y = Math.cos(X2)"])

result = full_outer_join(l_table=left, r_table=right, on=["X1 = X2"], joins=["Z = Y"])
Screenshot 2024-02-22 at 8 37 47 AM

It does work, however, if you first rename the right column:

from deephaven.experimental.outer_joins import full_outer_join, left_outer_join
from deephaven import empty_table

left = empty_table(5).update(["X1 = ii", "Y = Math.sin(X1)"])
right = empty_table(5).update(["X2 = ii * 2", "Y = Math.cos(X2)"])

result = full_outer_join(l_table=left, r_table=right.rename_columns(["X1 = X2"]), on=["X1"], joins=["Z = Y"])
Screenshot 2024-02-22 at 8 38 01 AM

Thanks to @robbcamera for reporting.

@nbauernfeind nbauernfeind added bug Something isn't working query engine core Core development tasks labels Feb 22, 2024
@nbauernfeind nbauernfeind self-assigned this Feb 22, 2024
@nbauernfeind nbauernfeind linked a pull request Feb 23, 2024 that will close this issue
@nbauernfeind
Copy link
Member Author

Initial approach in #5190 wasn't quite good enough. This is the approach we've agreed on now:

  • Do not include RHS columns by default (do not merge by default as is today)
  • Add a union parameter that uses LHS column names and merges RHS into them (new feature)
  • Allow RHS renames via the join-list (if the join-list is empty it brings all RHS columns, not jut key-columns; existing behaviors)
  • Fail if joined columns have any name collisions (existing behavior)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working core Core development tasks March2024 query engine
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants