Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relationships selected in SQL-based datastores now elide columns that have static values #2096

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

josephschorr
Copy link
Member

@josephschorr josephschorr commented Oct 21, 2024

This vastly reduces data over the wire, as well as deserialization time and memory usage

Fixes #59

@github-actions github-actions bot added area/datastore Affects the storage system area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels Oct 21, 2024
@josephschorr josephschorr force-pushed the rel-struct-sql branch 3 times, most recently from d6104da to 44f2cbb Compare October 22, 2024 16:33
@josephschorr josephschorr marked this pull request as ready for review October 22, 2024 16:49
@josephschorr josephschorr requested review from vroldanbet and a team as code owners October 22, 2024 16:49
@josephschorr josephschorr changed the title Relationships selected in SQL-based datastores should elide columns that have static values Relationships selected in SQL-based datastores now elide columns that have static values Oct 22, 2024
tstirrat15
tstirrat15 previously approved these changes Oct 22, 2024
Copy link
Contributor

@tstirrat15 tstirrat15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. What I saw looked good but I wouldn't mind having another set of eyes on it.

}

// QueryRelationships queries relationships for the given query and transaction.
func QueryRelationships[R Rows, C ~map[string]any](ctx context.Context, queryInfo QueryInfo, sqlStatement string, args []any, span trace.Span, tx Querier[R], withIntegrity bool) (datastore.RelationshipIterator, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the ~ mean in this context?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~T means the set of all types whose underlying type is T

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you did something like type MyType map[string]any, doing ~map[string]any will also accept that and not just explicitly map[string]any

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is exactly what we do

Comment on lines -197 to -199
colIntegrityKeyID,
colIntegrityHash,
colTimestamp,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For myself: this was moved into extraFields on the common.SchemaInformation struct.

Comment on lines +428 to +430
type wrappedTX struct {
tx querier
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What role does this play?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It translates the interface

ColCaveatName string
ColCaveatContext string
PaginationFilterType PaginationFilterType
PlaceholderFolder sq.PlaceholderFormat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Folder?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is this in here because this allows you to push more logic up into the common sql logic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Comment on lines +189 to +191
var resourceObjectType string
var resourceObjectID string
var relation string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean anything that these declarations moved out of the loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means they are on the heap, which isn't good, but it also means we don't need to recalculate them every iteration, which is good

@@ -1573,10 +1573,7 @@ func ConcurrentWriteSerializationTest(t *testing.T, tester DatastoreTester) {
<-waitToFinish
return err
})
if err != nil {
panic(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it this is unnecessary when we're using MustBugF?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in a test, so I changed it to just return the error

@@ -562,17 +654,57 @@ func (tqs QueryExecutor) ExecuteQuery(
limit = *queryOpts.Limit
}

toExecute := query.limit(limit)
if limit < math.MaxInt64 {
query = query.limit(limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this conditional just to reduce what's going over the wire?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, yes

@josephschorr
Copy link
Member Author

Updated

@josephschorr josephschorr force-pushed the rel-struct-sql branch 2 times, most recently from b45f0a7 to 65b4e07 Compare October 28, 2024 15:47
@github-actions github-actions bot added the area/dispatch Affects dispatching of requests label Oct 28, 2024
@josephschorr josephschorr force-pushed the rel-struct-sql branch 2 times, most recently from aaad075 to 334d8ea Compare October 28, 2024 15:58
columns that have static values

This vastly reduces data over the wire, as well as deserialization time and memory usage
@github-actions github-actions bot added area/CLI Affects the command line area/api v1 Affects the v1 API labels Nov 5, 2024
This will allow us to centrally register additional datastore validation that only runs at test time
This validation test acts as a proxy in the memdb testing datastore and validates that the column elision code (which *isn't* used in memdb) matches the static fields to the values returned for all relationships loaded
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api v1 Affects the v1 API area/CLI Affects the command line area/datastore Affects the storage system area/dispatch Affects dispatching of requests area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Speculative: Run transaction rollback asynchronously
3 participants