-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: avoid read after create when possible #4041
Conversation
CodSpeed Performance ReportMerging #4041 Summary
|
graph.flag_transactional(); | ||
|
||
let create_node = create::create_record_node(graph, query_schema, model.clone(), data_map)?; | ||
|
||
// Follow-up read query on the write | ||
let read_query = read::find_unique(field, model.clone())?; | ||
let read_node = graph.create_node(Query::Read(read_query)); | ||
|
||
graph.add_result_node(&read_node); | ||
graph.create_edge( | ||
&create_node, | ||
&read_node, | ||
QueryGraphDependency::ProjectedDataDependency( | ||
model.primary_identifier(), | ||
Box::new(move |mut read_node, mut parent_ids| { | ||
let parent_id = match parent_ids.pop() { | ||
Some(pid) => Ok(pid), | ||
None => Err(QueryGraphBuilderError::AssertionError( | ||
"Expected a valid parent ID to be present for create follow-up read query.".to_string(), | ||
)), | ||
}?; | ||
|
||
if let Node::Query(Query::Read(ReadQuery::RecordQuery(ref mut rq))) = read_node { | ||
rq.add_filter(parent_id.filter()); | ||
}; | ||
|
||
Ok(read_node) | ||
}), | ||
), | ||
)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer's note: this is just a chunk of code that already existed, but that is now wrapped in the else
branch of a conditional
}), | ||
), | ||
)?; | ||
if can_use_atomic_create(query_schema, &model, &data_map, &field) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewer's note: this function is defined later
I've noticed we added What about MariaDB (supported since 10.5.0)? It's perfectly fine to me if we consider too niche of a case - just wanted to make sure we are all aware that this feature exists. |
We don't have the connector's version at runtime in the Query Engine, so we can't implement version-specific connector features (yet). This is the main reason why MariaDB isn't supported. It's also not it's own connector. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
13b5b99
to
e6ff9ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any blocking feedback on this. Left a few questions and a couple of comments of internal factoring of the code that are unimportant.
I find the implementation neat, the PR description sterling and the comments useful. Great job, @Weakky!
query-engine/connector-test-kit-rs/query-engine-tests/tests/new/metrics.rs
Show resolved
Hide resolved
if can_use_atomic_create(query_schema, &model, &data_map, &field) { | ||
let create_node = create::atomic_create_record_node(graph, query_schema, model, data_map, field)?; | ||
|
||
graph.add_result_node(&create_node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking: atomic_create_record_node
uses the graph to create nodes internally, and could include this operation. Is there any reason to make them separate?
Also, this branch of the code is pretty well extracted while the else branch inlines node´s creation. Would it make sense to have either both inlined or both extracted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking: atomic_create_record_node uses the graph to create nodes internally, and could include this operation. Is there any reason to make them separate?
Is "this operation" referring to graph.add_result_node
? If so, then the reason is that we might not always want to make it a result node. It very much depends on what workflow we're creating with the graph. While this is the only use-case atm, we could very well have a different operation that uses this atomic_create_record_node
but needs a completely different node as result node.
Also, this branch of the code is pretty well extracted while the else branch inlines node´s creation. Would it make sense to have either both inlined or both extracted?
I'm not sure I understand how the if
branch differs from the else
branch. In both cases, we're inlining nodes creation. The only difference is that the else branch has more stuff going on, such as adding a subsequent read
node. Do you see it differently?
|
||
Ok(QueryResult::Id(Some(res))) | ||
Ok(QueryResult::RecordSelection(Box::new(RecordSelection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is that changing the semantics of the result from QueryResult::Id
to QueryResult::RecordSelection
don't break any internal contract? Aren't there used or eventually serialized different? if so, why do we have the two different enum variants in QueryResult
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both QueryResult::Id
and QueryResult::RecordSelection
can be used as results. Have a look at this code here and here, where we extract results out of both.
I wasn't here when these abstractions were created so I can only assume why we have both. QueryResult::RecordSelection
needs a lot more information as it ends up being serialized. QueryResult::Id
is just a simplified version of RecordSelection
when we know it's only used as a temporary result for another operation (eg: create --(create id)--> read).
This was holding well until now, but we're stretching the abstraction by making create's result now sometimes go through the serialization (in an atomic create), and sometimes not (when we still need a read after the create, for instance).
As said below, I could make create_record
return a RecordSelection
or an Id
based on the type of create we're doing, but I didn't think it was worth it as we're perfectly able to extract the result out of both Id
and RecordSelection
. I do agree that this is all a bit fuzzy atm. It's often the case when slapping new concepts on top of existing abstractions without really evolving them. The same questions are gonna be relevant when I tackle atomic update, which works the exact same way.
My thoughts atm is that the abstractions still covers our need but we might have to change them a bit in the future if we add more subtleties to operations. Until then, I didn't want to prematurely introduce more concepts. With that being said, I'm 100% open to suggestions
|
||
let cr = CreateRecord { | ||
// A regular create record is never used as a result node. Therefore, it's never serialized, so we don't need a name. | ||
name: String::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this is immutable and creating an empty string has zero cost. I was wondering if given the name is not always necessary and should be an Option but it's perfectly fine and more straightforward to be this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with having an Option<String>
for the name is that eventually, that CreateRecord will be transformed into a RecordSelection, which requires a name (even though the ones that don't contain names won't end up serialized). So we'd have to create that String::new
here anyway
I thought about creating an enum of CreateRecord { Atomic(), Regular }
to make that explicit and either return a QueryResult::Id
or a QueryResult::RecordSelection
based on that, but I didn't want to make the ReadOperations::create_record
interface more complex with different types of creates to handle.
I could've also, based on whether there's a name or not, returned a RecordSelection
or an Id
. But it felt hacky and worse than having an empty string.
Overview
Implements a solution for inserts for prisma/prisma#5043.
A create operation is currently always followed by a read to fulfill the query selection set. This leads to 4 queries sent in total:
In some cases, performing a
create
and getting the result back immediately in a single query usingINSERT ... RETURNING
is possible.We can only do so when:
Note that:
INSERT OUTPUT
which makes it fragile to return any type of columnsINSERT ... RETURNING
but the driver does not return column-type information on the way back which can break some coercions we currently perform & need.Benchmarks
Given this datamodel:
And this query:
Here are the results:
Before
After
Query differences
Given the datamodel & query above, here are the queries before:
And after: