perf: avoid read after create when possible #4041

Weakky · 2023-06-13T16:58:52Z

Overview

Implements a solution for inserts for prisma/prisma#5043.

A create operation is currently always followed by a read to fulfill the query selection set. This leads to 4 queries sent in total:

Open transaction
Create record
Read result
Commit transaction

In some cases, performing a create and getting the result back immediately in a single query using INSERT ... RETURNING is possible.

We can only do so when:

The connector supports it (Only Postgres & CockroachDB currently have support)
The create consists only of scalar fields of the parent model and no nested operation
The read selection consists only of scalars fields of the parent model and no relation fetch

Note that:

SQL Server is not supported 🔴. While it can technically be implemented but is risky to be supported as we have to use a hack to support INSERT OUTPUT which makes it fragile to return any type of columns
SQLite is not supported 🔴. It also supports INSERT ... RETURNING but the driver does not return column-type information on the way back which can break some coercions we currently perform & need.

Benchmarks

Given this datamodel:

model Movie {
  id          Int    @id @default(autoincrement())
  title       String
  year        Int
  description String

  cast    Actor[] // m2m
  reviews Review[] // one2m
}

And this query:

await prisma.movie.create({
  data: {
    year: 2023,
    title: '<inserted_movie>',
    description: '<placeholder_descrition>',
  },
});

Here are the results:

Before

cpu: Apple M1 Max
runtime: node v18.12.1 (arm64-darwin)

benchmark               time (avg)             (min … max)       p75       p99      p995
---------------------------------------------------------- -----------------------------
• prisma.movie.create(...)
---------------------------------------------------------- -----------------------------
PG (regular joins)    1.12 ms/iter  (625.29 µs … 23.42 ms) 908.21 µs   6.31 ms  22.18 ms
Prisma                4.52 ms/iter     (2.51 ms … 28.3 ms)   3.89 ms  27.89 ms   28.3 ms

summary for prisma.movie.create(...)
  PG (regular joins)
   4.05x faster than Prisma

After

cpu: Apple M1 Max
runtime: node v18.12.1 (arm64-darwin)

benchmark               time (avg)             (min … max)       p75       p99      p995
---------------------------------------------------------- -----------------------------
• prisma.movie.create(...)
---------------------------------------------------------- -----------------------------
PG (regular joins)    1.08 ms/iter  (652.38 µs … 22.85 ms) 882.83 µs   6.26 ms  22.48 ms
Prisma                1.42 ms/iter  (755.79 µs … 23.08 ms)   1.31 ms   6.88 ms  22.85 ms

summary for prisma.movie.create(...)
  PG (regular joins)
   1.32x faster than Prisma

Query differences

Given the datamodel & query above, here are the queries before:

BEGIN
INSERT INTO "z_imdb_bench"."Movie" ("title","year","description") VALUES ($1,$2,$3) RETURNING "z_imdb_bench"."Movie"."id"
SELECT "z_imdb_bench"."Movie"."id", "z_imdb_bench"."Movie"."title", "z_imdb_bench"."Movie"."year", "z_imdb_bench"."Movie"."description" FROM "z_imdb_bench"."Movie" WHERE "z_imdb_bench"."Movie"."id" = $1 LIMIT $2 OFFSET $3
COMMIT

And after:

INSERT INTO "z_imdb_bench"."Movie" ("title","year","description") VALUES ($1,$2,$3) RETURNING "z_imdb_bench"."Movie"."id", "z_imdb_bench"."Movie"."title", "z_imdb_bench"."Movie"."year", "z_imdb_bench"."Movie"."description"

codspeed-hq · 2023-06-13T17:11:45Z

CodSpeed Performance Report

Merging #4041 perf/avoid-read-after-create (e74b2ba) will not alter performances.

Summary

🔥 0 improvements
❌ 0 regressions
✅ 11 untouched benchmarks

🆕 0 new benchmarks
⁉️ 0 dropped benchmarks

query-engine/core/src/query_graph_builder/read/mod.rs

query-engine/core/src/query_graph_builder/read/utils.rs

jkomyno · 2023-06-21T15:30:14Z

query-engine/core/src/query_graph_builder/write/create.rs

+        graph.flag_transactional();
+
+        let create_node = create::create_record_node(graph, query_schema, model.clone(), data_map)?;
+
+        // Follow-up read query on the write
+        let read_query = read::find_unique(field, model.clone())?;
+        let read_node = graph.create_node(Query::Read(read_query));
+
+        graph.add_result_node(&read_node);
+        graph.create_edge(
+            &create_node,
+            &read_node,
+            QueryGraphDependency::ProjectedDataDependency(
+                model.primary_identifier(),
+                Box::new(move |mut read_node, mut parent_ids| {
+                    let parent_id = match parent_ids.pop() {
+                        Some(pid) => Ok(pid),
+                        None => Err(QueryGraphBuilderError::AssertionError(
+                            "Expected a valid parent ID to be present for create follow-up read query.".to_string(),
+                        )),
+                    }?;
+
+                    if let Node::Query(Query::Read(ReadQuery::RecordQuery(ref mut rq))) = read_node {
+                        rq.add_filter(parent_id.filter());
+                    };
+
+                    Ok(read_node)
+                }),
+            ),
+        )?;


Reviewer's note: this is just a chunk of code that already existed, but that is now wrapped in the else branch of a conditional

jkomyno · 2023-06-21T15:43:56Z

query-engine/core/src/query_graph_builder/write/create.rs

-            }),
-        ),
-    )?;
+    if can_use_atomic_create(query_schema, &model, &data_map, &field) {


Reviewer's note: this function is defined later

jkomyno · 2023-06-21T15:48:22Z

I've noticed we added InsertReturning to Postgres and CockroachDB, as the PR description states that only these drivers support INSERT INTO ... RETURNING ....

What about MariaDB (supported since 10.5.0)? It's perfectly fine to me if we consider too niche of a case - just wanted to make sure we are all aware that this feature exists.

Weakky · 2023-06-21T15:55:32Z

I've noticed we added InsertReturning to Postgres and CockroachDB, as the PR description states that only these drivers support INSERT INTO ... RETURNING ....

What about MariaDB (supported since 10.5.0)? It's perfectly fine to me if we consider too niche of a case - just wanted to make sure we are all aware that this feature exists.

We don't have the connector's version at runtime in the Query Engine, so we can't implement version-specific connector features (yet). This is the main reason why MariaDB isn't supported. It's also not it's own connector.

jkomyno

🚀

miguelff

I don't have any blocking feedback on this. Left a few questions and a couple of comments of internal factoring of the code that are unimportant.

I find the implementation neat, the PR description sterling and the comments useful. Great job, @Weakky!

psl/builtin-connectors/src/sqlite_datamodel_connector.rs

query-engine/connector-test-kit-rs/query-engine-tests/tests/new/metrics.rs

query-engine/core/src/query_graph_builder/write/create.rs

miguelff · 2023-06-22T11:24:18Z

query-engine/core/src/query_graph_builder/write/create.rs

+    if can_use_atomic_create(query_schema, &model, &data_map, &field) {
+        let create_node = create::atomic_create_record_node(graph, query_schema, model, data_map, field)?;
+
+        graph.add_result_node(&create_node);


nitpicking: atomic_create_record_node uses the graph to create nodes internally, and could include this operation. Is there any reason to make them separate?

Also, this branch of the code is pretty well extracted while the else branch inlines node´s creation. Would it make sense to have either both inlined or both extracted?

nitpicking: atomic_create_record_node uses the graph to create nodes internally, and could include this operation. Is there any reason to make them separate?

Is "this operation" referring to graph.add_result_node ? If so, then the reason is that we might not always want to make it a result node. It very much depends on what workflow we're creating with the graph. While this is the only use-case atm, we could very well have a different operation that uses this atomic_create_record_node but needs a completely different node as result node.

Also, this branch of the code is pretty well extracted while the else branch inlines node´s creation. Would it make sense to have either both inlined or both extracted?

I'm not sure I understand how the if branch differs from the else branch. In both cases, we're inlining nodes creation. The only difference is that the else branch has more stuff going on, such as adding a subsequent read node. Do you see it differently?

miguelff · 2023-06-22T12:00:55Z

query-engine/core/src/interpreter/query_interpreters/write.rs


-    Ok(QueryResult::Id(Some(res)))
+    Ok(QueryResult::RecordSelection(Box::new(RecordSelection {


How is that changing the semantics of the result from QueryResult::Id to QueryResult::RecordSelection don't break any internal contract? Aren't there used or eventually serialized different? if so, why do we have the two different enum variants in QueryResult?

Both QueryResult::Id and QueryResult::RecordSelection can be used as results. Have a look at this code here and here, where we extract results out of both.

I wasn't here when these abstractions were created so I can only assume why we have both. QueryResult::RecordSelection needs a lot more information as it ends up being serialized. QueryResult::Id is just a simplified version of RecordSelection when we know it's only used as a temporary result for another operation (eg: create --(create id)--> read).

This was holding well until now, but we're stretching the abstraction by making create's result now sometimes go through the serialization (in an atomic create), and sometimes not (when we still need a read after the create, for instance).

As said below, I could make create_record return a RecordSelection or an Id based on the type of create we're doing, but I didn't think it was worth it as we're perfectly able to extract the result out of both Id and RecordSelection. I do agree that this is all a bit fuzzy atm. It's often the case when slapping new concepts on top of existing abstractions without really evolving them. The same questions are gonna be relevant when I tackle atomic update, which works the exact same way.

My thoughts atm is that the abstractions still covers our need but we might have to change them a bit in the future if we add more subtleties to operations. Until then, I didn't want to prematurely introduce more concepts. With that being said, I'm 100% open to suggestions

miguelff · 2023-06-22T12:06:25Z

query-engine/core/src/query_graph_builder/write/create.rs

+
+    let cr = CreateRecord {
+        // A regular create record is never used as a result node. Therefore, it's never serialized, so we don't need a name.
+        name: String::new(),


Yup, this is immutable and creating an empty string has zero cost. I was wondering if given the name is not always necessary and should be an Option but it's perfectly fine and more straightforward to be this way.

The issue with having an Option<String> for the name is that eventually, that CreateRecord will be transformed into a RecordSelection, which requires a name (even though the ones that don't contain names won't end up serialized). So we'd have to create that String::new here anyway

I thought about creating an enum of CreateRecord { Atomic(), Regular } to make that explicit and either return a QueryResult::Id or a QueryResult::RecordSelection based on that, but I didn't want to make the ReadOperations::create_record interface more complex with different types of creates to handle.

I could've also, based on whether there's a name or not, returned a RecordSelection or an Id. But it felt hacky and worse than having an empty string.

perf: avoid read after create when possible

44a7455

Weakky added 16 commits June 13, 2023 19:12

add cockroachdb

64de1ca

fix metrics test

76c8dd7

fix low_precision_datetime test

0461e71

disable sqlite

27ddfca

fix mssql tests

60dfeda

fix mssql quaint test

5cba79f

fix quaint mssql test

bf938af

fix sqlite test

1ce5722

fix mssql tests

52e3eac

fix mssql tests

570c716

fix schema engine test

a5d8490

fix mssql tests

45f3b0b

fix quaint tests

117068f

fix schema engine test

35fd7e6

[skip ci] remove dbg

9a4662e

disable optimization on mssql

90eb050

Weakky added this to the 5.0.0 milestone Jun 20, 2023

cleanup

66ccdaf

Weakky marked this pull request as ready for review June 21, 2023 14:32

Weakky requested a review from a team as a code owner June 21, 2023 14:32

jkomyno reviewed Jun 21, 2023

View reviewed changes

query-engine/core/src/query_graph_builder/read/mod.rs Outdated Show resolved Hide resolved

jkomyno reviewed Jun 21, 2023

View reviewed changes

query-engine/core/src/query_graph_builder/read/utils.rs Outdated Show resolved Hide resolved

jkomyno reviewed Jun 21, 2023

View reviewed changes

review fix

e6ff9ae

jkomyno approved these changes Jun 21, 2023

View reviewed changes

Weakky force-pushed the perf/avoid-read-after-create branch 2 times, most recently from 13b5b99 to e6ff9ae Compare June 22, 2023 09:54

miguelff approved these changes Jun 22, 2023

View reviewed changes

review fix

e74b2ba

Jolg42 modified the milestones: 5.0.0, 5.1.0 Jul 10, 2023

Weakky merged commit 96f20d9 into main Jul 12, 2023
37 checks passed

Weakky deleted the perf/avoid-read-after-create branch July 12, 2023 16:32

Jolg42 mentioned this pull request Jul 17, 2023

chore(deps): update engines to 5.1.0-3.0b746fef790fd12bc5b02f4f8bad292b33ddb485 prisma/prisma#20231

Merged

Weakky mentioned this pull request Aug 7, 2023

Unnecessary SELECT statement For create,update prisma/prisma#12541

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: avoid read after create when possible #4041

perf: avoid read after create when possible #4041

Weakky commented Jun 13, 2023 •

edited

Loading

codspeed-hq bot commented Jun 13, 2023 •

edited

Loading

jkomyno Jun 21, 2023 •

edited

Loading

jkomyno Jun 21, 2023

jkomyno commented Jun 21, 2023

Weakky commented Jun 21, 2023 •

edited

Loading

jkomyno left a comment

miguelff left a comment •

edited

Loading

miguelff Jun 22, 2023

Weakky Jun 22, 2023

miguelff Jun 22, 2023

Weakky Jun 22, 2023 •

edited

Loading

miguelff Jun 22, 2023

Weakky Jun 22, 2023 •

edited

Loading


		Ok(QueryResult::Id(Some(res)))
		Ok(QueryResult::RecordSelection(Box::new(RecordSelection {

perf: avoid read after create when possible #4041

perf: avoid read after create when possible #4041

Conversation

Weakky commented Jun 13, 2023 • edited Loading

Overview

Benchmarks

Query differences

codspeed-hq bot commented Jun 13, 2023 • edited Loading

CodSpeed Performance Report

Summary

jkomyno Jun 21, 2023 • edited Loading

Choose a reason for hiding this comment

jkomyno Jun 21, 2023

Choose a reason for hiding this comment

jkomyno commented Jun 21, 2023

Weakky commented Jun 21, 2023 • edited Loading

jkomyno left a comment

Choose a reason for hiding this comment

miguelff left a comment • edited Loading

Choose a reason for hiding this comment

miguelff Jun 22, 2023

Choose a reason for hiding this comment

Weakky Jun 22, 2023

Choose a reason for hiding this comment

miguelff Jun 22, 2023

Choose a reason for hiding this comment

Weakky Jun 22, 2023 • edited Loading

Choose a reason for hiding this comment

miguelff Jun 22, 2023

Choose a reason for hiding this comment

Weakky Jun 22, 2023 • edited Loading

Choose a reason for hiding this comment

Weakky commented Jun 13, 2023 •

edited

Loading

codspeed-hq bot commented Jun 13, 2023 •

edited

Loading

jkomyno Jun 21, 2023 •

edited

Loading

Weakky commented Jun 21, 2023 •

edited

Loading

miguelff left a comment •

edited

Loading

Weakky Jun 22, 2023 •

edited

Loading

Weakky Jun 22, 2023 •

edited

Loading