-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft][Please review] Make Paginator use simpler queries when there are no sql joins used (#8278) #11595
base: 3.3.x
Are you sure you want to change the base?
[Draft][Please review] Make Paginator use simpler queries when there are no sql joins used (#8278) #11595
Conversation
…octrine#8278) This is work in progress.
…octrine#8278) This is work in progress.
src/Tools/Pagination/Paginator.php
Outdated
} | ||
|
||
// When not joining onto *ToMany relations, then do not use the more complex CountOutputWalker. | ||
// phpcs:ignore SlevomatCodingStandard.ControlStructures.UselessIfConditionWithReturn.UselessIfCondition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I think it's better not to enforce the failing coding standard on that if-condition because otherwise it's a multiline return
abomination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear to me why the return
abomination would be worse than the if
abomination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The way I see it is that this function is a set of if-statements that each check for a very individual use case. If none of the statements "wants to make a decision", then the default "return true;" is returned.
- The if-condition for the new if-statement is quite elaborate, so I didn't want it to be the last statement of the function.
- The next dev that wants to add more if-statements to this function, would have to "un-return" my if-statement. This argument is my common argument for all phpstan rules that enforce "return statement code style" rules, which often in my mind are "false positives".
These points are semantics + I'm the "guest" here, so I've merged the last two return statements, as per phpstan detection.
Triggering a new build to address 4. , #11596 should fix it. |
src/Tools/Pagination/Paginator.php
Outdated
public function __construct( | ||
Query|QueryBuilder $query, | ||
private readonly bool $fetchJoinCollection = true, | ||
bool|null $fetchJoinCollection = true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would revert this and instead introduce a static method.
public static function newWithAutoDetection(QueryBuilder $queryBuilder): self
{
// do the autodetection here
// use the unaltered constructor here
}
That way:
- You remove the possibility of passing a
Query
instance andnull
to the constructor. - You don't complexify the constructor that much.
- You don't rely on
null
for communicating failure. - You can reduce the comments a bit because you don't need to explain how both parameters related to each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I've introduced the static function as you advised and the constructor returned to its original form in its entirety. Thanks.
src/Tools/Pagination/Paginator.php
Outdated
} | ||
|
||
// When not joining onto *ToMany relations, then do not use the more complex CountOutputWalker. | ||
// phpcs:ignore SlevomatCodingStandard.ControlStructures.UselessIfConditionWithReturn.UselessIfCondition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear to me why the return
abomination would be worse than the if
abomination.
src/Tools/Pagination/Paginator.php
Outdated
if ( | ||
$forCountQuery | ||
&& $this->queryStyleAutoDetectionEnabled | ||
&& ! $this->fetchJoinCollection === false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't compare boolean variable with boolean constants
&& ! $this->fetchJoinCollection === false | |
&& $this->fetchJoinCollection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's surprising to me that I left that in the code, despite reviewing it twice myself. Thanks.
…octrine#8278) This is work in progress.
$countQuery->setHint(CountWalker::HINT_DISTINCT, true); | ||
$hintDistinctDefaultTrue = true; | ||
|
||
// When not joining onto *ToMany relations, then use a simpler COUNT query in the CountWalker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fetchJoinCollection
is not only about whether we join on a ToMany relation, but whether we include this relation in what get fetched by the ORM. So the condition you use does not match what the comment says.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. It appears I misunderstood the purpose of that variable.
I gave a thought about what to do with it, and I initially concluded that that variable is just misnamed. Consider the following:
- The
toIterator()
's main if-statement (link) executes the extra "SELECT DISTINCT" subquery only iffetchJoinCollection
is true. Otherwise, the original query is executed as-provided (with an OFFSET and LIMIT attached to it). Note that iffetchJoinCollection
is supposed to be true only when (a) the original query joins onto a *ToMany relation AND only when (b) the query fetch-loads that relation, thentoInterator()
would not run the extra "SELECT DISTINCT" subquery when the query does the first without the second, which would produce an incorrect result for the "current pagination page". The result would be incorrect because even though the *ToMany relation is not fetch-loaded, the database would still return (and paginate on) "duplicate records" of the entities that are fetch-loaded (which most likely is only the "root" entity).
In other words, both the following DQL queries require the extra "SELECT DISTINCT" subquery to happen in the toIterator()
, otherwise the result is incorrect:
SELECT u FROM User u JOIN u.tags ut;
SELECT u, ut FROM User u JOIN u.tags ut;
- The docblocks for the
fetchJoinCollection
consistently say the following:
/** @param bool $fetchJoinCollection Whether the query joins a collection (true by default). */
// (...)
/**
* Returns whether the query joins a collection.
*
* @return bool Whether the query joins a collection.
*/
public function getFetchJoinCollection(): bool {}
It appears then that the docblocks are under an impression that that variable is "true" regardless of whether a *ToMany relation is fetch-loaded or not.
Do you agree that that instance variable is not named correctly (or the author incorrectly thought that not fetch-loading a *ToMany relation is immune to the "duplicated records being returned from the database" problem)? Would you agree for me to just rename that instance variable to something like: "queryHasToManyJoins"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, maybe it is indeed badly named.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there is a case where joined collections that are not selected will not impact the count or the iteration: when using SELECT DISTINCT
in the query (see SelectClause::$isDistinct
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just so that I understand: you're trying to say that running the following DQLs without an "outputWalker" would be fine because it would produce the correct results, therefore the literal meaning of fetchJoinCollection
is the correct meaning?
SELECT DISTINCT u FROM User u JOIN u.userGroups;
SELECT COUNT(DISTINCT u) FROM User u JOIN u.userGroups;
I agree, however note that in the current code there is nothing that would ensure that DISTINCT is added/present in the toInterator()
's else
branch of the main if-else-statement. I.e. what you say is correct, but the current code would still either not work, or work by an extreme fluke (i.e. because someone did add a DISTINCT keyword to their $query
prior to using the Paginator). Based on this, I would still say that fetchJoinCollection
should just be renamed to reflect the fact that it's actual present meaning is that " whether the query contains ToMany joins (true/false)".
In case there is a legitimate worry that there might be someone in the userspace that does the (undocumented) due diligence and adds the DISTINCT keyword to their queries when they set fetchJoinCollection
to false, and we don't want to break that code, then I can introduce a new queryHasToManyJoins
instance variable, and use that instead of fetchJoinCollection
(in which case, I would also need to find out what I need to default fetchJoinCollection
to, but that would not be an unsurmountable problem).
Other that that, I admit that there is a potential future improvement, where useOutputWalkers(forCountQuery: true)
could still returns false (i.e. saying that the "simple" count query should be used) even when queryHasToManyJoins
is true, however fetchJoinCollection
is false, and we know that a COUNT(DISTINCT foo)
is going to be used.
So, are you ok for me to rename that instance variable, or would you rather that I should not touch it, and instead introduce a new instance variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a lot to take in for me.
After a substantial amount of thought, I would like to propose and get your opinion on the following.
-
Rename the
$fetchJoinCollection
to$query(Potentially)ProducesDuplicateRows: bool
.
1.1. Why: The new name expresses the "root of the problem", and does not go into details about "how it comes about". It also succinctly expresses why pagination is not a trivial task.
1.2. What it achieves: it allows the end user to correctly express their confidence (or lack of it) in whether their query produces duplicate rows (which necessitate that the Paginator should run more complex queries). In particular: it puts the onus on the end user to ensure that this "declaration" (in particular: when they "declare false") is accurate.
1.3. What it achieves: In case the end user's query has aToMany
join, but it doesn't select any fields from it (or only selects SQL aggregate functions that use those fields), and the query uses a "SELECT DISTINCT" (or an appropriate GROUP BY), then the end user can succinctly inform the Paginator that their query is safe for the "fast result & count queries" without explaining the minute details about what exactly allows those fast queries to be used.
1.4. Bonus: It would "fix" what I said somewhere above that currently when$fetchJoinCollection
is false, then thetoIterator()
runs the "fast result query", however there is nothing that indicates that the end user did the due diligence of usingSELECT DISTINCT
. The new name of$queryProducesDuplicateRows=false
would in this case communicate the intended use case (and requirements) to the end user in a more complete way than$fetchJoinCollection
.
1.5. Breaking change handling: thegetFetchJoinCollection()
getter would be preserved but deprecated. A new appropriate getter would be introduced and recommended as a replacement.
1.6. Why: This new name would resolve the review comment that was raised by you in relation to line 257 in the PR.
1.7. Side note: "$fetchJoinCollection" is just a terrible name, as it suggests that the Paginator would do something to the joined collections in the query. -
Split
$useOutputWalkers
into$useResultQueryOutputWalker
and$useCountQueryOutputWalker
.
2.1. Why: Apart from what you said, splitting this config variable just makes sense. The two output walkers (i.e. on the result query and the count query) have different reasons to be enabled/disabled.
2.2. Breaking change handling: the(get|set)UseOutputWalkers()
functions would be preserved but deprecated. A new set of functions (4 in total) would be introduced and recommended as a replacement.
2.3. Downside: There would effectively be one more pair of public functions to (optionally) configure the Paginator. But again: it should be viewed as a long overdue code change, rather than an questionable and optional addition.
2.4. Benefit: It would allow end users to manually control whether they are confident to run the "fast" count query, without affecting the runtime of thetoIterator()
function. -
The auto-detection/guessing would investigate the given Query and appropriately set the variables mentioned above.
3.1. The auto-detection would initially not attempt to auto-detect whether a presence of aToMany
join is compatible with the "fast" queries. This is to limit the scope of this PR. It obviously could be attempted in future PRs. As such, as soon as there is aToMany
join detected, the "fast" queries would not be used (which is equivalent to the current (default) behaviour of the Paginator).
Q1. Are you personally ok with the proposition above?
the fast counting query is compatible with toMany joins when using CountWalker::HINT_DISTINCT as it will count distinct values
Q2. Could you please confirm that you meant: is compatible with toMany joins when using CountWalker::HINT_DISTINCT, AND when the query does NOT select any of the toMany entities/fields
? Because when the query selects any of the toMany fields, then it will make the CountWalker (i.e. the fast query) not work correctly, regardless of whether CountWalker::HINT_DISTINCT is added or not.
when no fields from the grouping condition is dependant on the ToMany join (i.e. is part of the joined entity or of other entities joined again from that one)
Thanks for mentioning the case of other entities joined again from that one
. This is quite important and I did not initially think of it.
[the points about when the walker-based fetching queries and tree-walker fetching queries are supported]
That is good to know, however note that I do not intend to touch the region of the code that makes the decision whether the walker-based or tree-walker fetching query should be used.
[The "fast" result query] is also the only result fetching mode compatible with composite primary keys.
Fair point, however this again is not something that I'm going to change or touch in my PR. Note that if a query uses composite primary keys AND produces duplicate rows (i.e. is not compatible with the "fast" result query), then the Paginator will not able to be used at all, and it's no different to the current situation in the Paginator.
Regarding output walker vs tree walker, the output walker supports more cases (...) but is generally less performant than then tree-walker query when both are supported.
This again is good to know, however I don't intend to auto-detect/guess in this PR which of the two walkers should be used by the result-fetching query.
Could you reply to Q1 and Q2, please, and share any other thought you have about what I said above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is good to know, however note that I do not intend to touch the region of the code that makes the decision whether the walker-based or tree-walker fetching query should be used.
your code provides guessing for use*OutputWalker
variables, which is precisely about making that decision. So I don't understand what you intend to do if you don't intend to touch that decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, are you referring to the following diff in the current PR?
private function useOutputWalker(Query $query): bool
{
if ($this->useOutputWalkers === null) {
return (bool) $query->getHint(Query::HINT_CUSTOM_OUTPUT_WALKER) === false;
}
return $this->useOutputWalkers;
}
->
private function useOutputWalker(Query $query, bool $forCountQuery = false): bool
{
if ($this->useOutputWalkers !== null) {
return $this->useOutputWalkers;
}
// When a custom output walker already present, then do not use the Paginator's.
if ($query->getHint(Query::HINT_CUSTOM_OUTPUT_WALKER) !== false) {
return false;
}
// When not joining onto *ToMany relations, then do not use the more complex CountOutputWalker.
return ! (
$forCountQuery
// (rest of the diff not relevant)
);
}
Do we both agree that this code diff does not result in any change to whether the walker-based or the tree-walker result-fetching query should be used?
Or do you refer to the proposition of splitting useOutputWalker
into two variables? In that case I can assure you that I do not intend to modify the current logic for the $useResultQueryOutputWalker
variable. It could be done in future PRs, most likely by different contributors than me.
Note that I started this PR with only the intention of adding code that "if there are any dql-joins present, then use the fast version of the queries". I believe this alone would immediately address 80% of the use cases in the wild. Anything more than this, I consider a nicety. However, I do agree that going the extra mile and checking for *ToMany dql-joins should be easy enough, so as an exception from my original plan, I'm willing to attempt to make that change in the current PR. Anything other than that is a "less than 5% of use cases" matter in my mind, and although I do appreciate us discussing it, I do not intend to address it in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when $fetchJoinCollection
is true
(the default), forbidding to use the fast result fetching query, useOutputWalker
is what decides between fetching results with an output walker or a tree walker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when
$fetchJoinCollection
is true (the default), forbidding to use the fast result fetching query,
I agree.
when
$fetchJoinCollection
is true (the default) (...), useOutputWalker is what decides between fetching results with an output walker or a tree walker.
I agree.
useOutputWalker is what decides between fetching results with an output walker or a tree walker.
Do we both agree that my code diff does not change the behaviour of useOutputWalker
?
when
$fetchJoinCollection
is true (the default)
Are you trying to say that because the new Paginator::newWithAutoDetection()
function in this PR is now going to decide whether the (former) $fetchJoinCollection
is true or false, it means that I should pay more attention to when the function decides that $fetchJoinCollection
should be "set" to true, because this implies that useOutputWalker
is going to be run? To this I would say that I partially see the concern here, however I don't view Paginator::newWithAutoDetection()
function as deciding whether the $fetchJoinCollection
should be set to true, but rather whether the $fetchJoinCollection
should "revert to the current behaviour of the Paginator" (i.e. setting the variable to true). It's not ideal the the default behaviour of Paginator::newWithAutoDetection()
could lead to problems in some cases, but likewise the current default behaviour of the Paginator class would lead to the same problems. And for this reason, I do not see why I should go even farther with this PR by making sure that the details about the output/tree walkers in the result-fetching query are taken into the consideration. Please correct me if I'm wrong.
If you're not referring to the consequences of the Paginator::newWithAutoDetection()
function, then I truly struggle to see what we are discussing here, and I would appreciate some more help in understanding your standpoint. Why exactly do you believe I should concern myself with the the output/tree walkers in the result-fetching query in this PR?
|
||
// When not joining onto *ToMany relations, then do not use the more complex CountOutputWalker. | ||
return ! ( | ||
$forCountQuery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have a separate detection for count query and result query, I would rather extract the check for $forCountQuery
in a separate query (as maybe we will also have smarter detection for the case of the result query in the future)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You said:
extract the check for $forCountQuery in a separate query
Did you mean to say "extract (...) in a separate function"? In other words, are you proposing splitting the useOutputWalker()
into useResultOutputWalker()
and useCountOutputWalker()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, I meant a separate condition
(no need for a separate function which would force to duplicate the previous logic). We could even factor out the auto-detection check:
This would look like that:
// ...
if (!$this->queryStyleAutoDetectionEnabled) {
return true; // Old simple guessing
}
if ($forCountQuery) {
return /* condition for the support of CountWalker */;
}
return /* condition for the support of LimitSubqueryWalker */;
public static function newWithAutoDetection(QueryBuilder $queryBuilder): self | ||
{ | ||
/** @var array<string, Join[]> $joinsPerRootAlias */ | ||
$joinsPerRootAlias = $queryBuilder->getDQLPart('join'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be possible to also support Query
by analyzing the DQL AST instead of the QueryBuilder ? this would make this method more useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I would prefer to support
Query
in addition toQueryBuilder
, not instead of. This is because constructing an AST from a Query's DQL is not instantaneous, while checking for presence of DQL parts in the QueryBuilder is. - My main use case for the Paginator (and I would assume that of a lot of other real-world usages too), is to implement CRUD listings that support filtering/conditions. The conditions are only added when selected by the user in the website gui. Since the conditions are added in a "conditional manner", the DQL query that retrieves the records for the CRUD listing is often built incrementally via a QueryBuilder and a bunch of if-statements (in one form or another -- I actually use a Symfony Bundle for the "bunch of if-statement" part). My point being: supporting the instances of Query is not my personal main priority. I could look into doing it, if I have the time after getting the "initial approval" for this PR (and I still have the unit tests to write).
- Supporting
Query|QueryBuilder
is definitely an option.
Are you ok with this new function's supporting only instances of QueryBuilder if I decide that I ran out of time for this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@d-ph Supporting QueryBuilder
in addition to Query
would then work the same than what is done in the constructor: calling getQuery
on it and then dealing with the Query.
I'm not asking for the public API to drop support for QueryBuilder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting queries is important to support cases using Query::setFetchMode
to trigger eager loading of some toOne
relations (which would load them in batch in a single query instead of doing N lazy-loading queries).
In my experience, this can result in better performance in some cases compared to fetch-joining the relation in the paginated query (assuming the paginated query does not already join it for filtering purposes of course):
- if the toOne relation is nullable and is often
null
- if you have multiple
toOne
relations pointing to the same entity (the batch loading is per target entity, not per relation, so they would be grouped) - if the target entity has a non-negligeable probably to refer to an instance that is already initialized in the identity map (if the current page only contains values using
null
or an initialized entity, the eager loading won't need to perform a second query at all).
The QueryBuilder cannot configuring the fetch mode of relations. So even when using the QueryBuilder (which I often do), I often have cases where I pass a Query
to the Paginator instead of passing the QueryBuilder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting
QueryBuilder
in addition toQuery
would then work the same than what is done in the constructor: callinggetQuery
on it and then dealing with the Query.
What about the comment I raised in point 1. above about the cost of constructing a Query's AST?
My intention was to at some point add the support for Query, but in a form where there would be separate routines for inspecting the provided instance of Query or QueryBuilder, so that the AST construction is avoided when a QueryBuilder is provided. Something along the following lines:
$sqlJoinsPresent = false;
if ($queryOrQueryBuilder instance of Query) {
$ast = $queryOrQueryBuilder->getAST();
// inspect the AST
$sqlJoinsPresent = true;
} else {
// do not construct any ASTs
// use QueryBuilder::getPart(partName) to inspect the query
$sqlJoinsPresent = true;
}
Are you saying that you'd like to disregard the fact that constructing an AST is not a cheap operation (and which depends on the complexity of the query)?
The QueryBuilder cannot configure the fetch mode of relations.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could accept a fast path introspecting the QueryBuilder instead of parsing the Query as an optimization if it has indeed an impact. However, it would have to support the same capabilities. See my other comment related to the detection of joining collections rather than any joins. Maybe the QueryBuilder introspection would not be viable for that.
However, my personal opinion would be that we should not merge this without support for Query as it would prevent downstream projects from migrating to it (for instance, the Pagerfanta DoctrineOrmAdapter uses this Paginator and currently supports Query|QueryBuilder
).
Note that this is only my own opinion and the ORM maintainers might take a different decision (I'm maintainer on DoctrineBundle but not on the ORM itself, even though I regularly help with reviews here as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stof when there are multiple to-one associations and the eager loading fetch mode is used, does this cause multiple result rows for a single entity and this is something where we need to be careful with regard to the goal of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to-one associations don't create multiple rows in the SQL results. Only toMany
relations do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, even when there are multiple of them 🤦🏻♂️
@@ -38,6 +40,16 @@ class Paginator implements Countable, IteratorAggregate | |||
private readonly Query $query; | |||
private bool|null $useOutputWalkers = null; | |||
private int|null $count = null; | |||
/** | |||
* @var bool The auto-detection of queries style was added a lot later to this class, and this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @var bool The auto-detection of queries style was added a lot later to this class, and this | |
* The auto-detection of queries style was added a lot later to this class, and this |
For a property phpdoc, the description is just the phpdoc description, not something attached to the tag (and then, we don't need @var
at all as there is a native type).
/** @var array<string, Join[]> $joinsPerRootAlias */ | ||
$joinsPerRootAlias = $queryBuilder->getDQLPart('join'); | ||
|
||
// For now, only check whether there are any sql joins present in the query. Note, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to implement the proper detection of joining collections.
Taking the code of my work project, almost all my paginated queries involve some toOne joins but the majority of them don't join a collection (making them compatible with the fast queries, which would then mean that switching to auto-detection would be a performance regression right now).
|
||
// For now, only check whether there are any sql joins present in the query. Note, | ||
// however, that it is doable to detect a presence of only *ToOne joins via the access to | ||
// joined entity classes' metadata (see: QueryBuilder::getEntityManager()->getClassMetadata(className)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performing this analysis on a QueryBuilder will be complex, as it has to account for joins done on previous joins and not on the root entity, replicating work already done by the query parser as part of the AST.
$countQuery->setHint(CountWalker::HINT_DISTINCT, true); | ||
$hintDistinctDefaultTrue = true; | ||
|
||
// When not joining onto *ToMany relations, then use a simpler COUNT query in the CountWalker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there is a case where joined collections that are not selected will not impact the count or the iteration: when using SELECT DISTINCT
in the query (see SelectClause::$isDistinct
)
public static function newWithAutoDetection(QueryBuilder $queryBuilder): self | ||
{ | ||
/** @var array<string, Join[]> $joinsPerRootAlias */ | ||
$joinsPerRootAlias = $queryBuilder->getDQLPart('join'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could accept a fast path introspecting the QueryBuilder instead of parsing the Query as an optimization if it has indeed an impact. However, it would have to support the same capabilities. See my other comment related to the detection of joining collections rather than any joins. Maybe the QueryBuilder introspection would not be viable for that.
However, my personal opinion would be that we should not merge this without support for Query as it would prevent downstream projects from migrating to it (for instance, the Pagerfanta DoctrineOrmAdapter uses this Paginator and currently supports Query|QueryBuilder
).
Note that this is only my own opinion and the ORM maintainers might take a different decision (I'm maintainer on DoctrineBundle but not on the ORM itself, even though I regularly help with reviews here as well)
Hi, While I wait for some final answers from stof to my (regrettably) wall of text, may I know your initial thoughts on my proposition to rename some instance variables in the Paginator class, please? I explained it in more details in the discussion above, but the bottom line is:
Would you say that you would initially allow for the changes described above to be made to the Paginator by me? Thanks. |
|
May I ask for an initial review of the code in this PR before I move forward with it, please?
This PR implements the scope of work discussed in #8278 (comment). It only implements the "no joins used" case. It doesn't implement the "only ToOne joins used" case. I plan do implement that other case in a future PR, after this one gets reviewed and hopefully merged.
A few comments:
$fetchJoinCollection = null
). This means that there could be a long period of time before the "auto-detection of better defaults" gets enabled by default (if ever).Thank you.