Fix issue 195 "leading hint + join methods hint cannot totally force the join order" #207

HennyNile · 2024-11-12T11:14:14Z

Hi! I am excited to share that I have solved #195. Before writing the code, I discussed the approach with others in #195. As a reminder, I will briefly describe the problem, its cause, the solution, and the limitations of the solution.

Problem Description

The problem is that PostgreSQL with pg_hint_plan generates an execution plan inconsistent with the input hints. In particular, for the following query

/*+
Leading((rt (it ((n (chn (mc (mi (t (ci an)))))) cn))))
HashJoin(ci an)
NestLoop(ci an t)
NestLoop(ci an t mi)
NestLoop(ci an t mi mc)
NestLoop(ci an t mi mc chn)
NestLoop(ci an t mi mc chn n)
NestLoop(ci an t mi mc chn n cn)
NestLoop(ci an t mi mc chn n cn it)
NestLoop(ci an t mi mc chn n cn it rt)
*/
EXPLAIN (FORMAT TEXT)
SELECT MIN(n.name) AS voicing_actress,
       MIN(t.title) AS voiced_movie
FROM aka_name AS an,
     char_name AS chn,
     cast_info AS ci,
     company_name AS cn,
     info_type AS it,
     movie_companies AS mc,
     movie_info AS mi,
     name AS n,
     role_type AS rt,
     title AS t
WHERE ci.note IN ('(voice)',
                  '(voice: Japanese version)',
                  '(voice) (uncredited)',
                  '(voice: English version)')
  AND cn.country_code ='[us]'
  AND it.info = 'release dates'
  AND mc.note IS NOT NULL
  AND (mc.note LIKE '%(USA)%'
       OR mc.note LIKE '%(worldwide)%')
  AND mi.info IS NOT NULL
  AND (mi.info LIKE 'Japan:%200%'
       OR mi.info LIKE 'USA:%200%')
  AND n.gender ='f'
  AND n.name LIKE '%Ang%'
  AND rt.role ='actress'
  AND t.production_year BETWEEN 2005 AND 2009
  AND t.id = mi.movie_id
  AND t.id = mc.movie_id
  AND t.id = ci.movie_id
  AND mc.movie_id = ci.movie_id
  AND mc.movie_id = mi.movie_id
  AND mi.movie_id = ci.movie_id
  AND cn.id = mc.company_id
  AND it.id = mi.info_type_id
  AND n.id = ci.person_id
  AND rt.id = ci.role_id
  AND n.id = an.person_id
  AND ci.person_id = an.person_id
  AND chn.id = ci.person_role_id;

The generated execution plan (reproduced on my server) is

 Aggregate  (cost=27678801700.75..27678801700.76 rows=1 width=64)
   ->  Nested Loop  (cost=20000031697.50..27678801700.75 rows=1 width=32)
         Join Filter: (rt.id = ci.role_id)
         ->  Seq Scan on role_type rt  (cost=0.00..18.88 rows=4 width=4)
               Filter: ((role)::text = 'actress'::text)
         ->  Materialize  (cost=20000031697.50..27678801681.81 rows=1 width=36)
               ->  Nested Loop  (cost=20000031697.50..27678801681.81 rows=1 width=36)
                     Join Filter: (it.id = mi.info_type_id)
                     ->  Seq Scan on info_type it  (cost=0.00..2.41 rows=1 width=4)
                           Filter: ((info)::text = 'release dates'::text)
                     ->  Nested Loop  (cost=20000031697.50..27678801678.66 rows=59 width=40)
                           ->  Nested Loop  (cost=20000031697.08..27678801482.37 rows=163 width=44)
                                 Join Filter: (n.id = ci.person_id)
                                 ->  Seq Scan on name n  (cost=0.00..118171.96 rows=9516 width=19)
                                       Filter: ((name ~~ '%Ang%'::text) AND ((gender)::text = 'f'::text))
                                 ->  Materialize  (cost=20000031697.08..27663195186.54 rows=71311 width=37)
                                       ->  Nested Loop  (cost=20000031697.08..27663194271.99 rows=71311 width=37)
                                             ->  Nested Loop  (cost=10000031696.65..17663060878.15 rows=145979 width=41)
                                                   Join Filter: (t.id = mc.movie_id)
                                                   ->  Seq Scan on movie_companies mc  (cost=0.00..57960.93 rows=307745 width=8)
                                                         Filter: ((note IS NOT NULL) AND ((note ~~ '%(USA)%'::text) OR (note ~~ '%(worldwide)%'::text)))
                                                   ->  Materialize  (cost=10000031696.65..15817340640.79 rows=242159 width=49)
                                                         ->  Nested Loop  (cost=10000031696.65..15817337065.00 rows=242159 width=49)
                                                               Join Filter: (t.id = mi.movie_id)
                                                               ->  Seq Scan on movie_info mi  (cost=0.00..382516.23 rows=543793 width=8)
                                                                     Filter: ((info ~~ 'Japan:%200%'::text) OR (info ~~ 'USA:%200%'::text))
                                                               ->  Materialize  (cost=10000031696.65..10003749915.92 rows=449341 width=41)
                                                                     ->  Nested Loop  (cost=10000031696.65..10003743719.21 rows=449341 width=41)
                                                                           ->  Hash Join  (cost=31696.22..852385.40 rows=2010555 width=20)
                                                                                 Hash Cond: (ci.person_id = an.person_id)
                                                                                 ->  Seq Scan on cast_info ci  (cost=0.00..796439.28 rows=828870 width=16)
                                                                                        Filter: (note = ANY ('{(voice),"(voice: Japanese version)","(voice) (uncredited)","(voice: English version)"}'::text[]))
                                                                                 ->  Hash  (cost=20429.43..20429.43 rows=901343 width=4)
                                                                                       ->  Seq Scan on aka_name an  (cost=0.00..20429.43 rows=901343 width=4)
                                                                           ->  Index Scan using title_pkey on title t  (cost=0.43..1.44 rows=1 width=21)
                                                                                 Index Cond: (id = ci.movie_id)
                                                                                 Filter: ((production_year >= 2005) AND (production_year <= 2009))
                                             ->  Index Only Scan using char_name_pkey on char_name chn  (cost=0.43..0.91 rows=1 width=4)
                                                   Index Cond: (id = ci.person_role_id)
                           ->  Index Scan using company_name_pkey on company_name cn  (cost=0.42..1.20 rows=1 width=4)
                                 Index Cond: (id = mc.company_id)
                                 Filter: ((country_code)::text = '[us]'::text)

The actual join order is inconsistent with the hinted join order: neither table t nor table chn follows the Leading hint.

Actual join order:
(rt (it ((n ((mc (mi ((ci an) t))) chn)) cn)))

Hinted join order:
(rt (it ((n (chn (mc (mi (t (ci an)))))) cn)))

Reason and Solution

There are two reasons for this issue.

1. PostgreSQL does not include disable_cost for disabled operator.

PostgreSQL with pg_hint_plan supports disabling certain operators (e.g., hash join, seq scan) by setting pg parameters like “set enable_hashjoin = false”. This setting causes PostgreSQL to add a high disable_cost (e.g., 1e10) to the estimated cost of the hash join operator, effectively preventing the planner from selecting hash joins due to the inflated cost. Additionally, pg_hint_plan supports enforcing specific join orders. To do this, pg_hint_plan disables all join algorithms when it encounters inconsistent join orders, by adding the disable_cost to each join operator. As a result, only the assigned join order will be selected. This is the mechanism behind pg_hint_plan.
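As a rough illustration of this mechanism, here is a toy model in Python. It is not PostgreSQL or pg_hint_plan code: the function names, path labels, and base costs are all made up for the example. Only the value 1e10 comes from this description.

```python
DISABLE_COST = 1.0e10  # value of disable_cost quoted in this PR


def penalized_cost(base_cost, enabled):
    """Toy version of 'add disable_cost when the operator is disabled'."""
    return base_cost if enabled else base_cost + DISABLE_COST


def cheapest(paths):
    """Return the label with the lowest penalized cost.

    `paths` maps a label to a (base_cost, enabled) pair.
    """
    return min(paths, key=lambda name: penalized_cost(*paths[name]))


# Hypothetical candidates for one join; the hint allows only the nested loop.
candidates = {
    "hash_join": (1_000.0, False),
    "merge_join": (2_500.0, False),
    "nested_loop": (90_000.0, True),
}
print(cheapest(candidates))
```

The nested loop wins even though its base cost is the highest, because every other candidate carries the penalty. Note that the penalty only biases the comparison; nothing is forbidden outright, which is why the two problems described below can occur.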

In the given example, the hint specifies a join order (rt (it ((n (chn (mc (mi (t (ci an)))))) cn))), but the generated join order is (rt (it ((n ((mc (mi ((ci an) t))) chn)) cn))). Here, PostgreSQL generates sub-join order ((ci an) t) instead of the assigned sub-join order (t (ci an)), and ((mc (mi ((ci an) t))) chn) instead of (chn (mc (mi ((ci an) t)))). This discrepancy arises because PostgreSQL estimates operator costs in two phases. In the first phase, it filters out paths that are obviously suboptimal based on estimated costs. However, it does not include disable_cost for disabled operators in this phase, only doing so in the second phase. While (t (ci an)) would use a regular nested loop join, ((ci an) t) uses an index-based nested loop join with an index scan on t, which is significantly faster. Consequently, (t (ci an)) is filtered out after the first phase of cost estimation. The same reasoning applies to (chn (mc (mi ((ci an) t)))).

To solve this problem, we can simply include disable_cost (set to 1e10) in the first phase of cost estimation for disabled operators.
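The effect of this change can be sketched with a toy filter. This is not PostgreSQL's code: the function name, the two early cost figures, and the "keep only the cheapest" rule are simplifying assumptions chosen to show the idea.

```python
DISABLE_COST = 1.0e10  # value of disable_cost quoted in this PR


def survivor(paths, penalize_in_first_phase):
    """Return the label of the path that survives a toy first-phase filter.

    `paths` is a list of (label, early_cost, disabled) tuples. The toy rule
    keeps only the path with the lowest early estimate; everything else is
    dropped before it would reach full (second-phase) costing.
    """
    best_label, best_cost = None, float("inf")
    for label, early_cost, disabled in paths:
        cost = early_cost
        if disabled and penalize_in_first_phase:
            cost += DISABLE_COST
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label


# Hypothetical early estimates: the join order that contradicts the hint is
# far cheaper because it can use an index scan on the inner side.
paths = [
    ("disabled: index nested loop", 4.0e6, True),
    ("hinted: plain nested loop", 6.0e8, False),
]
print(survivor(paths, penalize_in_first_phase=False))
print(survivor(paths, penalize_in_first_phase=True))
```

Without the penalty in the first phase, the hinted path is discarded before the second phase could ever penalize its competitor. With the penalty applied early, the hinted path survives.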

2. disable_cost (set to 1e10) defined in PG is not large enough.

After applying the modifications introduced in section 1, we generated the following plan:

Aggregate  (cost=40566110422.19..40566110422.20 rows=1 width=64)
   ->  Nested Loop  (cost=20000031697.94..40566110422.19 rows=1 width=32)
         Join Filter: (ci.role_id = rt.id)
         ->  Seq Scan on role_type rt  (cost=0.00..1.15 rows=1 width=4)
               Filter: ((role)::text = 'actress'::text)
         ->  Nested Loop  (cost=20000031697.94..40566110421.02 rows=1 width=36)
               Join Filter: (it.id = mi.info_type_id)
               ->  Seq Scan on info_type it  (cost=0.00..2.41 rows=1 width=4)
                     Filter: ((info)::text = 'release dates'::text)
               ->  Nested Loop  (cost=20000031697.94..40566110418.60 rows=1 width=40)
                     ->  Nested Loop  (cost=20000031697.52..40566110416.19 rows=2 width=44)
                           Join Filter: (ci.person_id = n.id)
                           ->  Seq Scan on name n  (cost=0.00..118169.85 rows=97 width=19)
                                 Filter: ((name ~~ '%Ang%'::text) AND ((gender)::text = 'f'::text))
                           ->  Materialize  (cost=20000031697.52..40565817300.39 rows=79401 width=37)
                                 ->  Nested Loop  (cost=20000031697.52..40565816282.38 rows=79401 width=37)
                                       Join Filter: (chn.id = ci.person_role_id)
                                       ->  Seq Scan on char_name chn  (cost=0.00..67851.60 rows=3140360 width=4)
                                       ->  Materialize  (cost=20000031697.52..28190423736.30 rows=165649 width=41)
                                             ->  Nested Loop  (cost=20000031697.52..28190421452.06 rows=165649 width=41)
                                                   Join Filter: (mc.movie_id = t.id)
                                                   ->  Index Scan using company_id_movie_companies on movie_companies mc  (cost=0.43..751429.90 rows=293417 width=8)
                                                         Filter: ((note IS NOT NULL) AND ((note ~~ '%(USA)%'::text) OR (note ~~ '%(worldwide)%'::text)))
                                                   ->  Materialize  (cost=20000031697.09..26202512612.58 rows=273432 width=49)
                                                         ->  Nested Loop  (cost=20000031697.09..26202508574.42 rows=273432 width=49)
                                                               Join Filter: (mi.movie_id = t.id)
                                                               ->  Index Scan using info_type_id_movie_info on movie_info mi  (cost=0.43..7147042.20 rows=541004 width=8)
                                                                     Filter: ((info IS NOT NULL) AND ((info ~~ 'Japan:%200%'::text) OR (info ~~ 'USA:%200%'::text)))
                                                               ->  Materialize  (cost=20000031696.66..20003581593.93 rows=481066 width=41)
                                                                     ->  Nested Loop  (cost=20000031696.66..20003574959.60 rows=481066 width=41)
                                                                           ->  Hash Join  (cost=31696.22..1052256.92 rows=2143262 width=20)
                                                                                 Hash Cond: (ci.person_id = an.person_id)
                                                                                 ->  Seq Scan on cast_info ci  (cost=0.00..796914.48 rows=888851 width=16)
                                                                                       Filter: (note = ANY ('{(voice),"(voice: Japanese version)","(voice) (uncredited)","(voice: English version)"}'::text[]))
                                                                                 ->  Hash  (cost=20429.43..20429.43 rows=901343 width=4)
                                                                                       ->  Seq Scan on aka_name an  (cost=0.00..20429.43 rows=901343 width=4)
                                                                           ->  Memoize  (cost=0.44..1.44 rows=1 width=21)
                                                                                 Cache Key: ci.movie_id
                                                                                 Cache Mode: logical
                                                                                 ->  Index Scan using title_pkey on title t  (cost=0.43..1.43 rows=1 width=21)
                                                                                       Index Cond: (id = ci.movie_id)
                                                                                       Filter: ((production_year >= 2005) AND (production_year <= 2009))
                     ->  Index Scan using company_name_pkey on company_name cn  (cost=0.42..1.21 rows=1 width=4)
                           Index Cond: (id = mc.company_id)
                           Filter: ((country_code)::text = '[us]'::text)

The actual join order is still inconsistent with the hinted join order: table chn now follows the Leading hint, but table t still does not.

Actual join order:
(rt (it ((n (chn (mc (mi ((ci an) t))))) cn)))

Hinted join order:
(rt (it ((n (chn (mc (mi (t (ci an)))))) cn)))

Examining the desired subplan NestLoop(t HashJoin(ci an)), I found that its cost estimate is 25367763419.74, which is larger than disable_cost (set to 1e10). As a rough approximation, the cost of NestLoop(t HashJoin(ci an)) is rows(t) * cost(HashJoin(ci an)) = 567391 * 777302.92 ≈ 4.4e11; this overshoots the planner's estimate, but both figures exceed 1e10. In other words, a sufficiently bad hinted plan may have a larger cost estimate than disable_cost, in which case a disabled join order may have a lower cost estimate and be selected as the final plan.

To solve this problem, I use pg_hint_disable_cost (set to 1e20) in place of PostgreSQL's disable_cost.
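The comparison can be replayed with the totals from the plans in this description. This is a sketch only: the variable names are mine, and the assumption that the non-hinted subplan carries exactly two disable_cost penalties is inferred from its startup cost of about 2e10.

```python
OLD_DISABLE_COST = 1.0e10  # PostgreSQL's disable_cost, as quoted above
NEW_DISABLE_COST = 1.0e20  # pg_hint_disable_cost proposed in this PR

# Total cost of the hinted subplan NestLoop(t HashJoin(ci an)), taken from
# the final plan in this description. It carries no penalty.
hinted_total = 25367763419.74

# Total cost of the non-hinted NestLoop(HashJoin(ci an) t), taken from the
# plan produced after modification 1.
# ASSUMPTION: two disable_cost penalties are already included in it.
disabled_total_old = 20003574959.60
penalties = 2
disabled_base = disabled_total_old - penalties * OLD_DISABLE_COST


def disabled_total(disable_cost):
    """Cost of the non-hinted subplan under a given penalty value."""
    return disabled_base + penalties * disable_cost


# With 1e10 the penalized path is still cheaper than the hinted one.
print(disabled_total(OLD_DISABLE_COST) < hinted_total)
# With 1e20 the comparison flips in favor of the hinted path.
print(disabled_total(NEW_DISABLE_COST) < hinted_total)
```

Under these assumptions the first comparison is true and the second is false, which matches the behavior described above: the penalty has to exceed the cost of the worst plan a hint can request.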

With these two modifications, we get the desired plan:

Aggregate  (cost=45930298882.34..45930298882.35 rows=1 width=64)
   ->  Nested Loop  (cost=20867.81..45930298882.33 rows=1 width=32)
         Join Filter: (ci.role_id = rt.id)
         ->  Seq Scan on role_type rt  (cost=0.00..1.15 rows=1 width=4)
               Filter: ((role)::text = 'actress'::text)
         ->  Nested Loop  (cost=20867.81..45930298881.17 rows=1 width=36)
               Join Filter: (it.id = mi.info_type_id)
               ->  Seq Scan on info_type it  (cost=0.00..2.41 rows=1 width=4)
                     Filter: ((info)::text = 'release dates'::text)
               ->  Nested Loop  (cost=20867.81..45930298878.75 rows=1 width=40)
                     ->  Nested Loop  (cost=20867.39..45930298876.33 rows=2 width=44)
                           Join Filter: (ci.person_id = n.id)
                           ->  Seq Scan on name n  (cost=0.00..118169.85 rows=97 width=19)
                                 Filter: ((name ~~ '%Ang%'::text) AND ((gender)::text = 'f'::text))
                           ->  Materialize  (cost=20867.39..45930005760.53 rows=79401 width=37)
                                 ->  Nested Loop  (cost=20867.39..45930004742.53 rows=79401 width=37)
                                       Join Filter: (chn.id = ci.person_role_id)
                                       ->  Seq Scan on char_name chn  (cost=0.00..67851.60 rows=3140360 width=4)
                                       ->  Materialize  (cost=20867.39..33554612196.45 rows=165649 width=41)
                                             ->  Nested Loop  (cost=20867.39..33554609912.20 rows=165649 width=41)
                                                   Join Filter: (mc.movie_id = t.id)
                                                   ->  Index Scan using company_id_movie_companies on movie_companies mc  (cost=0.43..751429.90 rows=293417 width=8)
                                                         Filter: ((note IS NOT NULL) AND ((note ~~ '%(USA)%'::text) OR (note ~~ '%(worldwide)%'::text)))
                                                   ->  Materialize  (cost=20866.96..31566701072.72 rows=273432 width=49)
                                                         ->  Nested Loop  (cost=20866.96..31566697034.56 rows=273432 width=49)
                                                               Join Filter: (mi.movie_id = t.id)
                                                               ->  Index Scan using info_type_id_movie_info on movie_info mi  (cost=0.43..7147042.20 rows=541004 width=8)
                                                                     Filter: ((info IS NOT NULL) AND ((info ~~ 'Japan:%200%'::text) OR (info ~~ 'USA:%200%'::text)))
                                                               ->  Materialize  (cost=20866.53..25367770054.07 rows=481066 width=41)
                                                                     ->  Nested Loop  (cost=20866.53..25367763419.74 rows=481066 width=41)
                                                                           Join Filter: (ci.movie_id = t.id)
                                                                           ->  Index Scan using title_pkey on title t  (cost=0.43..126923.34 rows=567391 width=21)
                                                                                 Filter: ((production_year >= 2005) AND (production_year <= 2009))
                                                                           ->  Materialize  (cost=20866.10..777302.92 rows=2143262 width=20)
                                                                                 ->  Gather  (cost=20866.10..754027.61 rows=2143262 width=20)
                                                                                       Workers Planned: 2
                                                                                       ->  Parallel Hash Join  (cost=19866.10..538701.41 rows=893026 width=20)
                                                                                             Hash Cond: (ci.person_id = an.person_id)
                                                                                             ->  Parallel Seq Scan on cast_info ci  (cost=0.00..479467.70 rows=370355 width=16)
                                                                                                   Filter: (note = ANY ('{(voice),"(voice: Japanese version)","(voice) (uncredited)","(voice: English version)"}'::text[]))
                                                                                             ->  Parallel Hash  (cost=15171.60..15171.60 rows=375560 width=4)
                                                                                                   ->  Parallel Seq Scan on aka_name an  (cost=0.00..15171.60 rows=375560 width=4)
                     ->  Index Scan using company_name_pkey on company_name cn  (cost=0.42..1.21 rows=1 width=4)
                           Index Cond: (id = mc.company_id)
                           Filter: ((country_code)::text = '[us]'::text)

Limitations of this Solution

There are two main limitations of this solution:
(1) PostgreSQL provides no hook for the cost estimation functions, so I had to copy many routine functions in order to modify the cost estimation.
(2) Some plans may have an estimated cost larger than 1e20 (I think this is not possible for a centralized database); in that case, the hinted plan will not be selected.

I implemented this solution for PG16; extending it to other PG versions should not be hard.

Looking forward to your reply!

michaelpq · 2024-11-12T23:53:34Z

To prove your point, I would suggest adding some tests. We don't accept any new code without them. Be aware of code coverage as well. This is not a simple patch you are suggesting.

HennyNile · 2024-11-13T04:35:03Z

I see. I will write tests for this patch following existing tests as soon as possible.

HennyNile · 2024-11-13T16:07:58Z

@michaelpq Hi, there is a problem with choosing an appropriate value for pg_hint_disable_cost, which I mentioned before.

A double-precision float has approximately 15-17 significant decimal digits of precision. If we set pg_hint_disable_cost to 1e20, then for two disabled paths whose costs are both less than 1000 (most scan operators have a cost less than 1000), comparing them will report the same cost even though their costs differ.

This problem makes the pg_hint_plan test fail, and the largest usable value for pg_hint_disable_cost is 1e11. Since the original disable_cost of 1e10 is smaller than the costs of some practical plans, I think 1e11 is also not large enough for pg_hint_plan.
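The precision loss is easy to reproduce outside PostgreSQL. The snippet below is a standalone sketch using Python floats, which are IEEE 754 doubles; the helper name and the sample costs 500 and 900 are illustrative, and math.ulp requires Python 3.9 or later.

```python
import math


def distinguishable(disable_cost, cost_a, cost_b):
    """True if two different base costs still differ after the penalty."""
    return (disable_cost + cost_a) != (disable_cost + cost_b)


for penalty in (1.0e10, 1.0e11, 1.0e20):
    # math.ulp(x) is the gap between x and the next larger double.
    print(penalty, math.ulp(penalty), distinguishable(penalty, 500.0, 900.0))
```

At 1e20 the gap between adjacent doubles is 16384, so any base cost below about 8192 is rounded away entirely and the two paths compare as equal. At 1e10 and 1e11 the gap is far below 1, so the base costs remain distinguishable.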

What do you think about this problem? I think adjusting the value of disable_cost alone is not enough.

Fix issue 195

91c28aa

HennyNile mentioned this pull request Nov 12, 2024

leading hint + join methods hint cannot totally force the join order. #195

Open

HennyNile mentioned this pull request Nov 14, 2024

Enable all join operators for all joins for n rels if there is no join hint for n rels #208

Open
