Remove dependency on shell scripts for EXPLAIN output filtering
pg_hint_plan has long depended on a set of non-portable shell scripts
to filter unstable values, like costs or widths, out of the plan
output.  This had the disadvantage of being usable only on Linux,
while depending on \o and temporary output files.

This is replaced in this commit by a solution closer to PostgreSQL
upstream, where we use a PL/pgSQL function to process the EXPLAIN
queries whose output needs to be stabilized.  The style used in this
commit could arguably be improved further in the future, but the changes
done here make the diffs more palatable than anything I have considered,
with all the plans generated remaining the same.

Some queries that included quotes in ut-R required a couple more quotes
to work in the filtering function.  Some extra CONTEXT messages coming
from the filtering function are generated, as well as some extra LOG
messages for cases related to unused indexes, but let's live with that for
now.

Author: Yogesh Sharma, Michael Paquier
Backpatch-through: 17

Per pull request #198 and issue #181.
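
The extra quoting mentioned in the commit message follows from passing each EXPLAIN query to explain_filter() as a single-quoted SQL string literal: every single quote inside the query has to be doubled. The sketch below only illustrates that rule. The helper name `wrap_for_explain_filter` and the column `val` are hypothetical and not part of the commit, and it assumes standard-conforming strings (no backslash escapes).

```python
def wrap_for_explain_filter(explain_query: str) -> str:
    """Build a SELECT explain_filter('...') statement for an EXPLAIN query.

    Hypothetical helper for illustration; the commit edits the test
    files by hand rather than generating them.
    """
    # Inside a single-quoted SQL string literal, a literal single quote
    # is written as two single quotes.
    escaped = explain_query.replace("'", "''")
    return f"SELECT explain_filter('{escaped}');"


# A query that itself contains a quoted literal needs doubled quotes.
print(wrap_for_explain_filter("EXPLAIN SELECT * FROM t1 WHERE val = 'a';"))
```

Queries without quotes, such as the Rows() tests in this diff, pass through unchanged apart from the wrapping call.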
michaelpq committed Aug 20, 2024
1 parent 7757374 commit 63cf84e
Showing 17 changed files with 1,301 additions and 1,299 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -64,7 +64,7 @@ STARBALLS = $(STARBALL17)
TARSOURCES = Makefile *.c *.h COPYRIGHT* \
pg_hint_plan--*.sql \
pg_hint_plan.control \
-docs/* expected/*.out sql/*.sql sql/maskout*.sh \
+docs/* expected/*.out sql/*.sql \
data/data.csv SPECS/*.spec

rpms: rpm17
29 changes: 29 additions & 0 deletions expected/init.out
@@ -229,4 +229,33 @@ SELECT * FROM settings;
enable_tidscan | on | Query Tuning / Planner Method Configuration
(51 rows)

-- EXPLAIN filtering
--
-- A lot of tests rely on EXPLAIN being executed with costs enabled
-- to check the validity of the plans generated with hints.
--
-- This function takes in input a query, executes it and applies some
-- filtering to ensure a stable output. See the tests calling this
-- function to see how it can be used.
--
-- If required, this can be extended with new operation modes.
CREATE OR REPLACE FUNCTION explain_filter(text) RETURNS SETOF text
LANGUAGE plpgsql AS
$$
DECLARE
ln text;
BEGIN
FOR ln IN EXECUTE $1
LOOP
-- Replace cost values with some 'xxx'
ln := regexp_replace(ln, 'cost=10{7}[.0-9]+ ', 'cost={inf}..{inf} ');
ln := regexp_replace(ln, 'cost=[.0-9]+ ', 'cost=xxx..xxx ');
-- Replace width with some 'xxx'
ln := regexp_replace(ln, 'width=[0-9]+([^0-9])', 'width=xxx\1');
-- Filter foreign files
ln := regexp_replace(ln, '^( +Foreign File: ).*$', '\1 (snip..)');
return next ln;
END LOOP;
END;
$$;
ANALYZE;
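
To see what the four substitutions in explain_filter() do to a single plan line, here is a rough Python approximation. It is a sketch, not the code the tests run: it assumes Python's `re` module treats these particular patterns the same way PostgreSQL's regexp_replace() does, it uses `count=1` to mirror regexp_replace() replacing only the first match when no 'g' flag is given, and the function name `filter_explain_line` and the sample plan lines are made up for illustration.

```python
import re


def filter_explain_line(line: str) -> str:
    """Approximate the per-line filtering done by explain_filter()."""
    # Costs starting with 1 followed by seven zeros are assumed here to be
    # the planner's disable-cost penalty; they are masked as {inf}.
    line = re.sub(r'cost=10{7}[.0-9]+ ', 'cost={inf}..{inf} ', line, count=1)
    # Any remaining cost range is masked as xxx..xxx.
    line = re.sub(r'cost=[.0-9]+ ', 'cost=xxx..xxx ', line, count=1)
    # Mask the width, keeping the character that follows the number.
    line = re.sub(r'width=[0-9]+([^0-9])', r'width=xxx\1', line, count=1)
    # Hide machine-specific paths of foreign files.
    line = re.sub(r'^( +Foreign File: ).*$', r'\1 (snip..)', line, count=1)
    return line


print(filter_explain_line(
    "Merge Join  (cost=0.56..510.34 rows=1000 width=16)"))
print(filter_explain_line(
    "Seq Scan on t1  (cost=10000000000.00..10000000163.00 rows=10000 width=8)"))
```

The order of the first two substitutions matters: the {inf} replacement runs first, and its result no longer matches the generic cost pattern, so it is not overwritten with xxx..xxx. Row estimates are deliberately left untouched, since the Rows() tests check them.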
135 changes: 73 additions & 62 deletions expected/pg_hint_plan.out
@@ -8898,182 +8898,192 @@ Rows()
(1 row)

-- value types
\o results/pg_hint_plan.tmpout
SELECT explain_filter('
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
');
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=1000 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 #99) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 #99)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=99 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 +99) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 +99)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=1099 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 -99) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 -99)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=901 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 *99) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 *99)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=99000 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 *0.01) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 *0.01)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=10 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 #aa) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id); -- ERROR
');
INFO: pg_hint_plan: hint syntax error at or near "aa"
DETAIL: Rows hint requires valid number as rows estimation.
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
LOG: pg_hint_plan:
used hint:
not used hint:
duplication hint:
error hint:
Rows(t1 t2 #aa)

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=1000 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 /99) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id); -- ERROR
');
INFO: pg_hint_plan: hint syntax error at or near "/99"
DETAIL: Unrecognized rows value type notation.
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
LOG: pg_hint_plan:
used hint:
not used hint:
duplication hint:
error hint:
Rows(t1 t2 /99)

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=1000 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

-- round up to 1
\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 -99999) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id);
');
WARNING: Force estimate to be at least one row, to avoid possible divide-by-zero when interpolating costs : Rows(t1 t2 -99999)
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 -99999)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=1 width=xxx)
Merge Cond: (t1.id = t2.id)
-> Index Scan using t1_pkey on t1 (cost=xxx..xxx rows=10000 width=xxx)
-> Index Scan using t2_pkey on t2 (cost=xxx..xxx rows=1000 width=xxx)
(4 rows)

-- complex join tree
\o results/pg_hint_plan.tmpout
SELECT explain_filter('
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id) JOIN t3 ON (t3.id = t2.id);
\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
');
explain_filter
----------------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=10 width=xxx)
Merge Cond: (t1.id = t3.id)
-> Merge Join (cost=xxx..xxx rows=1000 width=xxx)
@@ -9083,21 +9093,22 @@ EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id) JOIN t3 ON (t3.id = t2.id);
-> Sort (cost=xxx..xxx rows=100 width=xxx)
Sort Key: t3.id
-> Seq Scan on t3 (cost=xxx..xxx rows=100 width=xxx)
(9 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t2 #22) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id) JOIN t3 ON (t3.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t2 #22)
not used hint:
duplication hint:
error hint:

\o
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=1 width=xxx)
Merge Cond: (t1.id = t3.id)
-> Merge Join (cost=xxx..xxx rows=22 width=xxx)
@@ -9107,22 +9118,22 @@ error hint:
-> Sort (cost=xxx..xxx rows=100 width=xxx)
Sort Key: t3.id
-> Seq Scan on t3 (cost=xxx..xxx rows=100 width=xxx)
(9 rows)

\o results/pg_hint_plan.tmpout
SELECT explain_filter('
/*+ Rows(t1 t3 *10) */
EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.id = t2.id) JOIN t3 ON (t3.id = t2.id);
');
LOG: pg_hint_plan:
used hint:
Rows(t1 t3 *10)
not used hint:
duplication hint:
error hint:

\o
set max_parallel_workers_per_gather to DEFAULT;
\! sql/maskout.sh results/pg_hint_plan.tmpout
QUERY PLAN
----------------
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
explain_filter
----------------------------------------------------------------------------------
Merge Join (cost=xxx..xxx rows=100 width=xxx)
Merge Cond: (t1.id = t3.id)
-> Merge Join (cost=xxx..xxx rows=1000 width=xxx)
@@ -9132,8 +9143,8 @@ set max_parallel_workers_per_gather to DEFAULT;
-> Sort (cost=xxx..xxx rows=100 width=xxx)
Sort Key: t3.id
-> Seq Scan on t3 (cost=xxx..xxx rows=100 width=xxx)
(9 rows)

\! rm results/pg_hint_plan.tmpout
-- Query with join RTE and outer-join relids
/*+Leading(ft_1 ft_2 t1)*/
SELECT relname, seq_scan > 0 AS seq_scan, idx_scan > 0 AS idx_scan