Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

796 lines
16 KiB
Plaintext
Raw Permalink Normal View History

--
-- SELECT statements
--
CREATE EXTENSION pg_stat_statements;
SET pg_stat_statements.track_utility = FALSE;
SET pg_stat_statements.track_planning = TRUE;
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
--
-- simple and compound statements
--
SELECT 1 AS "int";
int
-----
1
(1 row)
-- LIMIT and OFFSET patterns
-- Try some query permutations which once produced identical query IDs
SELECT 1 AS "int" LIMIT 1;
int
-----
1
(1 row)
SELECT 1 AS "int" LIMIT 2;
int
-----
1
(1 row)
SELECT 1 AS "int" OFFSET 1;
int
-----
(0 rows)
SELECT 1 AS "int" OFFSET 2;
int
-----
(0 rows)
SELECT 1 AS "int" OFFSET 1 LIMIT 1;
int
-----
(0 rows)
SELECT 1 AS "int" OFFSET 2 LIMIT 2;
int
-----
(0 rows)
SELECT 1 AS "int" LIMIT 1 OFFSET 1;
int
-----
(0 rows)
SELECT 1 AS "int" LIMIT 3 OFFSET 3;
int
-----
(0 rows)
SELECT 1 AS "int" OFFSET 1 FETCH FIRST 2 ROW ONLY;
int
-----
(0 rows)
SELECT 1 AS "int" OFFSET 2 FETCH FIRST 3 ROW ONLY;
int
-----
(0 rows)
-- DISTINCT and ORDER BY patterns
-- Try some query permutations which once produced identical query IDs
SELECT DISTINCT 1 AS "int";
int
-----
1
(1 row)
SELECT DISTINCT 2 AS "int";
int
-----
2
(1 row)
SELECT 1 AS "int" ORDER BY 1;
int
-----
1
(1 row)
SELECT 2 AS "int" ORDER BY 1;
int
-----
2
(1 row)
Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:26:05 -04:00
/* this comment should not appear in the output */
SELECT 'hello'
Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:26:05 -04:00
-- but this one will appear
AS "text";
text
-------
hello
(1 row)
SELECT 'world' AS "text";
text
-------
world
(1 row)
-- transaction
BEGIN;
SELECT 1 AS "int";
int
-----
1
(1 row)
SELECT 'hello' AS "text";
text
-------
hello
(1 row)
COMMIT;
-- compound transaction
BEGIN \;
SELECT 2.0 AS "float" \;
SELECT 'world' AS "text" \;
COMMIT;
float
-------
2.0
(1 row)
text
-------
world
(1 row)
-- compound with empty statements and spurious leading spacing
\;\; SELECT 3 + 3 \;\;\; SELECT ' ' || ' !' \;\; SELECT 1 + 4 \;;
?column?
----------
6
(1 row)
?column?
----------
!
(1 row)
?column?
----------
5
(1 row)
-- non ;-terminated statements
SELECT 1 + 1 + 1 AS "add" \gset
SELECT :add + 1 + 1 AS "add" \;
SELECT :add + 1 + 1 AS "add" \gset
add
-----
5
(1 row)
-- set operator
SELECT 1 AS i UNION SELECT 2 ORDER BY i;
i
---
1
2
(2 rows)
-- ? operator
select '{"a":1, "b":2}'::jsonb ? 'b';
?column?
----------
t
(1 row)
-- cte
WITH t(f) AS (
VALUES (1.0), (2.0)
)
SELECT f FROM t ORDER BY f;
f
-----
1.0
2.0
(2 rows)
-- prepared statement with parameter
PREPARE pgss_test (int) AS SELECT $1, 'test' LIMIT 1;
EXECUTE pgss_test(1);
?column? | ?column?
----------+----------
1 | test
(1 row)
DEALLOCATE pgss_test;
SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | rows | query
-------+------+------------------------------------------------------------------------------
Revert support for improved tracking of nested queries This commit reverts the two following commits: - 499edb09741b, track more precisely query locations for nested statements. - 06450c7b8c70, a follow-up fix of 499edb09741b with query locations. The test introduced in this commit is not reverted. This is proving useful to track a problem that only pgaudit was able to detect. These prove to have issues with the tracking of SELECT statements, when these use multiple parenthesis which is something supported by the grammar. Incorrect location and lengths are causing pg_stat_statements to become confused, failing its job in query normalization with potential out-of-bound writes because the location and the length may not match with what can be handled. A lot of the query patterns discussed when this issue was reported have no test coverage in the main regression test suite, or the recovery test 027_stream_regress.pl would have caught the problems as pg_stat_statements is loaded by the node running the regression tests. A first step would be to improve the test coverage to stress more the query normalization logic. A different portion of this work was done in 45e0ba30fc40, with the addition of tests for nested queries. These can be left in the tree. They are useful to track the way inner queries are currently tracked by PGSS with non-top-level entries, and will be useful when reconsidering in the future the work reverted here. Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru> Discussion: https://postgr.es/m/18947-cdd2668beffe02bf@postgresql.org
2025-06-12 10:08:55 +09:00
1 | 1 | PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3
4 | 4 | SELECT $1 +
Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:26:05 -04:00
| | -- but this one will appear +
| | AS "text"
2 | 2 | SELECT $1 + $2
3 | 3 | SELECT $1 + $2 + $3 AS "add"
1 | 1 | SELECT $1 AS "float"
2 | 2 | SELECT $1 AS "int"
2 | 2 | SELECT $1 AS "int" LIMIT $2
2 | 0 | SELECT $1 AS "int" OFFSET $2
6 | 0 | SELECT $1 AS "int" OFFSET $2 LIMIT $3
2 | 2 | SELECT $1 AS "int" ORDER BY 1
1 | 2 | SELECT $1 AS i UNION SELECT $2 ORDER BY i
1 | 1 | SELECT $1 || $2
2 | 2 | SELECT DISTINCT $1 AS "int"
0 | 0 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
1 | 2 | WITH t(f) AS ( +
| | VALUES ($1), ($2) +
| | ) +
| | SELECT f FROM t ORDER BY f
1 | 1 | select $1::jsonb ? $2
(17 rows)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
pg_stat_statements: Fix parameter number gaps in normalized queries pg_stat_statements anticipates that certain constant locations may be recorded multiple times and attempts to avoid calculating a length for these locations in fill_in_constant_lengths(). However, during generate_normalized_query() where normalized query strings are generated, these locations are not excluded from consideration. This could increment the parameter number counter for every recorded occurrence at such a location, leading to an incorrect normalization in certain cases with gaps in the numbers reported. For example, take this query: SELECT WHERE '1' IN ('2'::int, '3'::int::text) Before this commit, it would be normalized like that, with gaps in the parameter numbers: SELECT WHERE $1 IN ($3::int, $4::int::text) However the correct, less confusing one should be like that: SELECT WHERE $1 IN ($2::int, $3::int::text) This commit fixes the computation of the parameter numbers to track the number of constants replaced with an $n by a separate counter instead of the iterator used to loop through the list of locations. The underlying query IDs are not changed, neither are the normalized strings for existing PGSS hash entries. New entries with fresh normalized queries would automatically get reshaped based on the new parameter numbering. Issue discovered while discussing a separate problem for HEAD, but this affects all the stable branches. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com Backpatch-through: 13
2025-05-29 11:26:03 +09:00
-- normalization of constants and parameters, with constant locations
-- recorded one or more times.
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
SELECT WHERE '1' IN ('1'::int, '3'::int::text);
--
(1 row)
SELECT WHERE (1, 2) IN ((1, 2), (2, 3));
--
(1 row)
SELECT WHERE (3, 4) IN ((5, 6), (8, 7));
--
(0 rows)
SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
query | calls
------------------------------------------------------------------------+-------
SELECT WHERE $1 IN ($2::int, $3::int::text) | 1
SELECT WHERE ($1, $2) IN (($3, $4), ($5, $6)) | 2
SELECT pg_stat_statements_reset() IS NOT NULL AS t | 1
SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C" | 0
(4 rows)
-- with the last element being an explicit function call with an argument, ensure
-- the normalization of the squashing interval is correct.
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
SELECT WHERE 1 IN (1, int4(1), int4(2));
--
(1 row)
SELECT WHERE 1 = ANY (ARRAY[1, int4(1), int4(2)]);
--
(1 row)
SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
query | calls
------------------------------------------------------------------------+-------
SELECT WHERE $1 IN ($2 /*, ... */) | 2
SELECT pg_stat_statements_reset() IS NOT NULL AS t | 1
SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C" | 0
(3 rows)
--
-- queries with locking clauses
--
CREATE TABLE pgss_a (id integer PRIMARY KEY);
CREATE TABLE pgss_b (id integer PRIMARY KEY, a_id integer REFERENCES pgss_a);
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
-- control query
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id;
id | id | a_id
----+----+------
(0 rows)
-- test range tables
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE;
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_a;
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b;
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_a, pgss_b; -- matches plain "FOR UPDATE"
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b, pgss_a;
id | id | a_id
----+----+------
(0 rows)
-- test strengths
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR NO KEY UPDATE;
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR SHARE;
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR KEY SHARE;
id | id | a_id
----+----+------
(0 rows)
-- test wait policies
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE NOWAIT;
id | id | a_id
----+----+------
(0 rows)
SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE SKIP LOCKED;
id | id | a_id
----+----+------
(0 rows)
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------------------------
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR KEY SHARE
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR NO KEY UPDATE
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR SHARE
2 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE NOWAIT
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_a
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE OF pgss_b, pgss_a
1 | SELECT * FROM pgss_a JOIN pgss_b ON pgss_b.a_id = pgss_a.id FOR UPDATE SKIP LOCKED
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
(12 rows)
DROP TABLE pgss_a, pgss_b CASCADE;
--
-- access to pg_stat_statements_info view
--
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
SELECT dealloc FROM pg_stat_statements_info;
dealloc
---------
0
(1 row)
-- FROM [ONLY]
CREATE TABLE tbl_inh(id integer);
CREATE TABLE tbl_inh_1() INHERITS (tbl_inh);
INSERT INTO tbl_inh_1 SELECT 1;
SELECT * FROM tbl_inh;
id
----
1
(1 row)
SELECT * FROM ONLY tbl_inh;
id
----
(0 rows)
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FROM%tbl_inh%';
count
-------
2
(1 row)
-- WITH TIES
CREATE TABLE limitoption AS SELECT 0 AS val FROM generate_series(1, 10);
SELECT *
FROM limitoption
WHERE val < 2
ORDER BY val
FETCH FIRST 2 ROWS WITH TIES;
val
-----
0
0
0
0
0
0
0
0
0
0
(10 rows)
SELECT *
FROM limitoption
WHERE val < 2
ORDER BY val
FETCH FIRST 2 ROW ONLY;
val
-----
0
0
(2 rows)
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%FETCH FIRST%';
count
-------
2
(1 row)
-- GROUP BY [DISTINCT]
SELECT a, b, c
FROM (VALUES (1, 2, 3), (4, NULL, 6), (7, 8, 9)) AS t (a, b, c)
GROUP BY ROLLUP(a, b), rollup(a, c)
ORDER BY a, b, c;
a | b | c
---+---+---
1 | 2 | 3
1 | 2 |
1 | 2 |
1 | | 3
1 | | 3
1 | |
1 | |
1 | |
4 | | 6
4 | | 6
4 | | 6
4 | |
4 | |
4 | |
4 | |
4 | |
7 | 8 | 9
7 | 8 |
7 | 8 |
7 | | 9
7 | | 9
7 | |
7 | |
7 | |
| |
(25 rows)
SELECT a, b, c
FROM (VALUES (1, 2, 3), (4, NULL, 6), (7, 8, 9)) AS t (a, b, c)
GROUP BY DISTINCT ROLLUP(a, b), rollup(a, c)
ORDER BY a, b, c;
a | b | c
---+---+---
1 | 2 | 3
1 | 2 |
1 | | 3
1 | |
4 | | 6
4 | | 6
4 | |
4 | |
7 | 8 | 9
7 | 8 |
7 | | 9
7 | |
| |
(13 rows)
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%GROUP BY%ROLLUP%';
count
-------
2
(1 row)
-- GROUPING SET agglevelsup
SELECT (
SELECT (
SELECT GROUPING(a,b) FROM (VALUES (1)) v2(c)
) FROM (VALUES (1,2)) v1(a,b) GROUP BY (a,b)
) FROM (VALUES(6,7)) v3(e,f) GROUP BY ROLLUP(e,f);
grouping
----------
0
0
0
(3 rows)
SELECT (
SELECT (
SELECT GROUPING(e,f) FROM (VALUES (1)) v2(c)
) FROM (VALUES (1,2)) v1(a,b) GROUP BY (a,b)
) FROM (VALUES(6,7)) v3(e,f) GROUP BY ROLLUP(e,f);
grouping
----------
3
0
1
(3 rows)
SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
count
-------
2
(1 row)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
-- Temporary table with same name, re-created.
BEGIN;
CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
SELECT * FROM temp_t;
id
----
(0 rows)
COMMIT;
BEGIN;
CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
SELECT * FROM temp_t;
id
----
(0 rows)
COMMIT;
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------
Use relation name instead of OID in query jumbling for RangeTblEntry custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
2 | SELECT * FROM temp_t
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
Use relation name instead of OID in query jumbling for RangeTblEntry custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
(3 rows)
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)
-- search_path with various schemas and temporary tables
CREATE SCHEMA pgss_schema_1;
CREATE SCHEMA pgss_schema_2;
-- Same attributes.
CREATE TABLE pgss_schema_1.tab_search_same (a int, b int);
CREATE TABLE pgss_schema_2.tab_search_same (a int, b int);
CREATE TEMP TABLE tab_search_same (a int, b int);
-- Different number of attributes, mapping types
CREATE TABLE pgss_schema_1.tab_search_diff_1 (a int);
CREATE TABLE pgss_schema_2.tab_search_diff_1 (a int, b int);
CREATE TEMP TABLE tab_search_diff_1 (a int, b int, c int);
-- Same number of attributes, different types
CREATE TABLE pgss_schema_1.tab_search_diff_2 (a int);
CREATE TABLE pgss_schema_2.tab_search_diff_2 (a text);
CREATE TEMP TABLE tab_search_diff_2 (a bigint);
-- First permanent schema
SET search_path = 'pgss_schema_1';
SELECT count(*) FROM tab_search_same;
count
-------
0
(1 row)
SELECT a, b FROM tab_search_same;
a | b
---+---
(0 rows)
SELECT count(*) FROM tab_search_diff_1;
count
-------
0
(1 row)
SELECT count(*) FROM tab_search_diff_2;
count
-------
0
(1 row)
SELECT a FROM tab_search_diff_2 AS t1;
a
---
(0 rows)
SELECT a FROM tab_search_diff_2;
a
---
(0 rows)
SELECT a AS a1 FROM tab_search_diff_2;
a1
----
(0 rows)
-- Second permanent schema
SET search_path = 'pgss_schema_2';
SELECT count(*) FROM tab_search_same;
count
-------
0
(1 row)
SELECT a, b FROM tab_search_same;
a | b
---+---
(0 rows)
SELECT count(*) FROM tab_search_diff_1;
count
-------
0
(1 row)
SELECT count(*) FROM tab_search_diff_2;
count
-------
0
(1 row)
SELECT a FROM tab_search_diff_2 AS t1;
a
---
(0 rows)
SELECT a FROM tab_search_diff_2;
a
---
(0 rows)
SELECT a AS a1 FROM tab_search_diff_2;
a1
----
(0 rows)
-- Temporary schema
SET search_path = 'pg_temp';
SELECT count(*) FROM tab_search_same;
count
-------
0
(1 row)
SELECT a, b FROM tab_search_same;
a | b
---+---
(0 rows)
SELECT count(*) FROM tab_search_diff_1;
count
-------
0
(1 row)
SELECT count(*) FROM tab_search_diff_2;
count
-------
0
(1 row)
SELECT a FROM tab_search_diff_2 AS t1;
a
---
(0 rows)
SELECT a FROM tab_search_diff_2;
a
---
(0 rows)
SELECT a AS a1 FROM tab_search_diff_2;
a1
----
(0 rows)
RESET search_path;
-- Schema qualifications
SELECT count(*) FROM pgss_schema_1.tab_search_same;
count
-------
0
(1 row)
SELECT a, b FROM pgss_schema_1.tab_search_same;
a | b
---+---
(0 rows)
SELECT count(*) FROM pgss_schema_2.tab_search_diff_1;
count
-------
0
(1 row)
SELECT count(*) FROM pg_temp.tab_search_diff_2;
count
-------
0
(1 row)
SELECT a FROM pgss_schema_2.tab_search_diff_2 AS t1;
a
---
(0 rows)
SELECT a FROM pgss_schema_2.tab_search_diff_2;
a
---
(0 rows)
SELECT a AS a1 FROM pgss_schema_2.tab_search_diff_2;
a1
----
(0 rows)
SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
calls | query
-------+------------------------------------------------------------------------
Use relation name instead of OID in query jumbling for RangeTblEntry custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
8 | SELECT a FROM tab_search_diff_2
4 | SELECT a FROM tab_search_diff_2 AS t1
4 | SELECT a, b FROM tab_search_same
0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
Use relation name instead of OID in query jumbling for RangeTblEntry custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
4 | SELECT count(*) FROM tab_search_diff_1
4 | SELECT count(*) FROM tab_search_diff_2
Use relation name instead of OID in query jumbling for RangeTblEntry custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
4 | SELECT count(*) FROM tab_search_same
1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
Use relation name instead of OID in query jumbling for RangeTblEntry custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
2025-03-26 15:21:05 +09:00
(8 rows)
DROP SCHEMA pgss_schema_1 CASCADE;
NOTICE: drop cascades to 3 other objects
DETAIL: drop cascades to table pgss_schema_1.tab_search_same
drop cascades to table pgss_schema_1.tab_search_diff_1
drop cascades to table pgss_schema_1.tab_search_diff_2
DROP SCHEMA pgss_schema_2 CASCADE;
NOTICE: drop cascades to 3 other objects
DETAIL: drop cascades to table pgss_schema_2.tab_search_same
drop cascades to table pgss_schema_2.tab_search_diff_1
drop cascades to table pgss_schema_2.tab_search_diff_2
DROP TABLE tab_search_same, tab_search_diff_1, tab_search_diff_2;
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
t
---
t
(1 row)