Teach nbtree's _bt_killitems to leave the so->currPos page that it sets
LP_DEAD items on in whatever state it was in when _bt_killitems was
called. In particular, make sure that so->dropPin scans don't acquire a
pin whose reference is saved in so->currPos.buf.
Allowing _bt_killitems to change so->currPos.buf like this is wrong.
The immediate consequence of allowing it is that code in _bt_steppage
(that copies so->currPos into so->markPos) will behave as if the scan is
a !so->dropPin scan. so->markPos will therefore retain the buffer pin
indefinitely, even though _bt_killitems only needs to acquire a pin
(along with a lock) for long enough to mark known-dead items LP_DEAD.
This issue came to light following a report of a failure of an assertion
from recent commit e6eed40e. The test case in question involves the use
of mark and restore. An initial call to _bt_killitems takes place that
leaves so->currPos.buf in a state that is inconsistent with the scan
being so->dropPin. A subsequent call to _bt_killitems for the same
position (following so->currPos being saved in so->markPos, and then
restored as so->currPos) resulted in the failure of an assertion that
tests that so->currPos.buf is InvalidBuffer when the scan is so->dropPin
(non-assert builds got a "resource was not closed" WARNING instead).
The same problem exists on earlier releases, though the issue is far
more subtle there. Recent commit e6eed40e introduced the so->dropPin
field as a partial replacement for testing so->currPos.buf directly.
Earlier releases won't get an assertion failure (or buffer pin leak),
but they will allow the second _bt_killitems call from the test case to
behave as if a buffer pin was consistently held since the original call
to _bt_readpage. This is wrong; there will have been an initial window
during which no pin was held on the so->currPos page, and yet the second
_bt_killitems call will neglect to check if so->currPos.lsn continues to
match the page's now-current LSN.
As a result of all this, it's just about possible that _bt_killitems
will set the wrong items LP_DEAD (on release branches). This could only
happen with merge joins (the sole user of nbtree mark/restore support),
when a concurrently inserted index tuple used a recently-recycled TID
(and only when the new tuple was inserted onto the same page as a
distinct concurrently-removed tuple with the same TID). This is exactly
the scenario that _bt_killitems' check of the page's now-current LSN
against the LSN stashed in currPos was supposed to prevent.
A follow-up commit will make nbtree completely stop conditioning whether
or not a position's pin needs to be dropped on whether the 'buf' field
is set. All call sites that might need to drop a still-held pin will be
taught to rely on the scan-level so->dropPin field recently introduced
by commit e6eed40e. That will make bugs of the same general nature as
this one impossible (or make them much easier to detect, at least).
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/545be1e5-3786-439a-9257-a90d30f8b849@gmail.com
Backpatch-through: 13
Traditionally, libpq's pqPutMsgEnd has rounded down the amount-to-send
to be a multiple of 8K when it is eagerly writing some data. This
still seems like a good idea when sending through a Unix socket, as
pipes typically have a buffer size of 8K or some fraction/multiple of
that. But there's not much argument for it on a TCP connection, since
(a) standard MTU values are not commensurate with that, and (b) the
kernel typically applies its own packet splitting/merging logic.
Worse, our SSL and GSSAPI code paths both have API stipulations that
if they fail to send all the data that was offered in the previous
write attempt, we mustn't offer less data in the next attempt; else
we may get "SSL error: bad length" or "GSSAPI caller failed to
retransmit all data needing to be retried". The previous write
attempt might've been pqFlush attempting to send everything in the
buffer, so pqPutMsgEnd can't safely write less than the full buffer
contents. (Well, we could add some more state to track exactly how
much the previous write attempt was, but there's little value evident
in such extra complication.) Hence, apply the round-down only on
AF_UNIX sockets, where we never use SSL or GSSAPI.
Interestingly, we had a very closely related bug report before,
which I attempted to fix in commit d053a879b. But the test case
we had then seemingly didn't trigger this pqFlush-then-pqPutMsgEnd
scenario, or at least we failed to recognize this variant of the bug.
Bug: #18907
Reported-by: Dorjpalam Batbaatar <htgn.dbat.95@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18907-d41b9bcf6f29edda@postgresql.org
Backpatch-through: 13
The documentation for the pg_authid system catalog and the
pg_shadow system view indicates that passwords might be stored in
cleartext, but that hasn't been possible for some time.
Oversight in commit eb61136dc7.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aD2yKkZro4nbl5ol%40nathan
Backpatch-through: 13
We store values for these options as array elements with the syntax
"name=value", hence a name containing "=" confuses matters when
it's time to read the array back in. Since validation of the
options is often done (long) after this conversion to array format,
that leads to confusing and off-point error messages. We can
improve matters by rejecting names containing "=" up-front.
(Probably a better design would have involved pairs of array
elements, but it's too late now --- and anyway, there's no
evident use-case for option names like this. We already
reject such names in some other contexts such as GUCs.)
Reported-by: Chapman Flack <jcflack@acm.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chapman Flack <jcflack@acm.org>
Discussion: https://postgr.es/m/6830EB30.8090904@acm.org
Backpatch-through: 13
PLy_elog_impl and its subroutine PLy_traceback intended to avoid
leaking any PyObject reference counts, but their coverage of the
matter was sadly incomplete. In particular, out-of-memory errors
in most of the string-construction subroutines could lead to
reference count leaks, because those calls were outside the
PG_TRY blocks responsible for dropping reference counts.
Fix by (a) adjusting the scopes of the PG_TRY blocks, and
(b) moving the responsibility for releasing the reference counts
of the traceback-stack objects to PLy_elog_impl. This requires
some additional "volatile" markers, but not too many.
In passing, fix an ancient thinko: use of the "e_module_o" PyObject
was guarded by "if (e_type_s)", where surely "if (e_module_o)"
was meant. This would only have visible consequences if the
"__name__" attribute were present but the "__module__" attribute
wasn't, which apparently never happens; but someday it might.
Rearranging the PG_TRY blocks requires indenting a fair amount
of code one more tab stop, which I'll do separately for clarity.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2954090.1748723636@sss.pgh.pa.us
Backpatch-through: 13
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables. To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards. While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.
Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog. On the
back-branches, those bugs are left in place. We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected. Given
the low severity of the issue, fixing older versions doesn't seem
worth the trouble of significantly modifying the patch.
Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not back-patched. While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
postgres_fdw tries to use PG_TRY blocks to ensure that it will
eventually free the PGresult created by the remote modify command.
However, it's fundamentally impossible for this scheme to work
reliably when there's RETURNING data, because the query could fail
in between invocations of postgres_fdw's DirectModify methods.
There is at least one instance of exactly this situation in the
regression tests, and the ensuing session-lifespan leak is visible
under Valgrind.
We can improve matters by using a memory context reset callback
attached to the ExecutorState context. That ensures that the
PGresult will be freed when the ExecutorState context is torn
down, even if control never reaches postgresEndDirectModify.
I have little faith that there aren't other potential PGresult
leakages in the backend modules that use libpq. So I think it'd
be a good idea to apply this concept universally by creating
infrastructure that attaches a reset callback to every PGresult
generated in the backend. However, that seems too invasive for
v18 at this point, let alone the back branches. So for the
moment, apply this narrow fix that just makes DirectModify safe.
I have a patch in the queue for the more general idea, but it
will have to wait for v19.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us
Backpatch-through: 13
Our GSSAPI code only allows packet sizes up to 16kB. However it
emerges that during authentication, larger packets might be needed;
various authorities suggest 48kB or 64kB as the maximum packet size.
This limitation caused login failure for AD users who belong to many
AD groups. To add insult to injury, we gave an unintelligible error
message, typically "GSSAPI context establishment error: The routine
must be called again to complete its function: Unknown error".
As noted in code comments, the 16kB packet limit is effectively a
protocol constant once we are doing normal data transmission: the
GSSAPI code splits the data stream at those points, and if we change
the limit then we will have cross-version compatibility problems
due to the receiver's buffer being too small in some combinations.
However, during the authentication exchange the packet sizes are
not determined by us, but by the underlying GSSAPI library. So we
might as well just try to send what the library tells us to.
An unpatched recipient will fail on a packet larger than 16kB,
but that's not worse than the sender failing without even trying.
So this doesn't introduce any meaningful compatibility problem.
We still need a buffer size limit, but we can easily make it be
64kB rather than 16kB until transport negotiation is complete.
(Larger values were discussed, but don't seem likely to add
anything.)
Reported-by: Chris Gooch <cgooch@bamfunds.com>
Fix-suggested-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/DS0PR22MB5971A9C8A3F44BCC6293C4DABE99A@DS0PR22MB5971.namprd22.prod.outlook.com
Backpatch-through: 13
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.
Although this scenario is generally unlikely since XactLockTableWait() and
ConditionalXactLockTableWait() can basically acquire a transaction lock
only when the transaction is not running, it can occur in a hot standby.
In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication
slot on a standby.
The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.
Back-patch to all supported branches.
Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
Commits 53af9491a043 and 2d5fe514052a fixed a number of problems with
foreign keys that reference partitioned tables, and a query to detect
already broken FKs was supplied with the release notes for 17.1, 16.5,
15.9, 14.14, 13.17. However, that query has a bug that causes it to
wrongly report self-referential foreign keys even when they are correct,
so if the user was to drop and rebuild the FKs as indicated, the query
would continue to report them as needing to be repaired. Here we fix
the query to not have that problem.
Reported-by: Paul Foerster <paul.foerster@gmail.com>
Discussion: https://postgr.es/m/5456A1D0-CD47-4315-9C65-71B27E7A2906@gmail.com
Backpatch-through: 13-17
If we hit out-of-memory between creating the PGconn and inserting
it into dblink's hashtable, we'd lose track of the PGconn, which
is quite bad since it represents a live connection to a remote DB.
Fix by rearranging things so that we create the hashtable entry
first.
Also reduce the number of states we have to deal with by getting rid
of the separately-allocated remoteConn object, instead allocating it
in-line in the hashtable entries. (That incidentally removes a
session-lifespan memory leak observed in the regression tests.)
There is an apparently-irreducible remaining OOM hazard, which
is that if the connection fails at the libpq level (ie it's
CONNECTION_BAD) then we have to pstrdup the PGconn's error message
before we can release it, and theoretically that could fail. However,
in such cases we're only leaking memory not a live remote connection,
so I'm not convinced that it's worth sweating over.
This is a pretty low-probability failure mode of course, but losing
a live connection seems bad enough to justify back-patching.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/1346940.1748381911@sss.pgh.pa.us
Backpatch-through: 13
pg_stat_statements anticipates that certain constant locations may be
recorded multiple times and attempts to avoid calculating a length for
these locations in fill_in_constant_lengths().
However, during generate_normalized_query() where normalized query
strings are generated, these locations are not excluded from
consideration. This could increment the parameter number counter for
every recorded occurrence at such a location, leading to an incorrect
normalization in certain cases with gaps in the numbers reported.
For example, take this query:
SELECT WHERE '1' IN ('2'::int, '3'::int::text)
Before this commit, it would be normalized like that, with gaps in the
parameter numbers:
SELECT WHERE $1 IN ($3::int, $4::int::text)
However the correct, less confusing one should be like that:
SELECT WHERE $1 IN ($2::int, $3::int::text)
This commit fixes the computation of the parameter numbers to track the
number of constants replaced with an $n by a separate counter instead of
the iterator used to loop through the list of locations.
The underlying query IDs are not changed, neither are the normalized
strings for existing PGSS hash entries. New entries with fresh
normalized queries would automatically get reshaped based on the new
parameter numbering.
Issue discovered while discussing a separate problem for HEAD, but this
affects all the stable branches.
Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com
Backpatch-through: 13
As written, the test was throwing an error because of an unbalanced
parenthesis. The regex used in the test is adjusted to not fail and to
test the case of an opening parenthesis in a character class after some
nested square brackets.
Oversight in d46911e584d4.
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested. For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%' but it should not.
This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character. Multiple tests are added to show how the conversions
should or should not apply applied while in a character class area, with
specific cases added for all the characters converted outside character
classes like an opening parenthesis '(', dollar sign '$', etc.
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
Backpatch-through: 13
The documentation for exported snapshots in logical decoding previously
stated that snapshot creation may fail on a hot standby. This is no longer
accurate, as snapshot exporting on standbys has been supported since
PostgreSQL 10. This commit removes the outdated description.
Additionally, the docs referred to the NOEXPORT_SNAPSHOT option to
suppress snapshot exporting in CREATE_REPLICATION_SLOT. However,
since PostgreSQL 15, NOEXPORT_SNAPSHOT is considered legacy syntax
and retained only for backward compatibility. This commit updates
the documentation for v15 and later to use the modern equivalent:
SNAPSHOT 'nothing'. The older syntax is preserved in documentation for
v14 and earlier.
Back-patched to all supported branches.
Reported-by: Kevin K Biju <kevinkbiju@gmail.com>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Kevin K Biju <kevinkbiju@gmail.com>
Discussion: https://postgr.es/m/174791480466.798.17122832105389395178@wrigleys.postgresql.org
Backpatch-through: 13
Invalid indexes are suffixed with "_ccnew" or "_ccold". The
documentation missed to mention the initial underscore.
ChooseRelationName() may also append an extra number if indexes with a
similar name already exist; let's add a note about that too.
Author: Alec Cozens <acozens@pixelpower.com>
Discussion: https://postgr.es/m/174733277404.1455388.11471370288789479593@wrigleys.postgresql.org
Backpatch-through: 13
In the grammar, <expr> is a c_expr, which accepts only a limited set
of integer literals and simple expressions without parens. The
deparsing logic didn't quite match the grammar rule, and failed to use
parens e.g. for "5::bigint".
To fix, always surround the expression with parens. Would be nice to
omit the parens in simple cases, but unfortunately it's non-trivial to
detect such simple cases. Even if the expression is a simple literal
123 in the original query, after parse analysis it becomes a FuncExpr
with COERCE_IMPLICIT_CAST rather than a simple Const.
Reported-by: yonghao lee
Backpatch-through: 13
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
Prevent moving the confirmed_flush backwards, as this could lead to data
duplication issues caused by replicating already replicated changes.
This can happen when a client acknowledges an LSN it doesn't have to do
anything for, and thus didn't store persistently. After a restart, the
client can send the prior LSN that it stored persistently as an
acknowledgement, but we need to ignore such an LSN to avoid retreating
confirm_flush LSN.
Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Author: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Tested-by: Nisha Moond <nisha.moond412@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57164AB5716AF2E477D53F6F9489A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Per the letter of the C11 standard, one must #define
__STDC_WANT_LIB_EXT1__ as 1 before including <string.h> in order to
have access to memset_s(). It appears that many platforms are lenient
about this, because we weren't doing it and yet the code appeared to
work anyway. But we now find that with -std=c11, macOS is strict and
doesn't declare memset_s, leading to compile failures since we try to
use it anyway. (Given the lack of prior reports, perhaps this is new
behavior in the latest SDK? No matter, we're clearly in the wrong.)
In addition to the immediate problem, which could be fixed merely by
adding the needed #define to explicit_bzero.c, it seems possible that
our configure-time probe for memset_s() could fail in case a platform
implements the function in some odd way due to this spec requirement.
This concern can be fixed in largely the same way that we dealt with
strchrnul() in 6da2ba1d8: switch to using a declaration-based
configure probe instead of a does-it-link probe.
Back-patch to v13 where we started using memset_s().
Reported-by: Lakshmi Narayana Velayudam <dev.narayana.v@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAA4pTnLcKGG78xeOjiBr5yS7ZeE-Rh=FaFQQGOO=nPzA1L8yEA@mail.gmail.com
Backpatch-through: 13
This aligns the copyright and legal notice wordig with commit
a233a603bab8 and pgweb commit 2d764dbc083ab8. Backpatch down
to all supported versions.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Dave Page <dpage@pgadmin.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/744E414E-3F52-404C-97FB-ED9B3AA37DC8@yesql.se
Backpatch-through: 13
The documentation for log_check() had the parameters in the wrong
order. Also while there, rename %parameters to %params to better
documentation for similar functions which use %params. Backpatch
down to v14 where this was introduced.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/9F503B5-32F2-45D7-A0AE-952879AD65F1@yesql.se
Backpatch-through: 14
With GB18030 as source encoding, applications could crash the server via
SQL functions convert() or convert_from(). Applications themselves
could crash after passing unterminated GB18030 input to libpq functions
PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or
PQescapeString(). Extension code could crash by passing unterminated
GB18030 input to jsonapi.h functions. All those functions have been
intended to handle untrusted, unterminated input safely.
A crash required allocating the input such that the last byte of the
allocation was the last byte of a virtual memory page. Some malloc()
implementations take measures against that, making the SIGSEGV hard to
reach. Back-patch to v13 (all supported versions).
Author: Noah Misch <noah@leadboat.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207
Start the file with static functions not specific to pe_test_vectors
tests. This way, new tests can use them without disrupting the file's
layout. Change report_result() PQExpBuffer arguments to plain strings.
Back-patch to v13 (all supported versions), for the next commit.
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207
For self-referencing foreign keys in partitioned tables, we weren't
handling creation of pg_constraint rows during CREATE TABLE PARTITION AS
as well as ALTER TABLE ATTACH PARTITION. This is an old bug -- mostly,
we broke this in 614a406b4ff1 while trying to fix it (so 12.13, 13.9,
14.6 and 15.0 and up all behave incorrectly). This commit reverts part
of that with additional fixes for full correctness, and installs more
tests to verify the parts we broke, not just the catalog contents but
also the user-visible behavior.
Backpatch to all live branches. In branches 13 and 14, commit
46a8c27a7226 changed the behavior during DETACH to drop a FK
constraint rather than trying to repair it, because the complete fix of
repairing catalog constraints was problematic due to lack of previous
fixes. For this reason, the test behavior in those branches is a bit
different. However, as best as I can tell, the fix works correctly
there.
In release notes we have to recommend that all self-referencing foreign
keys on partitioned tables be recreated if partitions have been created
or attached after the FK was created, keeping in mind that violating
rows might already be present on the referencing side.
Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com>
Reported-by: Luca Vallisa <luca.vallisa@gmail.com>
Discussion: https://postgr.es/m/CAECtzeWHCA+6tTcm2Oh2+g7fURUJpLZb-=pRXgeWJ-Pi+VU=_w@mail.gmail.com
Discussion: https://postgr.es/m/18156-a44bc7096f0683e6@postgresql.org
Discussion: https://postgr.es/m/CAAT=myvsiF-Attja5DcWoUWh21R12R-sfXECY2-3ynt8kaOqjw@mail.gmail.com
SQL "SET search_path = 'pg_catalog, pg_temp'" is silently equivalent to
"SET search_path = pg_temp, pg_catalog, "pg_catalog, pg_temp"" instead
of the intended "SET search_path = pg_catalog, pg_temp". (The intent
was a two-element search path. With the single quotes, it instead
specifies one element with a comma and a space in the middle of the
element.) In addition to the SET statement, this affects SET clauses of
CREATE FUNCTION, ALTER ROLE, and ALTER DATABASE. It does not affect the
set_config() SQL function.
Though the documentation did not show an insecure command, remove single
quotes that could entice a reader to write an insecure command.
Back-patch to v13 (all supported versions).
Reported-by: Sven Klemm <sven@timescale.com>
Author: Sven Klemm <sven@timescale.com>
Backpatch-through: 13
While back-patching 9f45e6a91, I neglected that the convention in
pre-v16 libpq was to include a trailing newline in error message
strings (since then, we add those separately). Add them now.
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/a9c837ad-d507-4607-94e4-c5743a8f49e0@eisentraut.org
Backpatch-through: 13-15
Add a documentation warning to ts_headline() pointing out that, when
working with untrusted input documents, the output is not guaranteed
to be safe for direct inclusion in web pages. This is because, while
it does remove some XML tags from the input, it doesn't remove all
HTML markup, and so the result may be unsafe (e.g., it might permit
XSS attacks).
To guard against that, all HTML markup should be removed from the
input, making it plain text, or the output should be passed through an
HTML sanitizer.
In addition, document precisely what the default text search parser
recognises as valid XML tags, since that's what determines which XML
tags ts_headline() will remove.
Reported-by: Richard Neill <richard.neill@telos.digital>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
Clear any potential stale next_phase_at value from the snapshot
builder which otherwise may trip an assertion check ensuring
that there is no next_phase_at value.
This can be reproduced by running 80 concurrent sessions like
the below where $c is a loop counter (assumes there has been
1..$c databases created) :
echo "
CREATE TABLE replication_example(id SERIAL PRIMARY KEY,
somedata int,
text varchar(120));
SELECT 'init' FROM
pg_create_logical_replication_slot('regression_slot_$c',
'test_decoding');
SELECT data FROM
pg_logical_slot_get_changes('regression_slot_$c', NULL,
NULL, 'include-xids', '0',
'skip-empty-xacts', '1');
" | psql -d regress_$c >>psql.log &
This was originally committed as 48efb23 and backpatched down to
v16, but since then there have been reports of this happening on
v14 and v15 as well so this is a backpatch of 48efb23 down to 14.
Bug: #17695
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: bowenshi <zxwsbg@qq.com>
Reported-by: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
Reported-by: Teja Mupparti
Discussion: https://postgr.es/m/17695-6be9277c9295985f@postgresql.org
Backpatch-through: v14
DST law changes in Chile: there is a new time zone America/Coyhaique
for Chile's Aysén Region, to account for it changing to UTC-03
year-round and thus diverging from America/Santiago.
Historical corrections for Iran.
Backpatch-through: 13
During logical decoding, we advance catalog_xmin of logical too early in
fast_forward mode, resulting in required catalog data being removed by
vacuum. This mode is normally used to advance the slot without processing
the changes, but we still can't let the slot's xmin to advance to an
incorrect value.
Commit f49a80c481 fixed a similar issue where the logical slot's
catalog_xmin was getting advanced prematurely during non-fast-forward
mode. During xl_running_xacts processing, instead of directly advancing
the slot's xmin to the oldest running xid in the record, it allowed the
xmin to be held back for snapshots that can be used for
not-yet-replayed transactions, as those might consider older txns as
running too. However, it missed the fact that the same problem can happen
during fast_forward mode decoding, as we won't build a base snapshot in
that mode, and the future call to get_changes from the same slot can miss
seeing the required catalog changes leading to incorrect reslts.
This commit allows building the base snapshot even in fast_forward mode to
prevent the early advancement of xmin.
Reported-by: Amit Kapila <amit.kapila16@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
v14 commit 1f95181b44c843729caaa688f74babe9403b5850 and its v13
equivalent caused timing-dependent failures in archive recovery, at
restartpoints. The symptom was "invalid magic number 0000 in log
segment X, offset 0", "unexpected pageaddr X in log segment Y, offset 0"
[X < Y], or an assertion failure. Commit
3635a0a35aafd3bfa80b7a809bc6e91ccd36606a and predecessors back-patched
v15 changes to fix that. This test reproduces the problem
probabilistically, typically in less than 1000 iterations of the test.
Hence, buildfarm and CI runs would have surfaced enough failures to get
attention within a day.
Reported-by: Arun Thirupathi <arunth@google.com>
Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com
Backpatch-through: 13
Commit 7102070329d8147246d2791321f9915c3b5abf31 fixed a similar bug, but
it missed the case of database-wide ANALYZE ("use_own_xacts" mode).
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed consequences
from silent discard of a pg_class stats (relpages et al.) update to
ERROR "tuple to be updated was already modified". Losing a relpages
update of an ON COMMIT DELETE ROWS table was negligible, but a
COMMIT-time error isn't negligible. Back-patch to v13 (all supported
versions).
Reported-by: Richard Guo <guofenglinux@gmail.com
Reported-by: Robins Tharakan <tharakan@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-XwMKMKJ_GT=p3_-_=j9rQSEs1FbDFUnW9zHuKPsPNEQ@mail.gmail.com
Backpatch-through: 13
The original intent in heap_page_items() was to return nulls, not
throw an error or crash, if an item was sufficiently corrupt that
we couldn't safely extract data from it. However, commit d6061f83a
utterly missed that memo, and not only put in an un-length-checked
copy of the tuple's data section, but also managed to break the check
on sane nulls-bitmap length. Either mistake could possibly lead to
a SIGSEGV crash if the tuple is corrupt.
Bug: #18896
Reported-by: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18896-add267b8e06663e3@postgresql.org
Backpatch-through: 13
We'd try to drop the partitions of a partitioned index separately,
which is disallowed by the backend, leading to an error during
restore. While the error is harmless, it causes problems if you
try to use --single-transaction mode.
Fortunately, there seems no need to do a DROP at all, since the
partition will go away silently when we drop either the parent index
or the partition's table. So just make the DROP conditional on not
being a partition.
Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com
Backpatch-through: 13
If a GENERATED column is declared to have a domain data type where
the domain's constraints disallow null values, INSERT commands failed
because we built a targetlist that included coercing a null constant
to the domain's type. The failure occurred even when the generated
value would have been perfectly OK. This is adjacent to the issues
fixed in 0da39aa76, but we didn't notice for lack of testing a domain
with such a constraint.
We aren't going to use the result of the targetlist entry for the
generated column --- ExecComputeStoredGenerated will overwrite it.
So it's not really necessary that it have the exact datatype of
the generated column. This patch fixes the problem by changing
the targetlist entry to be a null Const of the domain's base type,
which should be sufficiently legal. (We do have to tweak
ExecCheckPlanOutput to accept the situation, though.)
This has been broken since we implemented generated columns.
However, this patch only applies easily as far back as v14, partly
because I (tgl) only carried 0da39aa76 back that far, but mostly
because v14 significantly refactored the handling of INSERT/UPDATE
targetlists. Given the lack of field complaints and the short
remaining support lifetime of v13, I judge the cost-benefit ratio
not good for devising a version that would work in v13.
Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxG59tip2+9h=rEv-ykOFjt0cbsPVchhi0RTij8bABBA0Q@mail.gmail.com
Backpatch-through: 14
Commit 0f21db36d made an assumption that GIN triConsistentFns
would not modify their input entryRes[] arrays. But in fact,
the "shim" triConsistentFn that we use for opclasses that don't
supply their own did exactly that, potentially leading to wrong
answers from a GIN index search. Through bad luck, none of the
test cases that we have for such opclasses exposed the bug.
One response to this could be that the assumption of consistency check
functions not modifying entryRes[] arrays is a bad one, but it still
seems reasonable to me. Notably, shimTriConsistentFn is itself
assuming that with respect to the underlying boolean consistentFn,
so it's sure being self-centered in supposing that it gets to do so.
Fortunately, it's quite simple to fix shimTriConsistentFn to restore
the entry-time state of entryRes[], so let's do that instead.
This issue doesn't affect any core GIN opclasses, since they all
supply their own triConsistentFns. It does affect contrib modules
btree_gin, hstore, and intarray.
Along the way, I (tgl) noticed that shimTriConsistentFn failed to
pick up on a "recheck" flag returned by its first call to the boolean
consistentFn. This may be only a latent problem, since it would be
unlikely for a consistentFn to set recheck for the all-false case
and not any other cases. (Indeed, none of our contrib modules do
that.) Nonetheless, it's formally wrong.
Reported-by: Vinod Sridharan <vsridh90@gmail.com>
Author: Vinod Sridharan <vsridh90@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com
Backpatch-through: 13
synchronous_standby_names cannot be reloaded safely by backends, and the
checkpointer is in charge of updating a state in shared memory if the
GUC is enabled in WalSndCtl, to let the backends know if they should
wait or not for a given LSN. This provides a strict control on the
timing of the waiting queues if the GUC is enabled or disabled, then
reloaded. The checkpointer is also in charge of waking up the backends
that could be waiting for a LSN when the GUC is disabled.
This logic had a race condition at startup, where it would be possible
for backends to not wait for a LSN even if synchronous_standby_names is
enabled. This would cause visibility issues with transactions that we
should be waiting for but they were not. The problem lasts until the
checkpointer does its initial update of the shared memory state when it
loads synchronous_standby_names.
In order to take care of this problem, the shared memory state in
WalSndCtl is extended to detect if it has been initialized by the
checkpointer, and not only check if synchronous_standby_names is
defined. In WalSndCtlData, sync_standbys_defined is renamed to
sync_standbys_status, a bits8 able to know about two states:
- If the shared memory state has been initialized. This flag is set by
the checkpointer at startup once, and never removed.
- If synchronous_standby_names is known as defined in the shared memory
state. This is the same as the previous sync_standbys_defined in
WalSndCtl.
This method gives a way for backends to decide what they should do until
the shared memory area is initialized, and they now ultimately fall back
to a check on the GUC value in this case, which is the best thing that
can be done.
Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by
the checkpointer when this process starts, so the window is very narrow.
It is possible to enlarge the problematic window by making the
checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined()
with a hardcoded sleep for example, and doing so has showed that a 2PC
visibility test is indeed failing. On machines slow enough, this bug
would cause spurious failures.
In 17~, we have looked at the possibility of adding an injection point
to have a reproducible test, but as the problematic window happens at
early startup, we would need to invent a way to make an injection point
optionally persistent across restarts when attached, something that
would be fine for this case as it would involve the checkpointer. This
issue is quite old, and can be reproduced on all the stable branches.
Author: Melnikov Maksim <m.melnikov@postgrespro.ru>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru
Backpatch-through: 13
It's been twenty years since we generated constraint names that
look like "$N". So this advice about double-quoting such names
is well past its sell-by date, and now it merely seems confusing.
Reported-by: Yaroslav Saburov <y.saburov@gmail.com>
Author: "David G. Johnston" <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/174393459040.678.17810152410419444783@wrigleys.postgresql.org
Backpatch-through: 13
Data loss can happen when the DDLs like ALTER PUBLICATION ... ADD TABLE ...
or ALTER TYPE ... that don't take a strong lock on table happens
concurrently to DMLs on the tables involved in the DDL. This happens
because logical decoding doesn't distribute invalidations to concurrent
transactions and those transactions use stale cache data to decode the
changes. The problem becomes bigger because we keep using the stale cache
even after those in-progress transactions are finished and skip the
changes required to be sent to the client.
This commit fixes the issue by distributing invalidation messages from
catalog-modifying transactions to all concurrent in-progress transactions.
This allows the necessary rebuild of the catalog cache when decoding new
changes after concurrent DDL.
We observed performance regression primarily during frequent execution of
*publication DDL* statements that modify the published tables. The
regression is minor or nearly nonexistent for DDLs that do not affect the
published tables or occur infrequently, making this a worthwhile cost to
resolve a longstanding data loss issue.
An alternative approach considered was to take a strong lock on each
affected table during publication modification. However, this would only
address issues related to publication DDLs (but not the ALTER TYPE ...)
and require locking every relation in the database for publications
created as FOR ALL TABLES, which is impractical.
The bug exists in all supported branches, but we are backpatching till 14.
The fix for 13 requires somewhat bigger changes than this fix, so the fix
for that branch is still under discussion.
Reported-by: hubert depesz lubaczewski <depesz@depesz.com>
Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Tested-by: Benoit Lobréau <benoit.lobreau@dalibo.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com
Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
Clarify the project naming in the history section of the docs
to match the recent license preamble changes.
Backpatch to all supported versions.
Author: Dave Page <dpage@pgadmin.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+OCxozLzK2+Jc14XZyWXSp6L9Ot+3efwXUE35FJG=fsbib2EA@mail.gmail.com
Backpatch-through: 13