23562 Commits

Author SHA1 Message Date
Aurelien DARRAGON
2021072391 MINOR: hlua_fcn: implement index and pair metamethods for patref class
patref object may now leverage index and pair methamethods to list and
access patref elements at a specific index (=key)

Also, patref:is_map() method may be used to know if the patref stores acl
(key only) or map-style (key:value) patterns.
2024-11-29 07:22:46 +01:00
Aurelien DARRAGON
31784efad2 MINOR: hlua: add core.get_patref method
core.get_patref() method may be used to get a reference to a pattern
object (pat_ref struct which is used for maps and acl storage) from
Lua by providing the reference name (filename for files, or prefix+name
for opt or virtual pattern references).

Lua documentation was updated.
2024-11-29 07:22:38 +01:00
Aurelien DARRAGON
956a25cf60 MINOR: hlua: add patref class
Implement patref class to expose pat_ref struct internal pattern struct
in lua. This is some prerequisite work needed to be able to manipulate
exisiting generic pattern object lists (acl/map) from Lua, because the Map
class can only be used to perform matching ops on Map files.
2024-11-29 07:22:32 +01:00
Aurelien DARRAGON
f72a66eef2 MINOR: pattern: publish event_hdl events on pat_ref updates
Now that PAT_REF events were defined in previous commit, let's actually
publish them from pattern API where relevant. Unlike server events,
pattern reference events are only published in the pat_ref subscriber's
list on purpose, because in some setups patref updates (updates performed
on a map for instance from action or cli) are very frequent, and we don't
want to impact pattern API performance just for that.

Moreover, as the main use case is to be able to subscribe to maps updates
from Lua, allowing a per-pattern reference registration is already enough.

No additional data is provided for such events (also for performance reason)

Care was taken not to publish events when the update doesn't affect the
live subset (the one targeted by curr_gen).
2024-11-29 07:22:25 +01:00
Aurelien DARRAGON
f7267bd315 MINOR: event_hdl: add PAT_REF events
This is some prerequisite work for implementing PAT_REF events.

In this commit we define the PAT_REF event_hdl family (which gets family
slot id #2), with the following supported events:

  - EVENT_HDL_SUB_PAT_REF_ADD: element was added to the current version of
    the pattern ref
  - EVENT_HDL_SUB_PAT_REF_DEL: element was deleted from the current
    version of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_SET: element was modified in the current version
    of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_COMMIT: pending element(s) was/were commited in
    the current version of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_CLEAR: all elements were cleared from the
    current version of the pattern ref

The goal is to be able to track a pat_ref struct in order to be notified
when it is updated. For performance reasons, events from this family won't
provide any additional info, and will only be published in the pat_ref
subscription list. Indeed, pat_ref may be updated at a relatively high
frequency (or worse, batch work), so we cannot afford doing expensive
treatment for each update.
2024-11-29 07:22:18 +01:00
Frederic Lecaille
f8b697c19b BUG/MINOR: improve BBR throughput on very fast links
This patch fixes the loss of information when computing the delivery rate
(quic_cc_drs.c) on links with very low latency due to usage of 32bits
variables with the millisecond as precision.

Initialize the quic_conn task with TASK_F_WANTS_TIME flag ask it to ask
the scheduler to update the call date of this task. This allows this task to get
a nanosecond resolution on the call date calling task_mono_time(). This is enabled
only for congestion control algorithms with delivery rate estimation support
(BBR only at this time).

Store the send date with nanosecond precision of each TX packet into
->time_sent_ns new quic_tx_packet struct member to store the date a packet was
sent in nanoseconds thanks to task_mono_time().

Make use of this new timestamp by the delivery rate estimation algorithm (quic_cc_drs.c).

Rename current ->time_sent member from quic_tx_packet struct to ->time_sent_ms to
distinguish the unit used by this variable (millisecond) and update the code which
uses this variable. The logic found in quic_loss.c is not modified at all.

Must be backported to 3.1.
2024-11-28 21:39:05 +01:00
Aurelien DARRAGON
e37976166b MINOR: log: always consider "+M" option in lf_text_len()
Historically, when lf_text_len() or lf_text() were called with a NULL
string and "+M" option was set, "-" would be printed.

However, if the input string was simply an empty one with len > 0, then
nothing would be printed. This can happen if lf_text() is called with
an empty string because in this case len is set to size (indeed, for
performance reasons we don't pre-compute the length, we stop as soon
as we encounter a NULL-byte)

In practise, a lot of call places making use of lf_text() or lf_text_len()
try their best to avoid calling lf_text() with an empty string, and
instead explicitly call lf_text_len() with NULL as parameter to consider
the "+M" option.

But this is not enough, as shown in GH #2797, there could still be places
where lf_text() is called with an empty string. In such case, instead of
ignoring the "+M" option, let's check after _lf_text_len() if the returned
pointer differs from the original one. If both are equal, then it means
that nothing was printed (ie: result of empty string): in that case we
check the "+M" option to print "-" when possible.

While this commit seems harmless, it's probably better to avoid
backporting it since it could break existing applications relying on the
historical behavior.
2024-11-28 13:11:11 +01:00
Aurelien DARRAGON
3e470471b7 BUG/MINOR: log: fix lf_text() behavior with empty string
As reported by Baptiste in GH #2797, if a logformat alias leveraging
lf_text() ends up printing nothing (empty string), the whole logformat
evaluation stops, leading garbage log message.

This bug was introduced during 3.0 cycle in fcb7e4b ("MINOR: log: add
lf_rawtext{_len}() functions"). At that time I genuinely thought that
if strlcpy2() returned 0, it was due to a lack of space, actually
forgetting that the function may simply be called with an empty string.

Because of that, lf_text() would return NULL if called with an empty
string, and since all lf_*() helpers are expected to return NULL on
error, this explains why the logformat evaluation immediately stops in
this case.

To fix the issue, let's simply consider that strlcpy2() returning 0 is
not an error, like it was already the case before.

It should be backported in 3.1 and 3.0 with fcb7e4b.
2024-11-28 12:10:11 +01:00
Christopher Faulet
bc66d31985 MINOR: proxy: Add support of 421-Misdirected-Request in retry-on status
The "421" status can now be specified on retry-on directives. PR_RE_* flags
were updated to remains sorted.

This patch should fix the issue #2794. It is quite simple so it may safely
be backported to 3.1 if necessary.
2024-11-28 11:47:40 +01:00
Christopher Faulet
7262433183 BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set
epoll_wait() may return EPOLLUP and/or EPOLLRDHUP after an asynchronous
connect(), to indicate that the peer accepted the connection then
immediately closed before epoll_wait() returned. When this happens,
sock_conn_check() is called to check whether or not the connection correctly
established, and after that the receive channel of the socket is assumed to
already be closed. This lets haproxy send the request at best (if RDHUP and
not HUP) then immediately close.

Over the last two years, there were a few reports about this spuriously
happening on connections where network captures proved that the server had
not closed at all (and sometimes even received the request and responded to
it after haproxy had closed). The logs show that a successful connection is
immediately reported on error after the request was sent. After
investigations, it appeared that a EPOLLUP, or eventually a EPOLLRDHUP, can
be reported by epool_wait() during the connect() but in sock_conn_check(),
the connect() reports a success. So the connection is validated but the HUP
is handled on the first receive and an error is reported.

The same behavior could be observed on health-checks, leading HAProxy to
consider the server as DOWN while it is not.

The only explanation at this point is that it is a kernel bug, notably
because it does not even match the documentation for connect() nor epoll. In
addition for now it was only observed with Ubuntu kernels 5.4 and 5.15 and
was never reproduced on any other one.

We have no reproducer but here is the typical strace observed:

socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 114
fcntl(114, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(114, SOL_TCP, TCP_NODELAY, [1], 4) = 0
connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(19, EPOLL_CTL_ADD, 114, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=114, u64=114}}) = 0
epoll_wait(19, [{events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=151, u64=151}}, {events=EPOLLIN, data={u32=59, u64=59}}, {events=EPOLLIN|EPOLLRDHUP, data={u32=114, u64=114}}], 200, 0) = 4
epoll_ctl(19, EPOLL_CTL_MOD, 114, {events=EPOLLOUT, data={u32=114, u64=114}}) = 0
epoll_wait(19, [{events=EPOLLOUT, data={u32=114, u64=114}}, {events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=10, u64=10}}, {events=EPOLLIN, data={u32=165, u64=165}}], 200, 0) = 4
connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = 0
sendto(114, "POST "..., 1009, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 1009
close(114)                              = 0

Some ressources about this issue:
  - https://www.spinics.net/lists/netdev/msg876470.html
  - https://github.com/haproxy/haproxy/issues/1863
  - https://github.com/haproxy/haproxy/issues/2368

So, to workaround the issue, we have decided to remove FD_POLL_HUP flag on
the FD during the connection establishement if FD_POLL_ERR is not reported
too in sock_conn_check(). This way, the call to connect() is able to
validate or reject the connection. At the end, if the HUP or RDHUP flags
were valid, either connect() would report the error itself, or the next
recv() would return 0 confirming the closure that the poller tried to
report. EPOLL_RDHUP is only an optimization to save a syscall anyway, and
this pattern is so rare that nobody will ever notice the extra call to
recv().

Please note that at least one reporter confirmed that using poll() instead
of epoll() also addressed the problem, so that can also be a temporary
workaround for those discovering the problem without the ability to
immediately upgrade.

The event is accounted via a COUNT_IF(), to be able to spot it in future
issue. Just in case.

This patch should fix the issue #1863 and #2368. It may be related
to #2751. It should be backported as far as 2.4. In 3.0 and below, the
COUNT_IF() must be removed.
2024-11-27 12:16:25 +01:00
Willy Tarreau
eea2697e95 DEV: patchbot: prepare for new version 3.2-dev
The bot will now load the prompt for the upcoming 3.2 version so we have
to rename the files and update their contents to match the current version.
2024-11-26 17:24:21 +01:00
Willy Tarreau
97d33abb23 MINOR: version: this is development again (3.2)
This basically reverts commit b629f366a7 ("MINOR: version: mention that
3.1 is stable now").
2024-11-26 17:21:16 +01:00
Aurelien DARRAGON
aa69a02d7f MEDIUM: pattern: always consider gen_id for pat_ref lookup operations
Historically, pat_ref lookup operations were performed on the whole
pat_ref elements list. As such, set, find and delete operations on a given
key would cause any matching element in pat_ref to be considered.

When prepare/commit operations were added, gen_id was impelemnted in
order to be able to work on a subset from pat_ref without impacting
the current (live) version from pat_ref, until a new subset is committed
to replace the current one.

While the logic was good, there remained a design flaw from the historical
implementation: indeed, legacy functions such as pat_ref_set(),
pat_ref_delete() and pat_ref_find_elt() kept performing the lookups on the
whole set of elements instead of considering only elements from the current
subset. Because of this, mixing new prepare/commit operations with legacy
operations could yield unexpected results.

For instance, before this commit:

  echo "add map #0 key oldvalue" | socat /tmp/ha.sock -
  echo "prepare map #0" | socat /tmp/ha.sock -
  New version created: 1
  echo "add map @1 #0 key newvalue" | socat /tmp/ha.sock -
  echo "del map #0 key" | socat /tmp/ha.sock -
  echo "commit map @1 #0" | socat /tmp/ha.sock -

  -> the result would be that "key" entry doesn't exist anymore after the
  commit, while we would expect the new value to be there instead.

Thanks to the previous commits, we may finally fix this issue: for set,
find_elt and delete operations, the current generation id is considered.

With the above example, it means that the "del map #0 key" would only
target elements from the current subset, thus elements in "version 1" of
the map would be immune to the delete (as we would expect it to work).
2024-11-26 16:12:31 +01:00
Aurelien DARRAGON
010c34b8c7 MEDIUM: pattern: consider gen_id in pat_ref_set_from_node()
Don't set all duplicates from a given node if they don't have the same
gen_id. Indeed, now we consider the gen_id to only work on the same
pattern ref revision.
2024-11-26 16:12:26 +01:00
Aurelien DARRAGON
4792f27892 MINOR: pattern: add pat_ref_gen_delete() function
pat_ref_gen_delete(ref, gen_id, key) tries to delete all samples belonging
to <gen_id> and matching <key> under <ref>

The goal is to be able to target a single subset from <ref>
2024-11-26 16:12:21 +01:00
Aurelien DARRAGON
a131c542a6 MINOR: pattern: add pat_ref_gen_find_elt() function
pat_ref_gen_find_elt(ref, gen_id, key) tries to find <elt> element
belonging to <gen_id> and matching <key> in <ref> reference.

The goal is to be able to target a single subset from <ref>
2024-11-26 16:12:16 +01:00
Aurelien DARRAGON
c9d6af3c6d MINOR: pattern: add pat_ref_gen_set() function
pat_ref_gen_set(ref, gen_id, value, err) modifies to <value> the sample
of all patterns matching <key> and belonging to <gen_id> (generation id)
under <ref>

The goal is to be able to target a single subset from <ref>
2024-11-26 16:12:11 +01:00
Aurelien DARRAGON
3d250b3be8 MINOR: pattern: split pat_ref_set()
split pat_ref_set() function in 2 distinct functions. Indeed, since
0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for
"set-map", "add-acl" action perfs)"), pat_ref_set() prototype was updated
to include an extra <elt> argument. But the logic behind is not explicit
because the function will not only try to set <elt>, but also its
duplicate (unlike pat_ref_set_elt() which only tries to update <elt>).

Thus, to make it clearer and better distinguish between the key-based
lookup version and the elt-based one, restotre pat_ref_set() previous
prototype and add a dedicated pat_ref_set_elt_duplicate() that takes
<elt> as argument and tries to update <elt> and all duplicates.
2024-11-26 16:12:05 +01:00
Willy Tarreau
4d58f521ee [RELEASE] Released version 3.2-dev0
Released version 3.2-dev0 with the following main changes :
    - exact copy of 3.1.0
v3.2-dev0
2024-11-26 15:33:57 +01:00
Willy Tarreau
f2b97918e8 [RELEASE] Released version 3.1.0
Released version 3.1.0 with the following main changes :
    - BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line
    - BUILD: activity/memprofile: fix a build warning in the posix_memalign handler
    - BUG/MINOR: quic: Avoid BUG_ON() on ->on_pkt_lost() BBR callback call
    - CI: update to the latest AWS-LC version
    - CI: update to the latest WolfSSL version
    - DOC: ot: mention planned deprecation of the OT filter
    - Revert "CI: update to the latest WolfSSL version"
    - CI: github: add a WolfSSL job which tries the latest version
    - BUILD: systemd: fix usage of reserved name "sun" in the address field
    - BUILD: init: use the more portable FD_CLOEXEC for /dev/null
    - CI: github: improve the Wolfssl job
    - CI: github: improve the AWS-LC job
    - BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes
    - BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return
    - MINOR: mux-quic: use sched call time for pacing
    - CI: github: allow to run the Illumos job manually
    - BUILD: tcp_sample: var_fc_counter defined but not used
    - CI: github: add 'workflow_dispatch' on remaining build jobs
    - DOC: config: refine a little bit the text on QUIC pacing
    - MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros
    - MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure
    - REORG: startup: move on_new_child_failure in mworker.c
    - MINOR: startup: prefix prepare_master and run_master with mworker_*
    - REORG: startup: move mworker_prepare_master in mworker.c
    - MINOR: startup: keep updating verbosity modes only in haproxy.c
    - REORG: startup: move mworker_run_master and mworker_loop in mworker.c
    - REORG: startup: move mworker_reexec and mworker_reload in mworker.c
    - MINOR: startup: prefix apply_master_worker_mode with mworker_*
    - REORG: startup: move mworker_apply_master_worker_mode in mworker.c
    - MINOR: cfgparse-quic: strengthen quic-cc-algo parsing
    - BUG/MAJOR: quic: fix wrong packet building due to already acked frames
    - DEV: lags/show-sess-to-flags: Properly handle fd state on server side
    - BUG/MEDIUM: http-ana: Don't release too early the L7 buffer
    - MINOR: quic: make bbr consider the max window size setting
    - DOC: quic: Amend the pacing information about BBR.
    - BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize
    - MINOR: cli: Add a "help" keyword to show sess
    - MINOR: cli/quic: Add a "help" keyword to show quic
    - DOC: management: mention "show sess help" and "show quic help"
    - DOC: install: update the list of supported versions
    - MINOR: version: mention that 3.1 is stable now
v3.1.0
2024-11-26 15:24:10 +01:00
Christopher Faulet
b629f366a7 MINOR: version: mention that 3.1 is stable now
This version will be maintained up to around Q1 2026. The INSTALL file
also mentions it.
2024-11-26 15:23:54 +01:00
Willy Tarreau
0a406054c7 DOC: install: update the list of supported versions
OpenSSL up to 3.4 was tested, and gcc up to 14 was tested, so let's
reflect this in the install doc.
2024-11-26 15:23:54 +01:00
Willy Tarreau
16022c2a7b DOC: management: mention "show sess help" and "show quic help"
These ones were recently added but we forgot to update the doc.
2024-11-26 15:00:51 +01:00
Olivier Houchard
4f973ab23a MINOR: cli/quic: Add a "help" keyword to show quic
Add a help keyword to show quic, that will provide a longer explanation
of all the available options than what is provided by the command "help".
2024-11-26 14:55:30 +01:00
Olivier Houchard
5288d0f47b MINOR: cli: Add a "help" keyword to show sess
Add a help keyword to show sess, that will provide a longer explanation of
all the available options than what is provided by the command "help".
2024-11-26 14:55:30 +01:00
Amaury Denoyelle
2fffd85b97 BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize
A UDP datagram cannot be greater than 65535 bytes, as UDP length header
field is encoded on 2 bytes. As such, sendmsg() will reject a bigger
input with error EMSGSIZE. By default, this does not cause any issue as
QUIC datagrams are limited to 1.252 bytes and sent individually.

However, with GSO support, value bigger than 1.252 bytes are specified
on sendmsg(). If using a bufsize equal to or greater than 65535, syscall
could reject the input buffer with EMSGSIZE. As this value is not
expected, the connection is immediately closed by haproxy and the
transfer is interrupted.

This bug can easily reproduced by requesting a large object on loopback
interface and using a bufsize of 65535 bytes. In fact, the limit is
slightly less than 65535, as extra room is also needed for IP + UDP
headers.

Fix this by reducing the count of datagrams encoded in a single GSO
invokation via qc_prep_pkts(). Previously, it was set to 64 as specified
by man 7 udp. However, with 1252 datagrams, this is still too many.
Reduce it to a value of 52. Input to sendmsg will thus be restricted to
at most 65.104 bytes if last datagram is full.

If there is still data available for encoding in qc_prep_pkts(), they
will be written in a separate batch of datagrams. qc_send_ppkts() will
then loop over the whole QUIC Tx buffer and call sendmsg() for each
series of at most 52 datagrams.

This does not need to be backported.
2024-11-26 11:49:30 +01:00
Frederic Lecaille
3cee8d7830 DOC: quic: Amend the pacing information about BBR.
BBR handles itself its own burst size (mentioned as send_quantum in BBR RFC).
2024-11-26 08:00:58 +01:00
Frederic Lecaille
a3248a39eb MINOR: quic: make bbr consider the max window size setting
Limit the BBR congestion control window size as this is done for all the others
congestion control algorithms with tune.quic.frontend.default-max-window-size
or as first argument passed to "bbr" option for "quic-cc-algo".
2024-11-26 08:00:58 +01:00
Christopher Faulet
dc15581c02 BUG/MEDIUM: http-ana: Don't release too early the L7 buffer
In some cases, the buffer used to store the request to be able to perform a
L7 retry is released released too early, leading to a crash because a retry
is performed with an empty request.

First, there is a test on invalid 101 responses that may be caught by the
"junk-response" retry policy. Then, it is possible to get an error
(empty-response, bad status code...) after an interim response. In both
cases, the L7 buffer is already released while it should not.

To fix the issue, the L7 buffer is now released at the end of the
AN_RES_WAIT_HTTP analyser, but only when a response was successfully
received and processed. In all error cases, the stream is quickly released,
with the L7 buffer. So there is no leak and it is safer this way.

This patch may fix the issue #2793. It must be as far as 2.4.
2024-11-25 22:18:19 +01:00
Christopher Faulet
ceb80aed57 DEV: lags/show-sess-to-flags: Properly handle fd state on server side
It must be handled as an hexadecimal value.
2024-11-25 21:57:30 +01:00
Frederic Lecaille
96b2641fc8 BUG/MAJOR: quic: fix wrong packet building due to already acked frames
If a packet build was asked to probe the peer with frames which have just
been acked, the frames build run by qc_build_frms() could be cancelled  by
qc_stream_frm_is_acked() whose aim is to check that current frames to
be built have not been already acknowledged. In this case the packet build run
by qc_do_build_pkt() is not interrupted, leading to the build of an empty packet
which should be ack-eliciting.

This is a bug detected by the BUG_ON() statement in qc_do_build_pk():

	    BUG_ON(qel->pktns->tx.pto_probe &&
           !(pkt->flags & QUIC_FL_TX_PACKET_ACK_ELICITING));

Thank you to @Tristan971 for having reported this issue in GH #2709

This is an old bug which must be backported as far as 2.6.
2024-11-25 18:55:45 +01:00
Amaury Denoyelle
d41273c633 MINOR: cfgparse-quic: strengthen quic-cc-algo parsing
quic-cc-algo is a bind keyword which is used to specify the congestion
control algorithm. It is parsed via function bind_parse_quic_cc_algo().

The parsing function was too laxed as it used strncmp for algo token
matching. This could cause surprise if specifying an invalid algorithm
but starting identically to another entry. Especially if extra
parameters are specified in parenthesis, as in this case parameters
value will be completely ignored and default value used instead.

To fix this, convert algo argument to ist. Then, use istsplit() to
extract algo token from the optional extra arguments and compare the
whole value with isteq().
2024-11-25 16:19:54 +01:00
Valentine Krasnobaeva
3500865bc1 REORG: startup: move mworker_apply_master_worker_mode in mworker.c
mworker_apply_master_worker_mode() is called only in master-worker mode, so
let's move it mworker.c
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
3899a7ecaa MINOR: startup: prefix apply_master_worker_mode with mworker_*
This patch prepares the move of apply_master_worker_mode in mworker.c. So,
let's at first rename it to mworker_apply_master_worker_mode.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
dee247c14e REORG: startup: move mworker_reexec and mworker_reload in mworker.c
Let's move mworker_reexec() and mworker_reload() in mworker.c. mworker_reload()
is called only within the functions, which are already in mworker.c. So, this
reorganization allows to declare mworker_reload() as a static.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
0c7b93eb1d REORG: startup: move mworker_run_master and mworker_loop in mworker.c
mworker_run_master() is called only in master mode. mworker_loop() is static
and called only in mworker_run_master(). So let's move these both functions in
mworker.c.

We also need here to make run_thread_poll_loop() accessible from other units,
as it's used in mworker_loop().
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
56894db000 MINOR: startup: keep updating verbosity modes only in haproxy.c
This commit prepares the move of mworker_run_master() in mworker.c.

Let's remove from it's definition the code, which adjusts verbosity in
dependency of other global run time modes (daemon or foreground). This part
should stay in main(), where all verbosity modes are handeled for
different mode combinations.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
7974089ac6 REORG: startup: move mworker_prepare_master in mworker.c
mworker_prepare_master() performs some preparation routines for the new worker
process, which will be forked during the startup. It's called only in
master-worker mode, so let's move it in mworker.c.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
41cc1fe310 MINOR: startup: prefix prepare_master and run_master with mworker_*
This patch prepares the move of prepare_master() and run_master() definitions
into mworker.c. So, let's at first prefix its names with mworker_*.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
af642420b4 REORG: startup: move on_new_child_failure in mworker.c
mworker_on_new_child_failure() performs some routines for the worker process,
if it has failed the reload. As it's called only in mworker_catch_sigchld()
from mworker.c, let's move mworker_on_new_child_failure() in mworker.c as well.
Like this it could also be declared as a static.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
321c021a83 MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure
This patch prepares the moving of on_new_child_failure definition into
mworker.c. So, let's rename it accordingly and let's also update its
description.
2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva
10c14a1ed0 MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros
In master-worker mode, worker process uses now send_fd_uxst() to send
'_send_status' command to master. Since refactoring, this started to trigger
the following Valgrind reports:

==810584== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==810584==    at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28)
==810584==    by 0x4AAC99D: sendmsg (sendmsg.c:25)
==810584==    by 0x56350F: send_fd_uxst (proto_sockpair.c:271)
==810584==    by 0x3AA25C: main (haproxy.c:4151)
==810584==  Address 0x1ffefffbfe is on thread 1's stack
==810584==  in frame #1, created by send_fd_uxst (proto_sockpair.c:241)
==810584==
==810584== Syscall param sendmsg(msg.msg_control) points to uninitialised byte(s)
==810584==    at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28)
==810584==    by 0x4AAC99D: sendmsg (sendmsg.c:25)
==810584==    by 0x56350F: send_fd_uxst (proto_sockpair.c:271)
==810584==    by 0x3AA25C: main (haproxy.c:4151)
==810584==  Address 0x1ffefffc14 is on thread 1's stack
==810584==  in frame #1, created by send_fd_uxst (proto_sockpair.c:241)
==810584==

So, let's initialize with zeros all buffers, which are passed to sendmsg
syscall(), used in send_fd_uxst() to avoid these Valgrind messages. They
increase Valgrind output and could make unnoticeable some other, more important
reports.
2024-11-25 15:20:24 +01:00
Willy Tarreau
7fb98e833c DOC: config: refine a little bit the text on QUIC pacing
The QUIC pacing options changed a few times during their development.
For example the unit is now in datagrams not bytes. Also a few
sentences were slightly ambiguous so let's reword this.

No backport is needed.
2024-11-25 14:54:16 +01:00
William Lallemand
dee3f4b3ff CI: github: add 'workflow_dispatch' on remaining build jobs
Add 'workflow_dispatch' on the remaining scheduled build jobs that does
not have it.

This keyword allows to start manually a job from the "Actions" interface
in github.
2024-11-25 14:03:13 +01:00
William Lallemand
da1331b0b5 BUILD: tcp_sample: var_fc_counter defined but not used
var_fc_counter is not used on Illumos and emit a warning

  src/tcp_sample.c:291:12: warning: ‘var_fc_counter’ defined but not used [-Wunused-function]
    291 | static int var_fc_counter(struct arg *args, char **err)
        |            ^~~~~~~~~~~~~~

Let's add an ifdef to build it.
2024-11-25 11:41:26 +01:00
William Lallemand
079193e375 CI: github: allow to run the Illumos job manually
Add the "workflow_dispatch" option to the Illumos CI so it can be run
manually from the github actions page.
2024-11-25 11:30:55 +01:00
Amaury Denoyelle
22bd92a87f MINOR: mux-quic: use sched call time for pacing
QUIC pacing was recently implemented to limit burst and improve overall
bandwidth. This is used only for MUX STREAM emission. Pacing requires
nanosecond resolution. As such, it used now_cpu_time() which relies on
clock_gettime() syscall.

The usage of clock_gettime() has several drawbacks :
* it is a syscall and thus requires a context-switch which may hurt
  performance
* it is not be available on all systems
* timestamp is retrieved multiple times during a single task execution,
  thus yielding different values which may tamper pacing calculation

Improve this by using task_mono_time() instead. This returns task call
time from the scheduler thread context. It requires the flag
TASK_F_WANTS_TIME on QUIC MUX tasklet to force the scheduler to update
call time with now_mono_time(). This solves every limitations listed
above :
* syscall invokation is only performed once before tasklet execution,
  thus reducing context-switch impact
* on non compatible system, a millisecond timer is used as a fallback
  which should ensure that pacing works decently for them
* timer value is now guaranteed to be fixed duing task execution
2024-11-25 11:21:45 +01:00
Amaury Denoyelle
044452546e BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return
qc_prep_pkts() is a QUIC transport level function which encodes one or
several datagrams in a buffer before sending them. It returns the number
of encoded datagram. This is especially important when pacing is used to
limit packet bursts.

This datagram accounting was not trivial as qc_prep_pkts() used several
code paths depending on the condition of the current encoded packet.
Thus, there were several places were the local variable dgram_cnt could
have been incremented. This was implemented by the following commit :

  commit 5cb8f8a6224db96f4386277c41ddae4a29a4130d
  MINOR: quic: support a max number of built packet per send iteration

However, there is a bug due to a missing increment when all frames from
the current QEL have been encoded. In this case, the encoding continue
in the same datagram to coalesce a futur packet. However, if this is the
last QEL, encoding loop will then break. As first_pkt is not NULL,
qc_txb_store() is called outside but dgram_cnt is yet not incremented.

In particular, this causes qc_prep_pkts() to return 0 when there is only
small STREAM frames to emit for application QEL. In qc_send(), this is
interpreted as a value which prevents further emission for the current
invokation. Thus, it may hurts performance, both without and with
pacing.

To fix this, removing multiple dgram_cnt increment. Now, it is modified
only in a single place which should cover every case, and render the
code easier to validate.

The most notable case where the bug is visible is when using cubic with
pacing without any burst, with quic-cc-algo cubic(,1). First, transfer
bandwidth in average was suboptimal, with significant variation. Worst,
it could sometimes fall dramatically for a particular stream without
recovering before returning to an expected level on the next one.

No need to backport.
2024-11-25 11:21:28 +01:00
Amaury Denoyelle
3704e0e174 BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes
On show quic, each MUX streams are listed with their various indicator
for buffering on Rx and Tx. In particular, txoff displays in parenthesis
the current level of data prepared by the upper stream instance not yet
emitted by QUIC transport layer.

This value is only accessible after a substract operation. However,
there was a typo which caused the result to be always 0. Fix this by
reusing the correct offsets in the calculation.

This should be backported up to 3.0.
2024-11-25 11:21:28 +01:00
William Lallemand
a7e5180c71 CI: github: improve the AWS-LC job
Like the WolfSSL job, improve the AWS-LC job by adding the socat command
so all SSL reg-tests can be run.
Also add gdb and output of corefiles.
2024-11-25 11:14:33 +01:00