haproxy

Author	SHA1	Message	Date
Aurelien DARRAGON	2021072391	MINOR: hlua_fcn: implement index and pair metamethods for patref class patref object may now leverage index and pair metamethods to list and access patref elements at a specific index (=key). Also, patref:is_map() method may be used to know if the patref stores acl (key only) or map-style (key:value) patterns.	2024-11-29 07:22:46 +01:00
Aurelien DARRAGON	31784efad2	MINOR: hlua: add core.get_patref method core.get_patref() method may be used to get a reference to a pattern object (pat_ref struct which is used for maps and acl storage) from Lua by providing the reference name (filename for files, or prefix+name for opt or virtual pattern references). Lua documentation was updated.	2024-11-29 07:22:38 +01:00
Aurelien DARRAGON	956a25cf60	MINOR: hlua: add patref class Implement patref class to expose pat_ref struct internal pattern struct in lua. This is some prerequisite work needed to be able to manipulate existing generic pattern object lists (acl/map) from Lua, because the Map class can only be used to perform matching ops on Map files.	2024-11-29 07:22:32 +01:00
Aurelien DARRAGON	f72a66eef2	MINOR: pattern: publish event_hdl events on pat_ref updates Now that PAT_REF events were defined in previous commit, let's actually publish them from pattern API where relevant. Unlike server events, pattern reference events are only published in the pat_ref subscriber's list on purpose, because in some setups patref updates (updates performed on a map for instance from action or cli) are very frequent, and we don't want to impact pattern API performance just for that. Moreover, as the main use case is to be able to subscribe to maps updates from Lua, allowing a per-pattern reference registration is already enough. No additional data is provided for such events (also for performance reason) Care was taken not to publish events when the update doesn't affect the live subset (the one targeted by curr_gen).	2024-11-29 07:22:25 +01:00
Aurelien DARRAGON	f7267bd315	MINOR: event_hdl: add PAT_REF events This is some prerequisite work for implementing PAT_REF events. In this commit we define the PAT_REF event_hdl family (which gets family slot id #2), with the following supported events: - EVENT_HDL_SUB_PAT_REF_ADD: element was added to the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_DEL: element was deleted from the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_SET: element was modified in the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_COMMIT: pending element(s) was/were committed in the current version of the pattern ref - EVENT_HDL_SUB_PAT_REF_CLEAR: all elements were cleared from the current version of the pattern ref The goal is to be able to track a pat_ref struct in order to be notified when it is updated. For performance reasons, events from this family won't provide any additional info, and will only be published in the pat_ref subscription list. Indeed, pat_ref may be updated at a relatively high frequency (or worse, batch work), so we cannot afford doing expensive treatment for each update.	2024-11-29 07:22:18 +01:00
Frederic Lecaille	f8b697c19b	BUG/MINOR: improve BBR throughput on very fast links This patch fixes the loss of information when computing the delivery rate (quic_cc_drs.c) on links with very low latency due to usage of 32-bit variables with millisecond precision. Initialize the quic_conn task with the TASK_F_WANTS_TIME flag to ask the scheduler to update the call date of this task. This allows this task to get a nanosecond resolution on the call date by calling task_mono_time(). This is enabled only for congestion control algorithms with delivery rate estimation support (BBR only at this time). Store the send date of each TX packet with nanosecond precision, obtained via task_mono_time(), in the new ->time_sent_ns member of the quic_tx_packet struct. Make use of this new timestamp in the delivery rate estimation algorithm (quic_cc_drs.c). Rename the current ->time_sent member of the quic_tx_packet struct to ->time_sent_ms to make its unit (millisecond) explicit, and update the code which uses this variable. The logic found in quic_loss.c is not modified at all. Must be backported to 3.1.	2024-11-28 21:39:05 +01:00
Aurelien DARRAGON	e37976166b	MINOR: log: always consider "+M" option in lf_text_len() Historically, when lf_text_len() or lf_text() were called with a NULL string and "+M" option was set, "-" would be printed. However, if the input string was simply an empty one with len > 0, then nothing would be printed. This can happen if lf_text() is called with an empty string because in this case len is set to size (indeed, for performance reasons we don't pre-compute the length, we stop as soon as we encounter a NULL-byte) In practise, a lot of call places making use of lf_text() or lf_text_len() try their best to avoid calling lf_text() with an empty string, and instead explicitly call lf_text_len() with NULL as parameter to consider the "+M" option. But this is not enough, as shown in GH #2797, there could still be places where lf_text() is called with an empty string. In such case, instead of ignoring the "+M" option, let's check after _lf_text_len() if the returned pointer differs from the original one. If both are equal, then it means that nothing was printed (ie: result of empty string): in that case we check the "+M" option to print "-" when possible. While this commit seems harmless, it's probably better to avoid backporting it since it could break existing applications relying on the historical behavior.	2024-11-28 13:11:11 +01:00
Aurelien DARRAGON	3e470471b7	BUG/MINOR: log: fix lf_text() behavior with empty string As reported by Baptiste in GH #2797, if a logformat alias leveraging lf_text() ends up printing nothing (empty string), the whole logformat evaluation stops, leading to a garbage log message. This bug was introduced during 3.0 cycle in fcb7e4b ("MINOR: log: add lf_rawtext{_len}() functions"). At that time I genuinely thought that if strlcpy2() returned 0, it was due to a lack of space, actually forgetting that the function may simply be called with an empty string. Because of that, lf_text() would return NULL if called with an empty string, and since all lf_*() helpers are expected to return NULL on error, this explains why the logformat evaluation immediately stops in this case. To fix the issue, let's simply consider that strlcpy2() returning 0 is not an error, like it was already the case before. It should be backported in 3.1 and 3.0 with fcb7e4b.	2024-11-28 12:10:11 +01:00
Christopher Faulet	bc66d31985	MINOR: proxy: Add support of 421-Misdirected-Request in retry-on status The "421" status can now be specified on retry-on directives. PR_RE_* flags were updated to remain sorted. This patch should fix the issue #2794. It is quite simple so it may safely be backported to 3.1 if necessary.	2024-11-28 11:47:40 +01:00
Christopher Faulet	7262433183	BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set epoll_wait() may return EPOLLHUP and/or EPOLLRDHUP after an asynchronous connect(), to indicate that the peer accepted the connection then immediately closed before epoll_wait() returned. When this happens, sock_conn_check() is called to check whether or not the connection was correctly established, and after that the receive channel of the socket is assumed to already be closed. This lets haproxy send the request at best (if RDHUP and not HUP) then immediately close. Over the last two years, there were a few reports about this spuriously happening on connections where network captures proved that the server had not closed at all (and sometimes even received the request and responded to it after haproxy had closed). The logs show that a successful connection is immediately reported on error after the request was sent. After investigations, it appeared that an EPOLLHUP, or possibly an EPOLLRDHUP, can be reported by epoll_wait() during the connect() but in sock_conn_check(), the connect() reports a success. So the connection is validated but the HUP is handled on the first receive and an error is reported. The same behavior could be observed on health-checks, leading HAProxy to consider the server as DOWN while it is not. The only explanation at this point is that it is a kernel bug, notably because it does not even match the documentation for connect() nor epoll. In addition for now it was only observed with Ubuntu kernels 5.4 and 5.15 and was never reproduced on any other one. 
We have no reproducer but here is the typical strace observed: socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 114 fcntl(114, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 setsockopt(114, SOL_TCP, TCP_NODELAY, [1], 4) = 0 connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = -1 EINPROGRESS (Operation now in progress) epoll_ctl(19, EPOLL_CTL_ADD, 114, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=114, u64=114}}) = 0 epoll_wait(19, [{events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=151, u64=151}}, {events=EPOLLIN, data={u32=59, u64=59}}, {events=EPOLLIN|EPOLLRDHUP, data={u32=114, u64=114}}], 200, 0) = 4 epoll_ctl(19, EPOLL_CTL_MOD, 114, {events=EPOLLOUT, data={u32=114, u64=114}}) = 0 epoll_wait(19, [{events=EPOLLOUT, data={u32=114, u64=114}}, {events=EPOLLIN, data={u32=15, u64=15}}, {events=EPOLLIN, data={u32=10, u64=10}}, {events=EPOLLIN, data={u32=165, u64=165}}], 200, 0) = 4 connect(114, {sa_family=AF_INET, sin_port=htons(11000), sin_addr=inet_addr("A.B.C.D")}, 16) = 0 sendto(114, "POST "..., 1009, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 1009 close(114) = 0 Some resources about this issue: - https://www.spinics.net/lists/netdev/msg876470.html - https://github.com/haproxy/haproxy/issues/1863 - https://github.com/haproxy/haproxy/issues/2368 So, to work around the issue, we have decided to remove FD_POLL_HUP flag on the FD during the connection establishment if FD_POLL_ERR is not reported too in sock_conn_check(). This way, the call to connect() is able to validate or reject the connection. At the end, if the HUP or RDHUP flags were valid, either connect() would report the error itself, or the next recv() would return 0 confirming the closure that the poller tried to report. EPOLLRDHUP is only an optimization to save a syscall anyway, and this pattern is so rare that nobody will ever notice the extra call to recv(). 
Please note that at least one reporter confirmed that using poll() instead of epoll() also addressed the problem, so that can also be a temporary workaround for those discovering the problem without the ability to immediately upgrade. The event is accounted via a COUNT_IF(), to be able to spot it in future issues. Just in case. This patch should fix the issue #1863 and #2368. It may be related to #2751. It should be backported as far as 2.4. In 3.0 and below, the COUNT_IF() must be removed.	2024-11-27 12:16:25 +01:00
Willy Tarreau	eea2697e95	DEV: patchbot: prepare for new version 3.2-dev The bot will now load the prompt for the upcoming 3.2 version so we have to rename the files and update their contents to match the current version.	2024-11-26 17:24:21 +01:00
Willy Tarreau	97d33abb23	MINOR: version: this is development again (3.2) This basically reverts commit b629f366a7 ("MINOR: version: mention that 3.1 is stable now").	2024-11-26 17:21:16 +01:00
Aurelien DARRAGON	aa69a02d7f	MEDIUM: pattern: always consider gen_id for pat_ref lookup operations Historically, pat_ref lookup operations were performed on the whole pat_ref elements list. As such, set, find and delete operations on a given key would cause any matching element in pat_ref to be considered. When prepare/commit operations were added, gen_id was implemented in order to be able to work on a subset from pat_ref without impacting the current (live) version from pat_ref, until a new subset is committed to replace the current one. While the logic was good, there remained a design flaw from the historical implementation: indeed, legacy functions such as pat_ref_set(), pat_ref_delete() and pat_ref_find_elt() kept performing the lookups on the whole set of elements instead of considering only elements from the current subset. Because of this, mixing new prepare/commit operations with legacy operations could yield unexpected results. For instance, before this commit: echo "add map #0 key oldvalue" | socat /tmp/ha.sock - echo "prepare map #0" | socat /tmp/ha.sock - New version created: 1 echo "add map @1 #0 key newvalue" | socat /tmp/ha.sock - echo "del map #0 key" | socat /tmp/ha.sock - echo "commit map @1 #0" | socat /tmp/ha.sock - -> the result would be that "key" entry doesn't exist anymore after the commit, while we would expect the new value to be there instead. Thanks to the previous commits, we may finally fix this issue: for set, find_elt and delete operations, the current generation id is considered. With the above example, it means that the "del map #0 key" would only target elements from the current subset, thus elements in "version 1" of the map would be immune to the delete (as we would expect it to work).	2024-11-26 16:12:31 +01:00
Aurelien DARRAGON	010c34b8c7	MEDIUM: pattern: consider gen_id in pat_ref_set_from_node() Don't set all duplicates from a given node if they don't have the same gen_id. Indeed, now we consider the gen_id to only work on the same pattern ref revision.	2024-11-26 16:12:26 +01:00
Aurelien DARRAGON	4792f27892	MINOR: pattern: add pat_ref_gen_delete() function pat_ref_gen_delete(ref, gen_id, key) tries to delete all samples belonging to <gen_id> and matching <key> under <ref> The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:21 +01:00
Aurelien DARRAGON	a131c542a6	MINOR: pattern: add pat_ref_gen_find_elt() function pat_ref_gen_find_elt(ref, gen_id, key) tries to find <elt> element belonging to <gen_id> and matching <key> in <ref> reference. The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:16 +01:00
Aurelien DARRAGON	c9d6af3c6d	MINOR: pattern: add pat_ref_gen_set() function pat_ref_gen_set(ref, gen_id, value, err) modifies to <value> the sample of all patterns matching <key> and belonging to <gen_id> (generation id) under <ref> The goal is to be able to target a single subset from <ref>	2024-11-26 16:12:11 +01:00
Aurelien DARRAGON	3d250b3be8	MINOR: pattern: split pat_ref_set() split pat_ref_set() function in 2 distinct functions. Indeed, since 0844bed7d3 ("MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)"), pat_ref_set() prototype was updated to include an extra <elt> argument. But the logic behind is not explicit because the function will not only try to set <elt>, but also its duplicate (unlike pat_ref_set_elt() which only tries to update <elt>). Thus, to make it clearer and better distinguish between the key-based lookup version and the elt-based one, restore pat_ref_set() previous prototype and add a dedicated pat_ref_set_elt_duplicate() that takes <elt> as argument and tries to update <elt> and all duplicates.	2024-11-26 16:12:05 +01:00
Willy Tarreau	4d58f521ee	[RELEASE] Released version 3.2-dev0 Released version 3.2-dev0 with the following main changes : - exact copy of 3.1.0 v3.2-dev0	2024-11-26 15:33:57 +01:00
Willy Tarreau	f2b97918e8	[RELEASE] Released version 3.1.0 Released version 3.1.0 with the following main changes : - BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line - BUILD: activity/memprofile: fix a build warning in the posix_memalign handler - BUG/MINOR: quic: Avoid BUG_ON() on ->on_pkt_lost() BBR callback call - CI: update to the latest AWS-LC version - CI: update to the latest WolfSSL version - DOC: ot: mention planned deprecation of the OT filter - Revert "CI: update to the latest WolfSSL version" - CI: github: add a WolfSSL job which tries the latest version - BUILD: systemd: fix usage of reserved name "sun" in the address field - BUILD: init: use the more portable FD_CLOEXEC for /dev/null - CI: github: improve the Wolfssl job - CI: github: improve the AWS-LC job - BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes - BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return - MINOR: mux-quic: use sched call time for pacing - CI: github: allow to run the Illumos job manually - BUILD: tcp_sample: var_fc_counter defined but not used - CI: github: add 'workflow_dispatch' on remaining build jobs - DOC: config: refine a little bit the text on QUIC pacing - MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros - MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure - REORG: startup: move on_new_child_failure in mworker.c - MINOR: startup: prefix prepare_master and run_master with mworker_* - REORG: startup: move mworker_prepare_master in mworker.c - MINOR: startup: keep updating verbosity modes only in haproxy.c - REORG: startup: move mworker_run_master and mworker_loop in mworker.c - REORG: startup: move mworker_reexec and mworker_reload in mworker.c - MINOR: startup: prefix apply_master_worker_mode with mworker_* - REORG: startup: move mworker_apply_master_worker_mode in mworker.c - MINOR: cfgparse-quic: strengthen quic-cc-algo parsing - BUG/MAJOR: quic: fix wrong 
packet building due to already acked frames - DEV: lags/show-sess-to-flags: Properly handle fd state on server side - BUG/MEDIUM: http-ana: Don't release too early the L7 buffer - MINOR: quic: make bbr consider the max window size setting - DOC: quic: Amend the pacing information about BBR. - BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize - MINOR: cli: Add a "help" keyword to show sess - MINOR: cli/quic: Add a "help" keyword to show quic - DOC: management: mention "show sess help" and "show quic help" - DOC: install: update the list of supported versions - MINOR: version: mention that 3.1 is stable now v3.1.0	2024-11-26 15:24:10 +01:00
Christopher Faulet	b629f366a7	MINOR: version: mention that 3.1 is stable now This version will be maintained up to around Q1 2026. The INSTALL file also mentions it.	2024-11-26 15:23:54 +01:00
Willy Tarreau	0a406054c7	DOC: install: update the list of supported versions OpenSSL up to 3.4 was tested, and gcc up to 14 was tested, so let's reflect this in the install doc.	2024-11-26 15:23:54 +01:00
Willy Tarreau	16022c2a7b	DOC: management: mention "show sess help" and "show quic help" These ones were recently added but we forgot to update the doc.	2024-11-26 15:00:51 +01:00
Olivier Houchard	4f973ab23a	MINOR: cli/quic: Add a "help" keyword to show quic Add a help keyword to show quic, that will provide a longer explanation of all the available options than what is provided by the command "help".	2024-11-26 14:55:30 +01:00
Olivier Houchard	5288d0f47b	MINOR: cli: Add a "help" keyword to show sess Add a help keyword to show sess, that will provide a longer explanation of all the available options than what is provided by the command "help".	2024-11-26 14:55:30 +01:00
Amaury Denoyelle	2fffd85b97	BUG/MEDIUM: quic: prevent EMSGSIZE with GSO for larger bufsize A UDP datagram cannot be greater than 65535 bytes, as UDP length header field is encoded on 2 bytes. As such, sendmsg() will reject a bigger input with error EMSGSIZE. By default, this does not cause any issue as QUIC datagrams are limited to 1252 bytes and sent individually. However, with GSO support, values bigger than 1252 bytes are specified on sendmsg(). If using a bufsize equal to or greater than 65535, syscall could reject the input buffer with EMSGSIZE. As this value is not expected, the connection is immediately closed by haproxy and the transfer is interrupted. This bug can easily be reproduced by requesting a large object on loopback interface and using a bufsize of 65535 bytes. In fact, the limit is slightly less than 65535, as extra room is also needed for IP + UDP headers. Fix this by reducing the count of datagrams encoded in a single GSO invocation via qc_prep_pkts(). Previously, it was set to 64 as specified by man 7 udp. However, with 1252-byte datagrams, this is still too many. Reduce it to a value of 52. Input to sendmsg will thus be restricted to at most 65104 bytes if last datagram is full. If there is still data available for encoding in qc_prep_pkts(), they will be written in a separate batch of datagrams. qc_send_ppkts() will then loop over the whole QUIC Tx buffer and call sendmsg() for each series of at most 52 datagrams. This does not need to be backported.	2024-11-26 11:49:30 +01:00
Frederic Lecaille	3cee8d7830	DOC: quic: Amend the pacing information about BBR. BBR handles itself its own burst size (mentioned as send_quantum in BBR RFC).	2024-11-26 08:00:58 +01:00
Frederic Lecaille	a3248a39eb	MINOR: quic: make bbr consider the max window size setting Limit the BBR congestion control window size as this is done for all the others congestion control algorithms with tune.quic.frontend.default-max-window-size or as first argument passed to "bbr" option for "quic-cc-algo".	2024-11-26 08:00:58 +01:00
Christopher Faulet	dc15581c02	BUG/MEDIUM: http-ana: Don't release too early the L7 buffer In some cases, the buffer used to store the request to be able to perform a L7 retry is released too early, leading to a crash because a retry is performed with an empty request. First, there is a test on invalid 101 responses that may be caught by the "junk-response" retry policy. Then, it is possible to get an error (empty-response, bad status code...) after an interim response. In both cases, the L7 buffer is already released while it should not. To fix the issue, the L7 buffer is now released at the end of the AN_RES_WAIT_HTTP analyser, but only when a response was successfully received and processed. In all error cases, the stream is quickly released, with the L7 buffer. So there is no leak and it is safer this way. This patch may fix the issue #2793. It must be backported as far as 2.4.	2024-11-25 22:18:19 +01:00
Christopher Faulet	ceb80aed57	DEV: lags/show-sess-to-flags: Properly handle fd state on server side It must be handled as an hexadecimal value.	2024-11-25 21:57:30 +01:00
Frederic Lecaille	96b2641fc8	BUG/MAJOR: quic: fix wrong packet building due to already acked frames If a packet build was asked to probe the peer with frames which have just been acked, the frames build run by qc_build_frms() could be cancelled by qc_stream_frm_is_acked() whose aim is to check that current frames to be built have not been already acknowledged. In this case the packet build run by qc_do_build_pkt() is not interrupted, leading to the build of an empty packet which should be ack-eliciting. This is a bug detected by the BUG_ON() statement in qc_do_build_pkt(): BUG_ON(qel->pktns->tx.pto_probe && !(pkt->flags & QUIC_FL_TX_PACKET_ACK_ELICITING)); Thank you to @Tristan971 for having reported this issue in GH #2709. This is an old bug which must be backported as far as 2.6.	2024-11-25 18:55:45 +01:00
Amaury Denoyelle	d41273c633	MINOR: cfgparse-quic: strengthen quic-cc-algo parsing quic-cc-algo is a bind keyword which is used to specify the congestion control algorithm. It is parsed via function bind_parse_quic_cc_algo(). The parsing function was too lax as it used strncmp for algo token matching. This could cause surprise if specifying an invalid algorithm but starting identically to another entry. Especially if extra parameters are specified in parentheses, as in this case parameter values will be completely ignored and default values used instead. To fix this, convert algo argument to ist. Then, use istsplit() to extract algo token from the optional extra arguments and compare the whole value with isteq().	2024-11-25 16:19:54 +01:00
Valentine Krasnobaeva	3500865bc1	REORG: startup: move mworker_apply_master_worker_mode in mworker.c mworker_apply_master_worker_mode() is called only in master-worker mode, so let's move it to mworker.c	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	3899a7ecaa	MINOR: startup: prefix apply_master_worker_mode with mworker_* This patch prepares the move of apply_master_worker_mode in mworker.c. So, let's at first rename it to mworker_apply_master_worker_mode.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	dee247c14e	REORG: startup: move mworker_reexec and mworker_reload in mworker.c Let's move mworker_reexec() and mworker_reload() in mworker.c. mworker_reload() is called only within the functions, which are already in mworker.c. So, this reorganization allows to declare mworker_reload() as a static.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	0c7b93eb1d	REORG: startup: move mworker_run_master and mworker_loop in mworker.c mworker_run_master() is called only in master mode. mworker_loop() is static and called only in mworker_run_master(). So let's move both of these functions in mworker.c. We also need here to make run_thread_poll_loop() accessible from other units, as it's used in mworker_loop().	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	56894db000	MINOR: startup: keep updating verbosity modes only in haproxy.c This commit prepares the move of mworker_run_master() in mworker.c. Let's remove from its definition the code which adjusts verbosity depending on other global run time modes (daemon or foreground). This part should stay in main(), where all verbosity modes are handled for different mode combinations.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	7974089ac6	REORG: startup: move mworker_prepare_master in mworker.c mworker_prepare_master() performs some preparation routines for the new worker process, which will be forked during the startup. It's called only in master-worker mode, so let's move it in mworker.c.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	41cc1fe310	MINOR: startup: prefix prepare_master and run_master with mworker_* This patch prepares the move of prepare_master() and run_master() definitions into mworker.c. So, let's at first prefix their names with mworker_*.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	af642420b4	REORG: startup: move on_new_child_failure in mworker.c mworker_on_new_child_failure() performs some routines for the worker process, if it has failed the reload. As it's called only in mworker_catch_sigchld() from mworker.c, let's move mworker_on_new_child_failure() in mworker.c as well. Like this it could also be declared as a static.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	321c021a83	MINOR: startup: rename on_new_child_failure to mworker_on_new_child_failure This patch prepares the moving of on_new_child_failure definition into mworker.c. So, let's rename it accordingly and let's also update its description.	2024-11-25 15:20:24 +01:00
Valentine Krasnobaeva	10c14a1ed0	MINOR: proto_sockpair: send_fd_uxst: init iobuf, cmsghdr, cmsgbuf to zeros In master-worker mode, worker process uses now send_fd_uxst() to send '_send_status' command to master. Since refactoring, this started to trigger the following Valgrind reports: ==810584== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s) ==810584== at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28) ==810584== by 0x4AAC99D: sendmsg (sendmsg.c:25) ==810584== by 0x56350F: send_fd_uxst (proto_sockpair.c:271) ==810584== by 0x3AA25C: main (haproxy.c:4151) ==810584== Address 0x1ffefffbfe is on thread 1's stack ==810584== in frame #1, created by send_fd_uxst (proto_sockpair.c:241) ==810584== ==810584== Syscall param sendmsg(msg.msg_control) points to uninitialised byte(s) ==810584== at 0x4AAC99D: __libc_sendmsg (sendmsg.c:28) ==810584== by 0x4AAC99D: sendmsg (sendmsg.c:25) ==810584== by 0x56350F: send_fd_uxst (proto_sockpair.c:271) ==810584== by 0x3AA25C: main (haproxy.c:4151) ==810584== Address 0x1ffefffc14 is on thread 1's stack ==810584== in frame #1, created by send_fd_uxst (proto_sockpair.c:241) ==810584== So, let's initialize with zeros all buffers, which are passed to sendmsg syscall(), used in send_fd_uxst() to avoid these Valgrind messages. They increase Valgrind output and could make unnoticeable some other, more important reports.	2024-11-25 15:20:24 +01:00
Willy Tarreau	7fb98e833c	DOC: config: refine a little bit the text on QUIC pacing The QUIC pacing options changed a few times during their development. For example the unit is now in datagrams not bytes. Also a few sentences were slightly ambiguous so let's reword this. No backport is needed.	2024-11-25 14:54:16 +01:00
William Lallemand	dee3f4b3ff	CI: github: add 'workflow_dispatch' on remaining build jobs Add 'workflow_dispatch' on the remaining scheduled build jobs that do not have it. This keyword allows to start manually a job from the "Actions" interface in github.	2024-11-25 14:03:13 +01:00
William Lallemand	da1331b0b5	BUILD: tcp_sample: var_fc_counter defined but not used var_fc_counter is not used on Illumos and emits a warning: src/tcp_sample.c:291:12: warning: ‘var_fc_counter’ defined but not used [-Wunused-function] 291 | static int var_fc_counter(struct arg args, char *err) | ^~~~~~~~~~~~~~ Let's add an ifdef to build it.	2024-11-25 11:41:26 +01:00
William Lallemand	079193e375	CI: github: allow to run the Illumos job manually Add the "workflow_dispatch" option to the Illumos CI so it can be run manually from the github actions page.	2024-11-25 11:30:55 +01:00
Amaury Denoyelle	22bd92a87f	MINOR: mux-quic: use sched call time for pacing QUIC pacing was recently implemented to limit burst and improve overall bandwidth. This is used only for MUX STREAM emission. Pacing requires nanosecond resolution. As such, it used now_cpu_time() which relies on clock_gettime() syscall. The usage of clock_gettime() has several drawbacks : * it is a syscall and thus requires a context-switch which may hurt performance * it is not available on all systems * timestamp is retrieved multiple times during a single task execution, thus yielding different values which may tamper pacing calculation Improve this by using task_mono_time() instead. This returns task call time from the scheduler thread context. It requires the flag TASK_F_WANTS_TIME on QUIC MUX tasklet to force the scheduler to update call time with now_mono_time(). This solves all the limitations listed above : * syscall invocation is only performed once before tasklet execution, thus reducing context-switch impact * on non compatible system, a millisecond timer is used as a fallback which should ensure that pacing works decently for them * timer value is now guaranteed to be fixed during task execution	2024-11-25 11:21:45 +01:00
Amaury Denoyelle	044452546e	BUG/MEDIUM: quic: fix sending performance due to qc_prep_pkts() return qc_prep_pkts() is a QUIC transport level function which encodes one or several datagrams in a buffer before sending them. It returns the number of encoded datagrams. This is especially important when pacing is used to limit packet bursts. This datagram accounting was not trivial as qc_prep_pkts() used several code paths depending on the condition of the current encoded packet. Thus, there were several places where the local variable dgram_cnt could have been incremented. This was implemented by the following commit : commit 5cb8f8a6224db96f4386277c41ddae4a29a4130d MINOR: quic: support a max number of built packet per send iteration However, there is a bug due to a missing increment when all frames from the current QEL have been encoded. In this case, the encoding continues in the same datagram to coalesce a future packet. However, if this is the last QEL, encoding loop will then break. As first_pkt is not NULL, qc_txb_store() is called outside but dgram_cnt is not yet incremented. In particular, this causes qc_prep_pkts() to return 0 when there are only small STREAM frames to emit for application QEL. In qc_send(), this is interpreted as a value which prevents further emission for the current invocation. Thus, it may hurt performance, both without and with pacing. To fix this, remove the multiple dgram_cnt increments. Now, it is modified only in a single place which should cover every case, and render the code easier to validate. The most notable case where the bug is visible is when using cubic with pacing without any burst, with quic-cc-algo cubic(,1). First, transfer bandwidth on average was suboptimal, with significant variation. Worse, it could sometimes fall dramatically for a particular stream without recovering before returning to an expected level on the next one. No need to backport.	2024-11-25 11:21:28 +01:00
Amaury Denoyelle	3704e0e174	BUG/MINOR: mux-quic: fix show quic report of QCS prepared bytes On show quic, each MUX stream is listed with its various indicators for buffering on Rx and Tx. In particular, txoff displays in parentheses the current level of data prepared by the upper stream instance not yet emitted by QUIC transport layer. This value is only accessible after a subtract operation. However, there was a typo which caused the result to be always 0. Fix this by reusing the correct offsets in the calculation. This should be backported up to 3.0.	2024-11-25 11:21:28 +01:00
William Lallemand	a7e5180c71	CI: github: improve the AWS-LC job Like the WolfSSL job, improve the AWS-LC job by adding the socat command so all SSL reg-tests can be run. Also add gdb and output of corefiles.	2024-11-25 11:14:33 +01:00
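
The GSO batch limit described in commit 2fffd85b97 can be checked with a little arithmetic. The Python sketch below is only an illustration: the 1252-byte datagram size and the datagram counts of 64 and 52 come from that commit message, while the 20-byte IPv4 header, the 8-byte UDP header and every function name are assumptions introduced here, not taken from the haproxy sources.

```python
# Values quoted in the commit message.
QUIC_DGRAM_SIZE = 1252   # bytes per QUIC datagram
UDP_MAX_LEN = 65535      # the UDP length field is encoded on 2 bytes

# Assumptions for illustration only (not from the commit message).
IPV4_HEADER = 20         # IPv4 header without options
UDP_HEADER = 8


def gso_batch_bytes(dgram_count, dgram_size=QUIC_DGRAM_SIZE):
    """Total bytes handed to one sendmsg() call if every datagram is full."""
    return dgram_count * dgram_size


def max_gso_payload():
    """Largest payload that leaves room for the assumed IP + UDP headers."""
    return UDP_MAX_LEN - IPV4_HEADER - UDP_HEADER


def fits_in_one_sendmsg(dgram_count):
    """True if a full batch of dgram_count datagrams stays under the limit."""
    return gso_batch_bytes(dgram_count) <= max_gso_payload()


def max_dgrams_per_batch():
    """Largest number of full datagrams that fits in a single batch."""
    return max_gso_payload() // QUIC_DGRAM_SIZE


if __name__ == "__main__":
    for count in (64, 52):
        print(count, gso_batch_bytes(count), fits_in_one_sendmsg(count))
    print("largest batch:", max_dgrams_per_batch())
```

Under these assumptions a batch of 64 full datagrams would need 80128 bytes and be rejected, while 52 datagrams need 65104 bytes, which matches the figure given in the commit message; 52 is also the largest count that fits.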

23562 Commits