1berry/ruby - ruby - Gitea : Git Mirror

Author	SHA1	Message	Date
Nobuyoshi Nakada	2e3f81838c	Align styles [ci skip]	2025-05-15 17:48:40 +09:00
Hiroya Fujinami	18f8c514ea	Fix memoization for the `/(...){0}/` case (#13169) In this case, the previous implementation counted extra opcodes to cache, which made matching unstable under memoization. This patch fixes the problem by not counting opcodes to cache inside the parentheses of `(...){0}`.	2025-04-24 12:03:24 +00:00
Daniel Colson	29b26fd3e7	Fix macro for disabled match cache The `MEMOIZE_LOOKAROUND_MATCH_CACHE_POINT` macro needs an argument otherwise we end up with: ``` ../regexec.c:3955:2: error: called object type 'void' is not a function or function pointer 3955 | STACK_POS_END(stkp); | ^~~~~~~~~~~~~~~~~~~ ../regexec.c:1680:41: note: expanded from macro 'STACK_POS_END' 1680 | MEMOIZE_LOOKAROUND_MATCH_CACHE_POINT(k);\ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ ../regexec.c:3969:7: error: called object type 'void' is not a function or function pointer 3969 | STACK_POP_TIL_POS_NOT; | ^~~~~~~~~~~~~~~~~~~~~ ../regexec.c:1616:41: note: expanded from macro 'STACK_POP_TIL_POS_NOT' 1616 | MEMOIZE_LOOKAROUND_MATCH_CACHE_POINT(stk);\ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ ``` The macro definition with the match cache enabled already has the correct argument. This one is for when the match cache is disabled (I had disabled it while trying to learn more about how it works.)	2025-04-13 11:44:49 +09:00
John Hawthorn	8409edc497	Fix regex timeout double-free after stack_double As of 10574857ce167869524b97ee862b610928f6272f, it's possible to crash on a double free due to `stk_alloc` AKA `msa->stack_p` being freed twice, once at the end of match_at and a second time in `FREE_MATCH_ARG` in the parent caller. Fixes [Bug #20886]	2024-11-11 23:33:21 -08:00
kojix2	550ac2f2ed	[DOC] Fix typos	2024-10-31 12:44:50 +09:00
Nobuyoshi Nakada	c94ea1cccb	Fix size modifier for `size_t`	2024-09-25 10:40:14 +09:00
Peter Zhu	7464514ca5	Fix memory leak in String#start_with? when regexp times out [Bug #20653] This commit refactors how Onigmo handles timeout. Instead of raising a timeout error, onig_search now returns ONIGERR_TIMEOUT, so the caller can free memory and then raise the timeout error. This fixes a memory leak in String#start_with? when the regexp times out. For example: regex = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001) str = "a" * 1000000 + "x" 10.times do 100.times do str.start_with?(regex) rescue end puts `ps -o rss= -p #{$$}` end Before: 33216 51936 71152 81728 97152 103248 120384 133392 133520 133616 After: 14912 15376 15824 15824 16128 16128 16144 16144 16160 16160	2024-07-26 08:42:38 -04:00
Peter Zhu	10574857ce	Fix memory leak in Regexp capture group when timeout [Bug #20650] The capture group allocates memory that is leaked when it times out. For example: re = Regexp.new("^#{"(a*)" * 10_000}x$", timeout: 0.000001) str = "a" * 1000000 + "x" 10.times do 100.times do re =~ str rescue Regexp::TimeoutError end puts `ps -o rss= -p #{$$}` end Before: 34688 56416 78288 100368 120784 140704 161904 183568 204320 224800 After: 16288 16288 16880 16896 16912 16928 16944 17184 17184 17200	2024-07-25 09:23:49 -04:00
Daniel Colson	d292a9b98c	[Bug #20453] segfault in Regexp timeout https://bugs.ruby-lang.org/issues/20228 started freeing `stk_base` to avoid a memory leak. But `stk_base` is sometimes stack allocated (using `xalloca`), so the free only works if the regex stack has grown enough to hit `stack_double` (which uses `xmalloc` and `xrealloc`). To reproduce the problem on master and 3.3.1: ```ruby Regexp.timeout = 0.001 /^(a*)*x$/ =~ "a" * 1000000 + "x" ``` Some details about this potential fix: `stk_base == stk_alloc` on [init](`dde99215f2/regexec.c (L1153)`), so if `stk_base != stk_alloc` we can be sure we called [`stack_double`](`dde99215f2/regexec.c (L1210)`) and it's safe to free. It's also safe to free if we've [saved](`dde99215f2/regexec.c (L1187-L1189)`) the stack to `msa->stack_p`, since we do the `stk_base != stk_alloc` check before saving. This matches the check we do inside [`stack_double`](`dde99215f2/regexec.c (L1221)`)	2024-04-25 10:28:18 -04:00
Hiroshi SHIBATA	989a235580	Fix Use-After-Free issue for Regexp Co-authored-by: Isaac Peka <7493006+isaac-peka@users.noreply.github.com>	2024-04-23 19:16:08 +09:00
Isaac Peka	33e5b47c16	Fix handling of reg->dmin in Regex matching	2024-04-23 19:16:05 +09:00
Nobuyoshi Nakada	3a04ea2d03	[Bug #20305] Fix matching against an incomplete character When matching against an incomplete character, some `enclen` calls are expected not to exceed the limit, while others are expected to return the required length, which is then checked against the limit.	2024-02-27 13:58:03 +09:00
Nobuyoshi Nakada	75aaeb35b8	[Bug #20239] Fix overflow at down-casting	2024-02-07 15:14:26 +09:00
Peter Zhu	1c120efe02	Fix memory leak in stk_base when Regexp timeout [Bug #20228] If rb_reg_check_timeout raises a Regexp::TimeoutError, then the stk_base will leak.	2024-02-02 10:39:42 -05:00
Hiroya Fujinami	3e6e3ca262	Correctly handle consecutive lookarounds (#9738) Fix [Bug #20207] Fix [Bug #20212] Handling consecutive lookarounds in init_cache_opcodes is buggy, so it causes invalid memory access reported in [Bug #20207] and [Bug #20212]. This fixes it by using recursive functions to detect lookaround nesting correctly.	2024-01-29 23:51:26 +09:00
Hiroya Fujinami	597955aae8	Fix to work match cache with peek next optimization (#9459)	2024-01-10 11:22:23 +09:00
Hiroya Fujinami	2571d5376a	Reduce `if` for decreasing counter on OP_REPEAT_INC (#9393) This commit also reduces the warning `'stkp' may be used uninitialized in this function`.	2023-12-30 01:08:51 +09:00
Hiroya Fujinami	bb59696614	Fix [Bug #20098]: set counter value for {n,m} repetition correctly (#9391)	2023-12-29 19:30:24 +09:00
Hiroya Fujinami	d8702ddbfb	Fix [Bug #20083]: correct a cache point size for atomic groups (#9367)	2023-12-28 23:20:03 +09:00
Alan Wu	9786b909f9	Fix regex match cache out-of-bounds access Previously the following read and wrote 1 byte out-of-bounds: $ valgrind ruby -e 'p /(\W+)[bx]\?/i.match? "aaaaaa aaaaaaaaa aaaa aaaaaaaa aaa aaaaxaaaaaaaaaaa aaaaa aaaaaaaaaaaa a ? aaa aaaa a ?"' 2> >(grep Invalid -A 30) Because of the `match_cache_point_index + 1` in memoize_extended_match_cache_point() and check_extended_match_cache_point(), we need one more byte of space.	2023-11-16 10:23:15 +01:00
Hiroya Fujinami	34cb174800	Optimize regexp matching for look-around and atomic groups (#7931)	2023-10-30 13:10:42 +09:00
Peter Zhu	7193b404a1	Add function rb_reg_onig_match rb_reg_onig_match performs preparation, error handling, and cleanup for matching a regex against a string. This reduces repetitive code and removes the need for StringScanner to access internal data of regex.	2023-07-27 13:33:40 -04:00
Peter Zhu	58386814a7	Don't check for null pointer in calls to free According to the C99 specification section 7.20.3.2 paragraph 2: > If ptr is a null pointer, no action occurs. So we do not need to check that the pointer is a null pointer.	2023-06-30 09:13:31 -04:00
TSUYUSATO Kitsune	a5819b5b25	Allow the match cache optimization for atomic groups (#7804)	2023-05-22 11:27:34 +09:00
TSUYUSATO Kitsune	93dd13d97a	Remove warnings and errors in `regexec.c` with `ONIG_DEBUG_...` macros (#7803)	2023-05-13 10:04:28 +09:00
TSUYUSATO Kitsune	ac730d3e75	Delay start of the match cache optimization (#7738)	2023-05-04 13:15:51 +09:00
TSUYUSATO Kitsune	a1c2c274ee	Refactor `Regexp#match` cache implementation (#7724) * Refactor Regexp#match cache implementation Improved variable and function names Fixed [Bug 19537] (Maybe fixed in https://github.com/ruby/ruby/pull/7694) * Add a comment of the glossary for "match cache" * Skip to reset match cache when no cache point on null check	2023-04-19 13:08:28 +09:00
Nobuyoshi Nakada	fac814c2dc	Fix `PLATFORM_GET_INC` On platforms where unaligned word access is not allowed, and if `sizeof(val)` and `sizeof(type)` differ: - `val` > `type`, `val` will be garbage. - `val` < `type`, outside `val` will be clobbered.	2023-04-16 17:45:27 +09:00
Nobuyoshi Nakada	0ac3f2c20e	[Bug #19587] Fix `reset_match_cache` arguments	2023-04-12 18:35:32 +09:00
Nobuyoshi Nakada	1b697d7cb5	Constify	2023-04-12 18:35:32 +09:00
Nobuyoshi Nakada	2e1a95b569	Extract `bsearch_cache_index` function	2023-04-12 18:35:32 +09:00
TSUYUSATO Kitsune	dddc542e9b	[Bug #19476]: correct cache index computation for repetition (#7457)	2023-03-13 18:31:13 +09:00
TSUYUSATO Kitsune	e22c4e8877	[Bug #19467] correct cache points and counting failure on `OP_ANYCHAR_STAR_PEEK_NEXT` (#7454)	2023-03-13 15:46:41 +09:00
TSUYUSATO Kitsune	b726d60c98	Fix [Bug 19273], set correct value to `outer_repeat` on `OP_REPEAT` (#7035)	2022-12-28 20:03:25 +09:00
Nobuyoshi Nakada	43f4093a31	Adjust style [ci skip]	2022-12-22 15:12:05 +09:00
TSUYUSATO Kitsune	fbedadb61f	Add `Regexp.linear_time?` (#6901)	2022-12-14 12:57:14 +09:00
Yusuke Endoh	b8e542b463	Make absent operator work at the end of the input string https://bugs.ruby-lang.org/issues/19104#change-100542	2022-12-12 14:26:38 +09:00
TSUYUSATO Kitsune	189e3c0ada	Add default cases for cache point finding function	2022-11-17 23:19:17 +09:00
TSUYUSATO Kitsune	90bfac296e	Add OP_CCLASS_MB case	2022-11-17 23:19:17 +09:00
TSUYUSATO Kitsune	1dc4128e92	Reduce warnings	2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune	36ff0521c1	Use long instead of int	2022-11-09 23:21:26 +09:00
Yusuke Endoh	d868f4ca31	Check for integer overflow in the allocation of match_cache table	2022-11-09 23:21:26 +09:00
Yusuke Endoh	14845ab4ff	Ensure that the table size for CACHE_MATCH fits with int Currently, the keys for CACHE_MATCH are handled as an `int` type. So we should make sure the table size is smaller than the range of `int`.	2022-11-09 23:21:26 +09:00
Yusuke Endoh	537286d0bb	Prevent GCC warnings ``` regexec.c: In function ‘reset_match_cache’: regexec.c:1259:56: warning: suggest parentheses around ‘-’ inside ‘<<’ [-Wparentheses] 1259 | match_cache[k1 >> 3] &= ((1 << (8 - (k2 & 7) - 1)) - 1 << ((k2 & 7) + 1)) | ((1 << (k1 & 7)) - 1); | ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ regexec.c:1269:60: warning: suggest parentheses around ‘-’ inside ‘<<’ [-Wparentheses] 1269 | match_cache[k2 >> 3] &= ((1 << (8 - (k2 & 7) - 1)) - 1 << ((k2 & 7) + 1)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ regexec.c: In function ‘find_cache_index_table’: regexec.c:1192:11: warning: ‘m’ may be used uninitialized [-Wmaybe-uninitialized] 1192 | if (!(0 <= m && m < num_cache_table && table[m].addr == p)) { | ~~^~~~ regexec.c: In function ‘match_at’: regexec.c:1238:12: warning: ‘m1’ is used uninitialized [-Wuninitialized] 1238 | if (table[m1].addr < pbegin && m1 + 1 < num_cache_table) m1++; | ^ regexec.c:1218:39: note: ‘m1’ was declared here 1218 | int l = 0, r = num_cache_table - 1, m1, m2; | ^~ regexec.c:1239:12: warning: ‘m2’ is used uninitialized [-Wuninitialized] 1239 | if (table[m2].addr > pend && m2 - 1 > 0) m2--; | ^ regexec.c:1218:43: note: ‘m2’ was declared here 1218 | int l = 0, r = num_cache_table - 1, m1, m2; | ^~ ```	2022-11-09 23:21:26 +09:00
Yusuke Endoh	ff5dba8319	Return ONIGERR_MEMORY if it fails to allocate memory for cache_match_opt	2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune	a1c1fc558a	Revert "Refactor field names" This reverts commit 1e6673d6bbd2adbf555d82c7c0906ceb148ed6ee.	2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune	22294731a8	Refactor field names	2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune	ff2998a86c	Remove debug printf	2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune	37613fea16	Clear cache on OP_NULL_CHECK_END_MEMST	2022-11-09 23:21:26 +09:00
TSUYUSATO Kitsune	f25bb291b4	Support OP_REPEAT and OP_REPEAT_INC	2022-11-09 23:21:26 +09:00
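
Several of the commits above index the match cache as `match_cache[k >> 3]` with a bit offset of `k & 7`, and one of them notes that an off-by-one in the index needs an extra byte of space. The following is a minimal Ruby sketch of that kind of bit-packed table. It is a simplified model, not Onigmo's code: the class name `MatchCacheModel`, its methods, and the index formula `num_cache_points * pos + cache_point` are assumptions made for illustration.

```ruby
# Simplified model of a bit-packed match cache (hypothetical; not Onigmo's code).
# Each (string position, cache point) pair gets one bit in a byte array.
class MatchCacheModel
  attr_reader :bytes

  def initialize(num_cache_points, str_len)
    @num_cache_points = num_cache_points
    # Positions run from 0 to str_len inclusive, hence str_len + 1.
    num_bits = num_cache_points * (str_len + 1)
    # Round up to whole bytes so the highest index still has a byte to live in.
    @bytes = Array.new((num_bits + 7) / 8, 0)
  end

  # Assumed index layout: all cache points for position 0, then position 1, ...
  def index(pos, cache_point)
    @num_cache_points * pos + cache_point
  end

  # Record that matching from this state has already been tried.
  def memoize(pos, cache_point)
    k = index(pos, cache_point)
    @bytes[k >> 3] |= 1 << (k & 7)
  end

  # Check whether this state was recorded earlier.
  def memoized?(pos, cache_point)
    k = index(pos, cache_point)
    (@bytes[k >> 3] & (1 << (k & 7))) != 0
  end
end

cache = MatchCacheModel.new(3, 10)
cache.memoize(4, 2)
puts cache.memoized?(4, 2) # true
puts cache.memoized?(4, 1) # false
puts cache.bytes.size      # 5 (33 bits rounded up to whole bytes)
```

In this model, any code that also touches index `k + 1` would need the table to be sized for one more bit, which is the kind of sizing issue the out-of-bounds fix describes.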


143 Commits
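
The history above includes the addition of `Regexp.linear_time?`, which reports whether a pattern can be matched in time linear in the input length. A short sketch of how it might be queried follows. The helper name `linear_time_report` is hypothetical, and the `respond_to?` guard is an assumption so that the snippet also runs on Ruby versions older than 3.2, where the method does not exist.

```ruby
# Hypothetical helper: map each pattern's source to its linear_time? result.
# Returns an empty hash on Rubies that lack Regexp.linear_time? (before 3.2).
def linear_time_report(patterns)
  return {} unless Regexp.respond_to?(:linear_time?)

  patterns.to_h { |re| [re.source, Regexp.linear_time?(re)] }
end

# A plain repetition versus a pattern with a backreference.
report = linear_time_report([/a+b/, /(a)\1/])
report.each { |source, linear| puts "#{source}: #{linear}" }
```

On Ruby 3.2 or later, the backreference pattern is reported as not linear, while the simple repetition is reported as linear.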