1523 Commits

Author SHA1 Message Date
Victor Stinner
4767a6e31c
gh-122728: Fix SystemError in PyEval_GetLocals() (#122735)
Fix PyEval_GetLocals() to avoid SystemError ("bad argument to
internal function"). Don't redefine the 'ret' variable in the if
block.

Add an unit test on PyEval_GetLocals().
2024-08-06 23:01:44 +02:00
Mark Shannon
7aca84e557
GH-117224: Move the body of a few large-ish micro-ops into helper functions (GH-122601) 2024-08-02 16:31:17 +01:00
Brandt Bucher
15d4cd0967
GH-116090: Fire RAISE events from _FOR_ITER_TIER_TWO (GH-122413) 2024-07-29 12:17:47 -07:00
Mark Shannon
afb0aa6ed2
GH-121131: Clean up and fix some instrumented instructions. (GH-121132)
* Add support for 'prev_instr' to code generator and refactor some INSTRUMENTED instructions
2024-07-26 12:24:12 +01:00
Brandt Bucher
7b36b67b1e
GH-118093: Add tier two support to several instructions (GH-121884) 2024-07-18 14:24:58 -07:00
Mark Shannon
169324c27a
GH-120024: Use pointer for stack pointer (GH-121923) 2024-07-18 12:47:21 +01:00
Tian Gao
e65cb4c6f0
gh-118934: Make PyEval_GetLocals return borrowed reference (#119769)
Co-authored-by: Alyssa Coghlan <ncoghlan@gmail.com>
2024-07-16 12:17:47 -07:00
Michael Droettboom
d69529d31c
gh-121338: Remove #pragma optimize (#121340) 2024-07-08 08:48:42 -04:00
Sam Gross
8e8d202f55
gh-117139: Add _PyTuple_FromStackRefSteal and use it (#121244)
Avoids the extra conversion from stack refs to PyObjects.
2024-07-02 12:30:14 -04:00
Brandt Bucher
33903c53db
GH-116017: Get rid of _COLD_EXITs (GH-120960) 2024-07-01 13:17:40 -07:00
Ken Jin
e6543daf12
gh-117139: Fix a few wrong steals in bytecodes.c (GH-121127)
Fix a few wrong steals in bytecodes.c
2024-06-29 02:14:48 +08:00
Ken Jin
22b0de2755
gh-117139: Convert the evaluation stack to stack refs (#118450)
This PR sets up tagged pointers for CPython.

The general idea is to create a separate struct _PyStackRef for everything on the evaluation stack to store the bits. This forces the C compiler to warn us if we try to cast things or pull things out of the struct directly.

Only for free threading: We tag the low bit if something is deferred - that means we skip incref and decref operations on it. This behavior may change in the future if Mark's plans to defer all objects in the interpreter loop pans out.

This implies a strict stack reference discipline is required. ALL incref and decref operations on stackrefs must use the stackref variants. It is unsafe to untag something then do normal incref/decref ops on it.

The new incref and decref variants are called dup and close. They mimic a "handle" API operating on these stackrefs.

Please read Include/internal/pycore_stackref.h for more information!

---------

Co-authored-by: Mark Shannon <9448417+markshannon@users.noreply.github.com>
2024-06-27 03:10:43 +08:00
Irit Katriel
65a12c559c
gh-120834: fix type of *_iframe field in _PyGenObject_HEAD declaration (#120835) 2024-06-24 10:23:38 +01:00
Mark Shannon
9cefcc0ee7
GH-120507: Lower the BEFORE_WITH and BEFORE_ASYNC_WITH instructions. (#120640)
* Remove BEFORE_WITH and BEFORE_ASYNC_WITH instructions.

* Add LOAD_SPECIAL instruction

* Reimplement `with` and `async with` statements using LOAD_SPECIAL
2024-06-18 12:17:46 +01:00
Xie Yanbo
9e052619a6
Fix typos in documentation and comments (#119763) 2024-06-04 10:22:22 +00:00
Irit Katriel
6e9863d7a3
gh-118692: Avoid creating unnecessary StopIteration instances for monitoring (#119216) 2024-05-21 20:42:51 +00:00
Nikita Sobolev
a8e5fed100
gh-118613: Fix error handling of _PyEval_GetFrameLocals in ceval.c (#118614) 2024-05-06 10:34:56 +03:00
Tian Gao
b034f14a4b
gh-74929: Implement PEP 667 (GH-115153) 2024-05-04 12:12:10 +01:00
Mark Shannon
1ab6356ebe
GH-118095: Use broader specializations of CALL in tier 1, for better tier 2 support of calls. (GH-118322)
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.

* Remove CALL_PY_WITH_DEFAULTS specialization

* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
2024-05-04 12:11:11 +01:00
Tian Gao
9c14ed0618
gh-107674: Improve performance of sys.settrace (GH-117133)
* Check tracing in RESUME_CHECK

* Only change to RESUME_CHECK if not tracing
2024-05-03 19:49:24 +01:00
Guido van Rossum
7d83f7bcc4
gh-118335: Configure Tier 2 interpreter at build time (#118339)
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.

We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.

On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.

In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
2024-04-30 18:26:34 -07:00
Dino Viehland
4a1cf66c5c
gh-117657: Fix small issues with instrumentation and TSAN (#118064)
Small TSAN fixups for instrumentation
2024-04-30 11:38:05 -07:00
Mark Shannon
3e06c7f719
GH-118095: Add dynamic exit support and FOR_ITER_GEN support to tier 2 (GH-118279) 2024-04-26 18:08:50 +01:00
Dino Viehland
07525c9a85
gh-116818: Make sys.settrace, sys.setprofile, and monitoring thread-safe (#116775)
Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe.

Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version.  There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
2024-04-19 14:47:42 -07:00
Guido van Rossum
40f4d641a9
GH-118036: Fix a bug with CALL_STAT_INC (#117933)
We were under-counting calls in `_PyEvalFramePushAndInit`
because the `CALL_STAT_INC` macro was redefined to a no-op
for the Tier 2 interpreter. The fix is not to `#undef` it at all.
This results in ~37% more "Frames pushed" reported
under "Call stats".
2024-04-18 07:59:02 -07:00
Jeff Glass
acf69e09c6
gh-115178: Add Counts of UOp Pairs to pystats (GH-115181) 2024-04-16 14:27:18 +01:00
Michael Droettboom
0edde64a41
GH-117457: Correct pystats uop "miss" counts (GH-117477) 2024-04-04 15:49:18 -07:00
Guido van Rossum
060a96f1a9
gh-116968: Reimplement Tier 2 counters (#117144)
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.

The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
2024-04-04 15:03:27 +00:00
Guido van Rossum
8eda146e87
Fix successor opcode name printing in Tier 2 DEOPT debug message (#117471) 2024-04-02 18:25:48 +00:00
Sam Gross
19c1dd60c5
gh-117323: Make cell thread-safe in free-threaded builds (#117330)
Use critical sections to lock around accesses to cell contents. The critical sections are no-ops in the default (with GIL) build.
2024-03-29 13:35:43 -04:00
Michael Droettboom
26d328b2ba
GH-117121: Add pystats to JIT builds (GH-117346) 2024-03-28 15:23:08 -07:00
Mark Shannon
bf82f77957
GH-116422: Tier2 hot/cold splitting (GH-116813)
Splits the "cold" path, deopts and exits, from the "hot" path, reducing the size of most jitted instructions, at the cost of slower exits.
2024-03-26 09:35:11 +00:00
Bogdan Romanyuk
a8e93d3dca
gh-115756: make PyCode_GetFirstFree an unstable API (GH-115781) 2024-03-19 09:20:38 +00:00
Guido van Rossum
76d0868907
Cleanup tier2 debug output (#116920)
Various tweaks, including a slight refactor of the special cases for `_PUSH_FRAME`/`_POP_FRAME` to show the actual operand emitted.
2024-03-18 11:08:43 -07:00
Tian Gao
7895a61168
gh-116098: Revert "gh-107674: Improve performance of sys.settrace (GH-114986)" (GH-116178)
Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)"

This reverts commit 0a61e237009bf6b833e13ac635299ee063377699.
2024-03-01 07:46:33 +01:00
Brandt Bucher
f0df35eeca
GH-115802: JIT "small" code for Windows (GH-115964) 2024-02-29 08:11:28 -08:00
Tian Gao
0a61e23700
gh-107674: Improve performance of sys.settrace (GH-114986) 2024-02-28 15:21:42 +00:00
Guido van Rossum
142502ea8d
Tier 2 cleanups and tweaks (#115534)
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
2024-02-20 20:24:35 +00:00
Ken Jin
7a8c3ed43a
gh-115735: Fix current executor NULL before _START_EXECUTOR (#115736)
This fixes level 3 or higher lltrace debug output `--with-pydebug` runs.
2024-02-20 18:47:05 +00:00
Brett Simmers
0749244d13
gh-112175: Add eval_breaker to PyThreadState (#115194)
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.

The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
2024-02-20 09:57:48 -05:00
Mark Shannon
626c414995
GH-115457: Support splitting and replication of micro ops. (GH-115558) 2024-02-20 10:50:59 +00:00
Mark Shannon
7b21403ccd
GH-112354: Initial implementation of warm up on exits and trace-stitching (GH-114142) 2024-02-20 09:39:55 +00:00
Brandt Bucher
f6d9e5926b
GH-113464: Add a JIT backend for tier 2 (GH-113465)
Add an option (--enable-experimental-jit for configure-based builds
or --experimental-jit for PCbuild-based ones) to build an
*experimental* just-in-time compiler, based on copy-and-patch (https://fredrikbk.com/publications/copy-and-patch.pdf).

See Tools/jit/README.md for more information on how to install the required build-time tooling.
2024-01-28 18:48:48 -08:00
Brandt Bucher
30e6cbdba2
GH-113860: Get rid of _PyUOpExecutorObject (GH-113954) 2024-01-12 11:58:23 +00:00
Mark Shannon
0ae60b66de
GH-113486: Do not emit spurious PY_UNWIND events for optimized calls to classes. (GH-113680) 2024-01-05 09:45:22 +00:00
Mark Shannon
e96f26083b
GH-111485: Generate instruction and uop metadata (GH-113287) 2023-12-20 14:27:25 +00:00
Mark Shannon
6873555955
GH-112354: Treat _EXIT_TRACE like an unconditional side exit (GH-113104) 2023-12-14 14:26:44 +00:00
Serhiy Storchaka
1161c14e8c
gh-112716: Fix SystemError when __builtins__ is not a dict (GH-112770)
It was raised in two cases:
* in the import statement when looking up __import__
* in pickling some builtin type when looking up built-ins iter, getattr, etc.
2023-12-14 14:24:24 +02:00
Guido van Rossum
5b86644338
A smattering of cleanups in uop debug output and lltrace (#112980)
* Include destination T1 opcode in Error debug message
* Include destination T1 opcode in DEOPT debug message
* Remove obsolete comment from remove_unneeded_uops
* Change lltrace_instruction() to print caller's opcode/oparg
2023-12-11 16:42:30 -08:00
Serhiy Storchaka
8660fb7fd7
gh-112660: Do not clear arbitrary errors on import (GH-112661)
Previously arbitrary errors could be cleared during formatting error
messages for ImportError or AttributeError for modules. Now all
unexpected errors are reported.
2023-12-07 12:19:43 +02:00