42 Commits

Author SHA1 Message Date
Kirill Podoprigora
e4561e0501
gh-115778: Add tierN annotation for instruction definitions (#115815)
This replaces the old `TIER_{ONE,TWO}_ONLY` macros. Note that `specialized` implies `tier1`.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-02-23 17:31:57 +00:00
Brett Simmers
0749244d13
gh-112175: Add eval_breaker to PyThreadState (#115194)
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.

The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
2024-02-20 09:57:48 -05:00
Mark Shannon
7b21403ccd
GH-112354: Initial implementation of warm up on exits and trace-stitching (GH-114142) 2024-02-20 09:39:55 +00:00
Sam Gross
5903190727
gh-115103: Implement delayed memory reclamation (QSBR) (#115180)
This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and
quiescent state based reclamation (QSBR). The API provides a mechanism
for callers to detect when it is safe to free memory that may be
concurrently accessed by readers.
2024-02-16 15:25:19 -05:00
Mark Shannon
2ff072f21f
Delete unused macro (GH-114238) 2024-01-18 15:49:50 +00:00
Mark Shannon
6873555955
GH-112354: Treat _EXIT_TRACE like an unconditional side exit (GH-113104) 2023-12-14 14:26:44 +00:00
Michael Droettboom
dfaa9e060b
gh-113010: Don't decrement deferred in pystats (#113032)
This fixes a recently introduced bug where the deferred count is being unnecessarily decremented to counteract an increment elsewhere that is no longer happening. This caused the values to flip around to "very large" 64-bit numbers.
2023-12-12 21:17:08 +00:00
Guido van Rossum
5b86644338
A smattering of cleanups in uop debug output and lltrace (#112980)
* Include destination T1 opcode in Error debug message
* Include destination T1 opcode in DEOPT debug message
* Remove obsolete comment from remove_unneeded_uops
* Change lltrace_instruction() to print caller's opcode/oparg
2023-12-11 16:42:30 -08:00
Guido van Rossum
8deb8bc2e5
gh-112287: Speed up Tier 2 (uop) interpreter a little (#112286)
This makes the Tier 2 interpreter a little faster.
I calculated by about 3%,
though I hesitate to claim an exact number.

This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.

The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator know when these are used.

For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instructions code block, so it must be in a variable.)

For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
2023-11-20 11:25:32 -08:00
Guido van Rossum
7e135a48d6
gh-111520: Integrate the Tier 2 interpreter in the Tier 1 interpreter (#111428)
- There is no longer a separate Python/executor.c file.
- Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`,
  you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`).
- The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files.
- In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`.
- On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB.
- **Beware!** This changes the env vars to enable uops and their debugging
  to `PYTHON_UOPS` and `PYTHON_LLTRACE`.
2023-11-01 13:13:02 -07:00
Mark Shannon
d27acd4461
GH-111485: Increment next_instr consistently at the start of the instruction. (GH-111486) 2023-10-31 10:09:54 +00:00
Irit Katriel
67a91f78e4
gh-109094: replace frame->prev_instr by frame->instr_ptr (#109095) 2023-10-26 13:43:10 +00:00
Mark Shannon
19b7ead5eb
GH-109214: Convert _SAVE_CURRENT_IP to _SET_IP in tier 2 trace creation. (GH-110755) 2023-10-12 10:34:32 +01:00
Mark Shannon
bf4bc36069
GH-109369: Merge all eval-breaker flags and monitoring version into one word. (GH-109846) 2023-10-04 16:09:48 +01:00
Brandt Bucher
22e65eecaa
GH-105848: Replace KW_NAMES + CALL with LOAD_CONST + CALL_KW (GH-109300) 2023-09-13 10:25:45 -07:00
Mark Shannon
0858328ca2
GH-108614: Add RESUME_CHECK instruction (GH-108630) 2023-09-07 14:39:03 +01:00
Victor Stinner
a0773b89df
gh-108753: Enhance pystats (#108754)
Statistics gathering is now off by default. Use the "-X pystats"
command line option or set the new PYTHONSTATS environment variable
to 1 to turn statistics gathering on at Python startup.

Statistics are no longer dumped at exit if statistics gathering was
off or statistics have been cleared.

Changes:

* Add PYTHONSTATS environment variable.
* sys._stats_dump() now returns False if statistics are not dumped
  because they are all equal to zero.
* Add PyConfig._pystats member.
* Add tests on sys functions and on setting PyConfig._pystats to 1.
* Add Include/cpython/pystats.h and Include/internal/pycore_pystats.h
  header files.
* Rename '_py_stats' variable to '_Py_stats'.
* Exclude Include/cpython/pystats.h from the Py_LIMITED_API.
* Move pystats.h include from object.h to Python.h.
* Add _Py_StatsOn() and _Py_StatsOff() functions. Remove
  '_py_stats_struct' variable from the API: make it static in
  specialize.c.
* Document API in Include/pystats.h and Include/cpython/pystats.h.
* Complete pystats documentation in Doc/using/configure.rst.
* Don't write "all zeros" stats: if _stats_off() and _stats_clear()
  or _stats_dump() were called.
* _PyEval_Fini() now always call _Py_PrintSpecializationStats() which
  does nothing if stats are all zeros.

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
2023-09-06 15:54:59 +00:00
Mark Shannon
059bd4d299
GH-108614: Remove non-debug uses of #if TIER_ONE and #if TIER_TWO from _POP_FRAME op. (GH-108685) 2023-08-31 11:34:52 +01:00
Guido van Rossum
61c7249759
gh-106581: Project through calls (#108067)
This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
2023-08-17 11:29:58 -07:00
Mark Shannon
006e44f950
GH-108035: Remove the _PyCFrame struct as it is no longer needed for performance. (GH-108036) 2023-08-17 11:16:03 +01:00
Guido van Rossum
dc8fdf5fd5
gh-106581: Split CALL_PY_EXACT_ARGS into uops (#107760)
* Split `CALL_PY_EXACT_ARGS` into uops

This is only the first step for doing `CALL` in Tier 2.
The next step involves tracing into the called code object and back.
After that we'll have to do the remaining `CALL` specialization.
Finally we'll have to deal with `KW_NAMES`.

Note: this moves setting `frame->return_offset` directly in front of
`DISPATCH_INLINED()`, to make it easier to move it into `_PUSH_FRAME`.
2023-08-16 16:26:43 -07:00
Brandt Bucher
8f4de57699
GH-106701: Move _PyUopExecute to Python/executor.c (GH-106924) 2023-07-20 20:37:19 +00:00
Irit Katriel
40f3f11a77
gh-105481: Generate the opcode lists in dis from data extracted from bytecodes.c (#106758) 2023-07-18 19:42:44 +01:00
Guido van Rossum
2b94a05a0e
gh-106581: Add 10 new opcodes by allowing assert(kwnames == NULL) (#106707)
By turning `assert(kwnames == NULL)` into a macro that is not in the "forbidden" list, many instructions that formerly were skipped because they contained such an assert (but no other mention of `kwnames`) are now supported in Tier 2. This covers 10 instructions in total (all specializations of `CALL` that invoke some C code):
- `CALL_NO_KW_TYPE_1`
- `CALL_NO_KW_STR_1`
- `CALL_NO_KW_TUPLE_1`
- `CALL_NO_KW_BUILTIN_O`
- `CALL_NO_KW_BUILTIN_FAST`
- `CALL_NO_KW_LEN`
- `CALL_NO_KW_ISINSTANCE`
- `CALL_NO_KW_METHOD_DESCRIPTOR_O`
- `CALL_NO_KW_METHOD_DESCRIPTOR_NOARGS`
- `CALL_NO_KW_METHOD_DESCRIPTOR_FAST`
2023-07-17 11:02:58 -07:00
Mark Shannon
e5862113dd
GH-104584: Fix ENTER_EXECUTOR (GH-106141)
* Check eval-breaker in ENTER_EXECUTOR.

* Make sure that frame->prev_instr is set before entering executor.
2023-07-03 21:28:27 +01:00
Guido van Rossum
6b5166fb12
gh-104584: Change DEOPT_IF in uops executor (#106146)
This effectively reverts bb578a0, restoring the original DEOPT_IF() macro in ceval_macros.h, and redefining it in the Tier 2 interpreter. We can get rid of the PREDICTED() macros there as well!
2023-06-27 14:17:41 -07:00
Guido van Rossum
bb578a0c30
gh-104584: Fix assert in DEOPT macro -- should fix buildbot (#106131) 2023-06-27 07:02:51 -07:00
Guido van Rossum
51fc725117
gh-104584: Baby steps towards generating and executing traces (#105924)
Added a new, experimental, tracing optimizer and interpreter (a.k.a. "tier 2"). This currently pessimizes, so don't use yet -- this is infrastructure so we can experiment with optimizing passes. To enable it, pass ``-Xuops`` or set ``PYTHONUOPS=1``. To get debug output, set ``PYTHONUOPSDEBUG=N`` where ``N`` is a debug level (0-4, where 0 is no debug output and 4 is excessively verbose).

All of this code is likely to change dramatically before the 3.13 feature freeze. But this is a first step.
2023-06-26 19:02:57 -07:00
Irit Katriel
d1b0297d3e
gh-105481: add HAS_JUMP flag to opcode metadata (#105791) 2023-06-14 23:14:22 +00:00
Mark Shannon
7199584ac8
GH-100987: Allow objects other than code objects as the "executable" of an internal frame. (GH-105727)
* Add table describing possible executable classes for out-of-process debuggers.

* Remove shim code object creation code as it is no longer needed.

* Make lltrace a bit more robust w.r.t. non-standard frames.
2023-06-14 13:46:37 +01:00
Irit Katriel
be2779c0cb
gh-105481: add flags to each instr in the opcode metadata table, to replace opcode.hasarg/hasname/hasconst (#105482) 2023-06-13 21:42:03 +01:00
Mark Shannon
064de0e3fc
GH-104610: Remove the use of PREDICT macros. (GH-104651) 2023-06-07 17:04:53 +01:00
Mark Shannon
4bfa01b9d9
GH-104584: Plugin optimizer API (GH-105100) 2023-06-02 11:46:18 +01:00
Mark Shannon
68b5f08b72
GH-104580: Don't cache eval breaker in interpreter (GH-104581)
Move eval-breaker to the front of the interpreter state.
2023-05-18 10:08:33 +01:00
Brandt Bucher
1eb950ca55
GH-104405: Add missing PEP 523 checks (GH-104406) 2023-05-12 22:23:13 +00:00
Mark Shannon
45f5aa8fc7
GH-103082: Filter LINE events in VM, to simplify tool implementation. (GH-104387)
When monitoring LINE events, instrument all instructions that can have a predecessor on a different line.
Then check that the a new line has been hit in the instrumentation code.
This brings the behavior closer to that of 3.11, simplifying implementation and porting of tools.
2023-05-12 12:21:20 +01:00
Mark Shannon
411b169281
GH-103082: Implementation of PEP 669: Low Impact Monitoring for CPython (GH-103083)
* The majority of the monitoring code is in instrumentation.c

* The new instrumentation bytecodes are in bytecodes.c

* legacy_tracing.c adapts the new API to the old sys.setrace and sys.setprofile APIs
2023-04-12 12:04:55 +01:00
gaogaotiantian
039714d00f
gh-101975: Fixed a potential SegFault on garbage collection (GH-102803) 2023-03-18 10:59:21 +00:00
Mark Shannon
233e32f936
GH-102300: Reuse objects with refcount == 1 in float specialized binary ops. (GH-102301) 2023-03-13 10:34:54 +00:00
Steve Dower
a99eb5cd99
gh-101907: Stop using _Py_OPCODE and _Py_OPARG macros (GH-101912)
* gh-101907: Removes use of non-standard C++ extension from Include/cpython/code.h

* Make cases_generator correct on Windows
2023-02-20 14:56:48 +00:00
Guido van Rossum
616aec1ff1
gh-98831: Modernize CALL and family (#101508)
Includes a slight improvement to `DECREF_INPUTS()`.
2023-02-08 11:40:10 -08:00
Guido van Rossum
1f0d0a432c
GH-98831: Move assorted macros from ceval.h to a new header (#101116) 2023-01-18 10:41:07 -08:00