gh-130861: Add clarification to the perf docs on optimization levels (#131098)
parent b9f0943c1e
commit d134bd272f
@@ -254,13 +254,28 @@ files in the current directory which are ELF images for all the JIT trampolines
 that were created by Python.
 
 .. warning::
-   Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+   When using ``--call-graph dwarf``, the ``perf`` tool will take
    snapshots of the stack of the process being profiled and save the
-   information in the ``perf.data`` file. By default the size of the stack dump
-   is 8192 bytes but the user can change the size by passing the size after
-   comma like ``--call-graph dwarf,4096``. The size of the stack dump is
-   important because if the size is too small ``perf`` will not be able to
-   unwind the stack and the output will be incomplete. On the other hand, if
-   the size is too big, then ``perf`` won't be able to sample the process as
-   frequently as it would like as the overhead will be higher.
+   information in the ``perf.data`` file. By default, the size of the stack dump
+   is 8192 bytes, but you can change the size by passing it after
+   a comma like ``--call-graph dwarf,16384``.
+
+   The size of the stack dump is important because if the size is too small
+   ``perf`` will not be able to unwind the stack and the output will be
+   incomplete. On the other hand, if the size is too big, then ``perf`` won't
+   be able to sample the process as frequently as it would like as the overhead
+   will be higher.
+
+   The stack size is particularly important when profiling Python code compiled
+   with low optimization levels (like ``-O0``), as these builds tend to have
+   larger stack frames. If you are compiling Python with ``-O0`` and not seeing
+   Python functions in your profiling output, try increasing the stack dump
+   size to 65528 bytes (the maximum)::
+
+      $ perf record -F 9999 -g -k 1 --call-graph dwarf,65528 -o perf.data python -Xperf_jit my_script.py
+
+   Different compilation flags can significantly impact stack sizes:
+
+   - Builds with ``-O0`` typically have much larger stack frames than those with ``-O1`` or higher
+   - Adding optimizations (``-O1``, ``-O2``, etc.) typically reduces stack size
+   - Frame pointers (``-fno-omit-frame-pointer``) generally provide more reliable stack unwinding
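The advice in this change, to raise the stack dump size when Python functions are missing from the output, could be tried step by step instead of jumping straight to the maximum. The following is a minimal sketch, not part of the commit: ``next_dump_size`` and ``perf_record_cmd`` are hypothetical helper names, and the doubling strategy is an assumption chosen for illustration. The script only prints the ``perf record`` commands (using the same flags as the example in the diff) and does not run them.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of perf or CPython.

MAX_DUMP_SIZE=65528   # maximum stack dump size quoted in the documentation change

# Print the next dump size to try: double the current one, capped at the maximum.
next_dump_size() {
    next=$(( $1 * 2 ))
    if [ "$next" -gt "$MAX_DUMP_SIZE" ]; then
        next=$MAX_DUMP_SIZE
    fi
    echo "$next"
}

# Print (without running) the perf record command for a given dump size.
perf_record_cmd() {
    echo "perf record -F 9999 -g -k 1 --call-graph dwarf,$1 -o perf.data python -Xperf_jit my_script.py"
}

# Start from the 8192-byte default and list each candidate command in turn.
size=8192
while :; do
    perf_record_cmd "$size"
    [ "$size" -ge "$MAX_DUMP_SIZE" ] && break
    size=$(next_dump_size "$size")
done
```

Running each printed command in order and stopping at the first size where Python frames appear keeps the per-sample overhead as low as the build allows, which matters because larger dumps reduce how often ``perf`` can sample.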