GH-126910: Add gdb support for unwinding JIT frames #146071

Open
diegorusso wants to merge 6 commits into python:main from diegorusso:add-gdb-support

@diegorusso diegorusso commented Mar 17, 2026

This PR adds GDB support for unwinding JIT frames by emitting EH frames.
It reuses part of the existing perf_jit infrastructure from @pablogsal.

This is part of the overall plan laid out here: #126910 (comment)

The output in GDB looks like:

Program received signal SIGINT, Interrupt.
0x0000fffff7fb50f8 in py::jit_executor:<jit> ()
(gdb) bt
#0  0x0000fffff7fb50f8 in py::jit_executor:<jit> ()
#1  0x0000fffff7fb4050 in py::jit_shim:<jit> ()
#2  0x0000aaaaaad5e314 in _PyEval_EvalFrameDefault (tstate=0xfffff7fb80f0, frame=0xfffff774bab0, throwflag=6, throwflag@entry=0)
    at ../../Python/generated_cases.c.h:5711
#3  0x0000aaaaaad61350 in _PyEval_EvalFrame (tstate=0xaaaaab1d57b0 <_PyRuntime+344632>, frame=0xfffff7fb8020, throwflag=0)
    at ../../Include/internal/pycore_ceval.h:122
...

@diegorusso diegorusso added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 18, 2026
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @diegorusso for commit ac018d6 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F146071%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 18, 2026
@pablogsal
Member

I have some questions about the EH frame generation and how it applies to the different code regions.

Looking at jit_record_code, it's called in two places:

  1. For jit_shim (line 811): the entry shim compiled from Tools/jit/shim.c
  2. For jit_executor (line 757): the full executor code region (code_size + state.trampolines.size)

Both end up calling _PyJitUnwind_GdbRegisterCode, which builds the same EH frame via _PyJitUnwind_BuildEhFrame.

The EH frame in elf_init_ehframe describes a specific prologue/epilogue sequence. On x86_64 for example:

push %rbp          (1 byte)
mov %rsp, %rbp     (3 bytes)
call *%rcx         (2 bytes)
pop %rbp           (1 byte)
ret

I understand how this is correct for jit_shim. Looking at Tools/jit/shim.c, it's a normal C function that calls into the executor:

_Py_CODEUNIT *
_JIT_ENTRY(...) {
    jit_func_preserve_none jitted = (jit_func_preserve_none)exec->jit_code;
    return jitted(exec, frame, stack_pointer, tstate, ...);
}

The compiler will emit exactly the prologue/epilogue the EH frame describes.

But I don't understand how the same EH frame is correct for jit_executor. The executor code region is a concatenation of many stencils, each compiled from Tools/jit/template.c with __attribute__((preserve_none)), chaining together via __attribute__((musttail)) tail calls. These stencils don't have the push rbp / mov rsp,rbp prologue that the EH frame describes. They use a completely different calling convention.

The FDE covers the full code_size + trampolines.size range but the CFI instructions only describe ~7 bytes of prologue/epilogue. DWARF will apply the last rule (CFA = RSP + 8 on x86_64) to all remaining addresses in the range. I don't understand why that rule would be correct at arbitrary points within the stencil code. Is it guaranteed that preserve_none stencils never modify RSP? Or is there something else going on that makes this work?
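To make the "last rule applies to the rest of the range" concern concrete, here is a minimal sketch of how an unwinder picks a CFA rule for a given offset inside one FDE. The row table is an assumption reconstructed from the x86_64 sequence quoted above, not copied from elf_init_ehframe, and `CFI_ROWS` and `cfa_rule` are hypothetical names.

```python
from bisect import bisect_right

# Illustrative CFI rows (assumed, based on the quoted byte counts).
# Each row is (first offset the rule applies to, CFA register, CFA offset).
CFI_ROWS = [
    (0, "rsp", 8),   # at entry: only the return address is on the stack
    (1, "rsp", 16),  # after `push %rbp`
    (4, "rbp", 16),  # after `mov %rsp, %rbp`
    (7, "rsp", 8),   # after `pop %rbp`
]


def cfa_rule(offset, fde_size):
    """Return the (register, offset) CFA rule in effect at `offset`.

    A rule stays in effect until the next row starts, so the final row
    covers every remaining address up to the end of the FDE range.
    """
    if not 0 <= offset < fde_size:
        raise ValueError("offset is outside the FDE range")
    starts = [row[0] for row in CFI_ROWS]
    index = bisect_right(starts, offset) - 1
    _, register, cfa_offset = CFI_ROWS[index]
    return register, cfa_offset
```

With an FDE that spans a whole executor region, any offset past the short sequence resolves to the final row. That is only right if the stencil code leaves RSP where that rule expects it, which is the open question here.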

The test (test_jit.py) sets a breakpoint at id(42) which hits in the interpreter, not in the middle of a stencil. So the test verifies that the symbols appear in GDB's backtrace, but I don't think it exercises unwinding from an arbitrary point within the executor code region. Could we add a test that triggers unwinding from inside JIT code (e.g., via a signal or Ctrl+C while executing JIT code)?

Am I missing something about how the stencils interact with the stack, or is the EH frame intentionally approximate for the executor region?

@pablogsal pablogsal left a comment

Here are a bunch of questions I have from reading the code so far.

struct jit_code_entry *first_entry;
};

static volatile struct jit_descriptor __jit_debug_descriptor = {

Should these be non-static? The GDB JIT interface spec says GDB locates __jit_debug_descriptor and __jit_debug_register_code by name in the symbol table. With static linkage they would be invisible in .dynsym on stripped builds and when CPython is loaded as a shared library via dlopen. Am I missing something, or would this silently break in release/packaged builds where .symtab is stripped?

Maybe also worth adding __attribute__((used)) to prevent the linker from eliding them?

id(42)
return

warming_up = True

Could this loop hang? When warming_up=True, the call passes warming_up_caller=True which returns immediately at line 8, so the recursive body never actually executes. If the JIT does not activate via some other path, would this not spin forever until the timeout kills it? Should there be a max iteration count as a safety net?

Also, line 16 uses bitwise & instead of and. Was that intentional? It means is_active() is always evaluated even when is_enabled() is False.
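Both points can be illustrated with a small standalone sketch. The `is_enabled` and `is_active` functions below are hypothetical stand-ins that only record that they were called, and `warm_up` with its `max_iterations` parameter is one possible shape for a safety net, not the code in this PR.

```python
def combined_check(use_bitwise):
    """Evaluate the two-part check and report which functions were called."""
    calls = []

    def is_enabled():
        # Stand-in for the real check; pretends the JIT is disabled.
        calls.append("is_enabled")
        return False

    def is_active():
        # Stand-in for the real check; records that it was evaluated.
        calls.append("is_active")
        return True

    if use_bitwise:
        # `&` evaluates both operands before combining them.
        result = is_enabled() & is_active()
    else:
        # `and` short-circuits: is_active() is skipped when is_enabled() is False.
        result = is_enabled() and is_active()
    return result, calls


def warm_up(step, is_ready, max_iterations=10_000):
    """Call `step` until `is_ready` returns True, giving up after a bound."""
    for iteration in range(1, max_iterations + 1):
        step()
        if is_ready():
            return iteration
    raise RuntimeError(f"not ready after {max_iterations} iterations")
```

A bounded loop fails fast with a clear error instead of spinning until an external timeout kills the process.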

return;
}
_PyJitUnwind_GdbRegisterCode(
code_addr, (unsigned int)code_size, entry, filename);

code_size comes in as size_t but gets cast to unsigned int here. I know JIT regions will not be 4GB, but should the API just take size_t throughout for consistency?
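As a rough illustration of what the narrowing cast does, the sketch below uses ctypes, whose integer types perform no overflow checking, to mimic a cast to `unsigned int`. The helper names are hypothetical, and the explicit range check is just one alternative to widening the API to size_t.

```python
import ctypes

# Width of the platform's `unsigned int`, so the example does not assume 32 bits.
UINT_BITS = ctypes.sizeof(ctypes.c_uint) * 8
UINT_MAX = (1 << UINT_BITS) - 1


def cast_to_unsigned_int(size):
    """Mimic `(unsigned int)size`: high bits are silently discarded."""
    return ctypes.c_uint(size).value


def checked_unsigned_int(size):
    """Reject sizes that would not survive the narrowing cast."""
    if not 0 <= size <= UINT_MAX:
        raise OverflowError(f"{size} does not fit in unsigned int")
    return size
```

A size just above `UINT_MAX` wraps around to a small value under the plain cast, while the checked variant raises instead of registering a truncated length.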
