PythonUnpackLLM is an automated reverse-engineering pipeline that reconstructs Python source code from compiled bytecode inside packaged executables.
It combines static bytecode disassembly with local LLM-assisted source reconstruction, designed specifically for:
- Malware analysis
- Red-team research
- Incident response
- Python packer forensics
Unlike experimental "LLM decompilers", PythonUnpackLLM focuses on stability, scale, and real-world RE workflows.
Reverse-engineering Python executables traditionally requires:
- Extracting .pyc files
- Disassembling bytecode
- Manually reasoning about logic
- Dealing with the latest Python versions
This is slow and error-prone.
PythonUnpackLLM automates the full pipeline, using AI only at the interpretation stage — while keeping extraction and disassembly fully deterministic.
The LLM is treated as an untrusted analysis component, not a source of truth.
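One way to apply this trust model is to check the model's output statically and never execute it. The sketch below is an illustration only, not the tool's actual validator: `validate_candidate` and its result fields are hypothetical names. It parses the candidate source with the standard `ast` module and reports which functions known from the bytecode are missing from the reconstruction.

```python
import ast


def validate_candidate(source: str, expected_functions: set[str]) -> dict:
    """Statically check LLM-produced source; the code is parsed, never executed.

    Hypothetical helper: `expected_functions` would come from the
    deterministic disassembly stage.
    """
    result = {"syntax_ok": False, "missing": sorted(expected_functions), "error": None}
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError) as exc:
        # A model failure becomes a recorded error instead of a crash.
        result["error"] = str(exc)
        return result

    result["syntax_ok"] = True
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    result["missing"] = sorted(expected_functions - defined)
    return result
```

A candidate that fails to parse, or that drops a function present in the bytecode, can then be flagged or retried rather than accepted as ground truth.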
- Executable unpacking (PyInstaller detection + extraction)
- Recursive .pyc recovery
- Native bytecode disassembly (no AI / extra dependencies)
- Function boundary reconstruction
- LLM-assisted logic reconstruction
- Validation + structured output
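The disassembly and function-boundary stages listed above can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions, not the project's implementation: `iter_code_objects` and `disassemble` are hypothetical names, and the sample is compiled in-process. For a real .pyc file, the code object would typically be obtained by skipping the 16-byte header (Python 3.7+) and calling `marshal.loads`, which only works when the running interpreter matches the bytecode version.

```python
import dis
import io
import types


def iter_code_objects(code, prefix=""):
    """Yield (qualified_name, code_object) for a code object and all nested ones.

    Nested functions, lambdas, and class bodies are stored as code objects
    inside the parent's co_consts, which gives natural function boundaries.
    """
    name = f"{prefix}.{code.co_name}" if prefix else code.co_name
    yield name, code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const, name)


def disassemble(code):
    """Return the disassembly of one code object only (depth=0: no recursion)."""
    buf = io.StringIO()
    dis.dis(code, file=buf, depth=0)
    return buf.getvalue()


# Illustrative input; a real run would start from an extracted .pyc.
SAMPLE = "def outer(x):\n    def inner(y):\n        return x + y\n    return inner\n"
module_code = compile(SAMPLE, "<sample>", "exec")
units = dict(iter_code_objects(module_code))

for qualified_name, code_obj in units.items():
    print(f"== {qualified_name} ==")
    print(disassemble(code_obj))
```

Splitting the disassembly per code object keeps each unit small, so a later interpretation step can work on one function at a time.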
python PythonUnpackLLM.py --path ./target.exe --unpack
python PythonUnpackLLM.py --path file.pyc --asm
python PythonUnpackLLM.py --path file.pyc
python PythonUnpackLLM.py --path ./PYZ.pyz_extracted --type folder

- Detects the packaging type and auto-aborts on unsupported formats (saves time in RE workflows)
- Built-in PyInstaller Extraction (Integrated pyinstxtractor-ng runner)
- Recursive Folder Mode
- Reconstructs functions from bytecode disassembly
- LLM output is treated as untrusted input. This makes the tool stable even when the model fails.
- Malware analysis
- Red team tool reversing
- IR investigations
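The packaging detection listed among the features could look roughly like the sketch below. It is an assumption about the approach, not the tool's code: `classify` and its return labels are hypothetical. It relies on two known markers: the 8-byte PyInstaller archive cookie that pyinstxtractor-style extractors search for, and the `\r\n` bytes that end every CPython .pyc magic number.

```python
# Magic bytes marking the PyInstaller CArchive cookie.
PYINSTALLER_COOKIE = b"MEI\x0c\x0b\x0a\x0b\x0e"


def classify(data: bytes) -> str:
    """Return 'pyc', 'pyinstaller', or 'unsupported' for raw file contents.

    Hypothetical helper; the labels are illustrative.
    """
    # A .pyc starts with a 2-byte version-specific magic followed by b"\r\n".
    # Checked first because a compiled script may itself embed the cookie bytes.
    if len(data) >= 4 and data[2:4] == b"\r\n":
        return "pyc"
    # The cookie normally sits near the end of the executable, but appended
    # data (for example a signature) can follow it, so search everything.
    if PYINSTALLER_COOKIE in data:
        return "pyinstaller"
    return "unsupported"
```

Anything classified as `unsupported` can be rejected before any extraction or model time is spent on it.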
| Capability | PythonUnpackLLM | uncompyle6 | decompyle3 | pycdc | pyinstxtractor-ng | ByteCodeLLM (original concept) |
|---|---|---|---|---|---|---|
| Purpose | Full automated RE pipeline | Python decompiler | Python decompiler | C++ Python decompiler | PyInstaller extractor | AI-assisted bytecode reasoning |
| Works on EXE directly | ✅ Yes (auto-unpack) | ❌ No | ❌ No | ❌ No | ⚠ Extract only | ❌ No |
| PyInstaller extraction | ✅ Built-in | ❌ | ❌ | ❌ | ✅ Yes | ❌ |
| Recursive folder processing | ✅ Yes | ❌ | ❌ | ❌ | ❌ | ❌ |
| Handles large sample sets | ✅ Designed for scale | ⚠ Manual workflow | ⚠ Manual workflow | ⚠ Manual workflow | ❌ Extraction only | ❌ Research prototype |
| Uses AI reconstruction | ✅ Local LLM | ❌ | ❌ | ❌ | ❌ | ✅ Yes |
| Deterministic bytecode analysis | ✅ Yes | ✅ | ✅ | ✅ | ❌ | ⚠ Partial |
| Trust model for AI output | ✅ Treated as untrusted | N/A | N/A | N/A | N/A | ❌ Not isolated |
| Function boundary reconstruction | ✅ Yes | ⚠ Partial | ⚠ Partial | ⚠ Partial | ❌ | ⚠ Experimental |
| Crash-safe pipeline | ✅ Yes | ❌ | ❌ | ❌ | ❌ | ❌ |
| Works on obfuscated malware samples | ✅ Designed for it | ⚠ Often fails | ⚠ Often fails | ⚠ Often fails | ❌ | ⚠ Experimental |
| Parallel processing | ✅ Yes | ❌ | ❌ | ❌ | ❌ | ❌ |
| Output is structured for analysis | ✅ Yes | ❌ Raw code | ❌ Raw code | ❌ Raw code | ❌ | ❌ |
Traditional Python reverse engineering tools focus only on decompilation.
PythonUnpackLLM focuses on end-to-end automation, combining deterministic bytecode analysis with AI-assisted interpretation while maintaining the reliability required for large-scale reverse-engineering workflows.
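To illustrate what reliability at scale can mean in practice, here is a hedged sketch of per-file isolation over a recursively scanned folder. All names (`safe_process`, `process_folder`, the `worker` callback, the record fields) are hypothetical and not taken from the project. Each file is processed inside its own `try`/`except`, so one malformed sample produces an error record instead of aborting the batch.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def safe_process(path, worker):
    """Run `worker` on one file and always return a structured record."""
    try:
        return {"path": str(path), "status": "ok", "result": worker(path)}
    except Exception as exc:
        # Corrupt bytecode or a failed model call is recorded, not raised.
        return {"path": str(path), "status": "error", "error": f"{type(exc).__name__}: {exc}"}


def process_folder(root, worker, max_workers=4):
    """Apply `worker` to every .pyc under `root`, in parallel, preserving order."""
    files = sorted(Path(root).rglob("*.pyc"))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: safe_process(p, worker), files))
```

Threads suit this sketch when the per-file work is dominated by waiting on a local model server; CPU-bound work would more likely call for a process pool.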
- Original research by CyberArk, which introduced the ByteCodeLLM concept
- pyinstxtractor-ng project for PyInstaller extraction
This software is provided "as is", without warranty of any kind. This tool is intended for research, defensive security, and reverse-engineering education. Do not analyze software without legal authorization. The author assumes no responsibility for misuse.