
RootInj3c/PythonUnpackLLM


🧠 PythonUnpackLLM

AI-Powered Python Bytecode Reverse Engineering Framework

PythonUnpackLLM is an automated reverse-engineering pipeline that reconstructs Python source code from compiled bytecode inside packaged executables.

It combines static bytecode disassembly with source reconstruction assisted by a local LLM, and is designed specifically for:

  • Malware analysis
  • Red-team research
  • Incident response
  • Python packer forensics

Unlike experimental "LLM decompilers", PythonUnpackLLM focuses on stability, scale, and real-world RE workflows.


Why This Tool Exists

Reverse-engineering Python executables traditionally requires:

  1. Extracting .pyc files
  2. Disassembling bytecode
  3. Manually reasoning about logic
  4. Dealing with the latest Python versions, whose bytecode often changes between releases

This is slow and error-prone.

PythonUnpackLLM automates the full pipeline. It uses AI only at the interpretation stage and keeps extraction and disassembly fully deterministic.

The LLM is treated as an untrusted analysis component, not a source of truth.


Pipeline Overview

  1. Executable unpacking (PyInstaller detection + extraction)
  2. Recursive .pyc recovery
  3. Native bytecode disassembly (no AI / extra dependencies)
  4. Function boundary reconstruction
  5. LLM-assisted logic reconstruction
  6. Validation + structured output
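Step 4 (function boundary reconstruction) can be illustrated with the standard library alone. The following is a minimal sketch, not the tool's actual implementation: it assumes function boundaries are recovered by walking the nested code objects stored in `co_consts`. The helper name `iter_code_objects` is hypothetical, and the sample module is compiled in memory as a stand-in for a code object recovered from a .pyc file.

```python
import types


def iter_code_objects(code, prefix=""):
    """Yield (qualified_name, code_object) for a code object and all code nested in it.

    Hypothetical helper for illustration; not part of PythonUnpackLLM.
    """
    name = f"{prefix}.{code.co_name}" if prefix else code.co_name
    yield name, code
    # Nested functions, lambdas, and class bodies are stored as constants
    # of the enclosing code object.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const, name)


# Stand-in for a module code object recovered from a .pyc file.
module_code = compile(
    "def outer():\n"
    "    def inner():\n"
    "        return 1\n"
    "    return inner\n",
    "<sample>",
    "exec",
)

function_names = [name for name, _ in iter_code_objects(module_code)]
print(function_names)
```

Each yielded code object is a natural unit to disassemble and hand to the model separately, which keeps prompts small and lets a failure in one function leave the others unaffected.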

Usage

Extract .pyc files from an executable

python PythonUnpackLLM.py --path ./target.exe --unpack
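How the tool detects the packaging type is not documented here. One common approach, shown as a sketch under that assumption, is to look for the magic cookie that PyInstaller embeds in its archive near the end of the executable. The function names `looks_like_pyinstaller` and `detect_packaging` are hypothetical.

```python
from pathlib import Path

# Magic cookie that marks a PyInstaller CArchive inside the executable.
PYINSTALLER_MAGIC = b"MEI\014\013\012\013\016"


def looks_like_pyinstaller(data: bytes) -> bool:
    """Return True if the raw bytes contain the PyInstaller archive cookie."""
    return PYINSTALLER_MAGIC in data


def detect_packaging(path) -> str:
    """Classify a file as 'pyinstaller' or 'unknown' (hypothetical helper).

    Reads the whole file for simplicity; a real tool would likely scan only
    the tail of the file, where the cookie normally lives.
    """
    data = Path(path).read_bytes()
    return "pyinstaller" if looks_like_pyinstaller(data) else "unknown"
```

Returning `"unknown"` early is what allows a pipeline to abort on unsupported formats before spending time on extraction.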

Disassemble a single file

python PythonUnpackLLM.py --path file.pyc --asm
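For orientation, this is roughly what a dependency-free disassembly step can look like with the standard `marshal` and `dis` modules. It is a sketch only and makes two assumptions: the .pyc uses the 16-byte header introduced in Python 3.7, and it was compiled by the same Python version that runs the sketch (`marshal` cannot load code objects from other versions). The name `disassemble_pyc_bytes` is hypothetical.

```python
import dis
import importlib.util
import io
import marshal

# Python 3.7+ header: magic (4 bytes) + flags (4) + mtime/size or source hash (8).
PYC_HEADER_SIZE = 16


def disassemble_pyc_bytes(data: bytes) -> str:
    """Return a text disassembly of an in-memory .pyc image (hypothetical helper)."""
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc was compiled by a different Python version")
    code = marshal.loads(data[PYC_HEADER_SIZE:])
    buffer = io.StringIO()
    # dis.dis recurses into nested code objects, so functions are included.
    dis.dis(code, file=buffer)
    return buffer.getvalue()


# Build a small .pyc image in memory instead of reading one from disk.
sample_code = compile("def greet(name):\n    return 'hi ' + name\n", "<sample>", "exec")
sample_pyc = importlib.util.MAGIC_NUMBER + bytes(12) + marshal.dumps(sample_code)

listing = disassemble_pyc_bytes(sample_pyc)
print(listing)
```

Samples built with another interpreter version need either a matching Python installation or a version-aware disassembler, which this sketch does not attempt.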

Decompile a single file

python PythonUnpackLLM.py --path file.pyc

Decompile entire extracted tree

python PythonUnpackLLM.py --path ./PYZ.pyz_extracted --type folder
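Folder mode amounts to finding every .pyc under a root directory and processing each one in isolation. The sketch below shows one way to do that, assuming a thread pool and per-file error capture; the names `find_pyc_files` and `process_tree`, and the `handler` callback, are illustrative and not taken from the tool.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def find_pyc_files(root):
    """Return every .pyc file under root, in a stable sorted order."""
    return sorted(Path(root).rglob("*.pyc"))


def process_tree(root, handler, max_workers=4):
    """Run handler on each .pyc; a failure in one file never aborts the batch."""

    def safe(path):
        try:
            return {"path": str(path), "ok": True, "result": handler(path)}
        except Exception as exc:  # record the error and keep going
            return {"path": str(path), "ok": False, "error": repr(exc)}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, find_pyc_files(root)))


def size_or_fail(path):
    """Toy handler: returns the file size, raises on empty files."""
    size = path.stat().st_size
    if size == 0:
        raise ValueError("empty file")
    return size


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pkg").mkdir()
    (Path(tmp) / "a.pyc").write_bytes(b"good")
    (Path(tmp) / "pkg" / "b.pyc").write_bytes(b"")
    results = process_tree(tmp, size_or_fail)

for entry in results:
    print(entry["ok"], Path(entry["path"]).name)
```

Returning one record per file, including failures, gives the analyst a complete inventory of what was and was not recovered.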

Key Features

  • Detects the packaging type and automatically aborts on unsupported formats (saves time in RE workflows)
  • Built-in PyInstaller extraction (integrated pyinstxtractor-ng runner)
  • Recursive Folder Mode
  • Reconstructs functions from bytecode disassembly
  • LLM output is treated as untrusted input. This makes the tool stable even when the model fails.
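To make "untrusted input" concrete, here is a minimal sketch of one possible validation step. It assumes the model returns Python source as text and that the names of the functions found in the bytecode are already known. The text is only parsed with `ast`, never executed. The function `validate_llm_source` is hypothetical, and the checks the tool actually performs may differ.

```python
import ast


def validate_llm_source(text, expected_functions):
    """Return (ok, reason) for model output without ever executing it.

    Hypothetical helper: checks that the text parses as Python and defines
    every function name that was found in the bytecode.
    """
    try:
        tree = ast.parse(text)
    except (SyntaxError, ValueError) as exc:
        return False, f"not valid Python: {exc}"
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    missing = sorted(set(expected_functions) - defined)
    if missing:
        return False, "missing functions: " + ", ".join(missing)
    return True, "ok"


print(validate_llm_source("def f():\n    return 1\n", ["f"]))
print(validate_llm_source("x = 1", ["f"]))
```

When validation fails, a pipeline can fall back to the raw disassembly for that function instead of aborting, which is one way to stay stable when the model produces unusable output.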

Use Cases

  • Malware analysis
  • Red team tool reversing
  • IR investigations

Tool Comparison

| Capability | PythonUnpackLLM | uncompyle6 | decompyle3 | pycdc | pyinstxtractor-ng | ByteCodeLLM (original concept) |
|---|---|---|---|---|---|---|
| Purpose | Full automated RE pipeline | Python decompiler | Python decompiler | C++ Python decompiler | PyInstaller extractor | AI-assisted bytecode reasoning |
| Works on EXE directly | ✅ Yes (auto-unpack) | ❌ No | ❌ No | ❌ No | ⚠ Extract only | ❌ No |
| PyInstaller extraction | ✅ Built-in | | | | ✅ Yes | |
| Recursive folder processing | ✅ Yes | | | | | |
| Handles large sample sets | ✅ Designed for scale | ⚠ Manual workflow | ⚠ Manual workflow | ⚠ Manual workflow | ❌ Extraction only | ❌ Research prototype |
| Uses AI reconstruction | ✅ Local LLM | | | | | ✅ Yes |
| Deterministic bytecode analysis | ✅ Yes | | | | | ⚠ Partial |
| Trust model for AI output | ✅ Treated as untrusted | N/A | N/A | N/A | N/A | ❌ Not isolated |
| Function boundary reconstruction | ✅ Yes | ⚠ Partial | ⚠ Partial | ⚠ Partial | | ⚠ Experimental |
| Crash-safe pipeline | ✅ Yes | | | | | |
| Works on obfuscated malware samples | ✅ Designed for it | ⚠ Often fails | ⚠ Often fails | ⚠ Often fails | | ⚠ Experimental |
| Parallel processing | ✅ Yes | | | | | |
| Output is structured for analysis | ✅ Yes | ❌ Raw code | ❌ Raw code | ❌ Raw code | | |

Traditional Python reverse engineering tools focus only on decompilation.
PythonUnpackLLM focuses on end-to-end automation, combining deterministic bytecode analysis with AI-assisted interpretation while maintaining the reliability required for large-scale reverse-engineering workflows.

Credits & Acknowledgements

  • Original research by CyberArk, which introduced the ByteCodeLLM concept
  • pyinstxtractor-ng project for PyInstaller extraction

Disclaimer

This software is provided "as is", without warranty of any kind. This tool is intended for research, defensive security, and reverse-engineering education. Do not analyze software without legal authorization. The author assumes no responsibility for misuse.
