ldayton/Parable

((((        ))))                              The wind blows where it will--
 ((((              ))))                          you hear its sound, but you
         ((((  P A R A B L E  ))))                don't know where it's from
   ))))         ((((                                    or where it's going.
))))        ((((                                                 -- John 3:8

Parse bash exactly as bash does. One file, zero dependencies, in your language. This is the only complete bash parser for most languages. Extensively validated against bash itself.


Philosophy

LLM-driven development. This project is an exercise in maximizing what LLMs can do. An 11,000-line recursive descent parser for one of the gnarliest grammars in computing, plus a custom multi-target transpiler, built and maintained almost entirely through AI assistance. It wouldn't exist without LLMs.

Match bash exactly. Bash is the oracle. We patched GNU Bash 5.3 so it reveals its internal parse tree, then test against it. No spec interpretation, no "close enough": if bash parses it one way, so do we. Bash always tells the truth, even when it's lying.
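The oracle approach described above is a form of differential testing. A minimal sketch of the idea follows. The function names and the toy parsers are hypothetical stand-ins, not Parable's actual test runner, and invoking the patched bash is not shown because this README does not document how to do so.

```python
from typing import Callable, Iterable, List, Tuple


def _run(fn: Callable[[str], str], source: str) -> str:
    # A rejection is a result too: both sides should reject the same inputs.
    try:
        return fn(source)
    except Exception:
        return "<error>"


def differential_test(
    inputs: Iterable[str],
    oracle: Callable[[str], str],
    candidate: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (input, expected, actual) for every input where the two disagree."""
    mismatches = []
    for source in inputs:
        expected = _run(oracle, source)
        actual = _run(candidate, source)
        if expected != actual:
            mismatches.append((source, expected, actual))
    return mismatches


# Toy stand-ins: the "oracle" splits on any whitespace run,
# the "candidate" has a bug and splits on single spaces only.
def toy_oracle(source: str) -> str:
    return str(source.split())


def toy_candidate(source: str) -> str:
    return str(source.split(" "))


mismatches = differential_test(["a b", "a  b"], toy_oracle, toy_candidate)
for source, expected, actual in mismatches:
    print(f"{source!r}: expected {expected}, got {actual}")
```

In a real harness the oracle would be whatever serializes bash's own parse tree and the candidate would be the parser under test. Comparing serialized output keeps the check independent of either side's internal data structures.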

Portable performance. Hand-written recursive descent: no generators, no native extensions, no imports. The pure-Python source is transpiled to the other target languages, and every target runs the same tests.

Transpiler

The transpiler supports these target languages:

Language     Min Version     Status
C            GCC 13          Tests pass
C#           .NET 8          Tests pass
Dart         Dart 3.2        Tests pass
Go           Go 1.21         Tests pass
Java         Temurin 21      Tests pass
JavaScript   Node.js 21      Tests pass
Lua          Lua 5.4         Tests pass
Perl         Perl 5.38       Tests pass
PHP          PHP 8.3         Tests pass
Python       CPython 3.12    Tests pass
Ruby         Ruby 3.2        Tests pass
TypeScript   tsc 5.3         Tests pass
Swift        Swift 5.9       WIP
Rust         Rust 1.75       Future
Zig          Zig 0.11        Future

Output code quality is a work in progress. Currently the transpiler prioritizes correctness over readability; generated code may not yet match hand-written idioms.

Why Parable?

Bash's grammar is notoriously irregular. Existing tools make tradeoffs:

  • bashlex: Incomplete. Fails on heredocs, arrays, arithmetic, and more. Fine for simple scripts, breaks on real ones.
  • Oils/OSH: A whole shell, not an embeddable library. Makes intentional parsing tradeoffs for a cleaner language, which is fine for their goals, but won't predict what real bash does.
  • tree-sitter-bash: Editor-focused, not Python-native. Many open parsing bugs.
  • mvdan/sh: Go-native, but doesn't fully match bash. Targets POSIX with bash extensions.
  • sh-syntax: WASM port of mvdan/sh, not pure JS. Inherits the same limitations.

Parable is the only library in these languages that parses bash exactly as bash does, tested against bash's own AST. For security and sandboxing, 95% coverage is 100% inadequate.

Use cases:

  • Security auditing: Analyze scripts for command injection, dangerous patterns, or policy violations. The construct you can't parse is the one that owns you.
  • CI/CD analysis: Understand what shell scripts actually do before running them.
  • Migration tooling: Convert bash to other languages with full AST access.
  • Linting and formatting: Build bash linters in Python & JS without regex hacks.

What It Handles 😱

The dark corners of bash that break other parsers:

# Nested everything
echo $(cat <(grep ${pattern:-".*"} "${files[@]}"))

# Heredoc inside command substitution inside heredoc
cat <<OUTER
$(cat <<INNER
$nested
INNER
)
OUTER

# Multiple heredocs on one line
diff <(cat <<A
one
A
) <(cat <<B
two
B
)

# Quoting transforms on array slices
printf '%q\n' "${arr[@]:2:5@Q}"

# Regex with expansions in conditional
[[ ${foo:-$(whoami)} =~ ^(user|${pattern})$ ]]

# Process substitution as redirect target
cmd > >(tee log.txt) 2> >(tee err.txt >&2)

# Extglob patterns that look like syntax
case $x in @(foo|bar|?(baz))) echo match;; esac

The full grammar: parameter expansion, heredocs, process substitution, arithmetic, arrays, conditionals, coprocesses, all of it.

Security

Parable is designed for tools that need to predict what bash will do. Honest caveats:

  • Tested, not mathematically proven. We validate against bash's AST for thousands of difficult edge cases, but this is not a formal proof verified by a proof checker. A determined attacker with capable LLMs may find discrepancies.
  • Validated against bash 5.3. Core parsing is stable across versions, but edge cases may differ. If your target runs ancient bash (macOS ships 3.2) or relies on version-specific quirks, verify independently.
  • Bash wasn't built for this. Even perfect parsing doesn't guarantee predictable execution. shopt settings, aliases, and runtime context all affect behavior. True security means containers or VMs.

Parable strives to be the best available tool for static bash analysis: oracle-tested, not spec-interpreted. But for high-stakes security, nothing replaces defense in depth.
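The version caveat above can be turned into a cheap pre-flight check. The sketch below is illustrative only and is not part of Parable. The helper names are hypothetical, and it assumes the target's `bash --version` banner contains the usual `version X.Y` text.

```python
import re
import subprocess
from typing import Optional, Tuple

# The bash version this README says the parser is validated against.
VALIDATED = (5, 3)


def parse_bash_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the text printed by `bash --version`."""
    match = re.search(r"version (\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_bash_version(bash: str = "bash") -> Optional[Tuple[int, int]]:
    """Ask the given bash binary for its version; None if it cannot be run."""
    try:
        result = subprocess.run(
            [bash, "--version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None
    return parse_bash_version(result.stdout)


def needs_independent_check(version: Optional[Tuple[int, int]]) -> bool:
    """True when the target bash is unknown or differs from the validated one."""
    return version is None or version != VALIDATED


# Example with a banner in the format printed by the bash that macOS ships.
macos = parse_bash_version("GNU bash, version 3.2.57(1)-release (arm64-apple-darwin23)")
print(macos, needs_independent_check(macos))
```

A check like this only flags a version mismatch. It says nothing about shopt settings, aliases, or other runtime context, which still need separate handling.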

Test Coverage

Every test is validated against real bash 5.3 ASTs.

  • GNU Bash test corpus: 19,370 lines
  • Oils bash corpus: 2,495 tests
  • tree-sitter-bash corpus: 125 tests
  • Parable hand-written tests: 1,900+ tests

Usage

from parable import parse

# Returns an AST, not string manipulation
ast = parse("ps aux | grep python | awk '{print $2}'")

# S-expression output for inspection
print(ast[0].to_sexp())
# (pipe (command (word "ps") (word "aux")) (pipe (command (word "grep") (word "python")) (command (word "awk") (word "'{print $2}'"))))

# Handles the weird stuff
ast = parse("cat <<'EOF'\nheredoc content\nEOF")
print(ast[0].to_sexp())
# (command (word "cat") (redirect "<<" "heredoc content\n"))
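For quick inspection scripts, the S-expression text can be read back into nested lists. The sketch below is a small stand-alone reader, assuming only the output format shown in the comments above. `read_sexp` and `command_names` are hypothetical helpers, not part of Parable's API, and the input is the sample output copied from this section rather than a live `parse` call.

```python
import re
from typing import List, Union

Node = Union[str, List["Node"]]

# Tokens: parentheses, double-quoted strings (with backslash escapes), bare atoms.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def read_sexp(text: str) -> Node:
    """Read one S-expression into nested Python lists and strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read() -> Node:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items: List[Node] = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing ")"
            return items
        if token.startswith('"'):
            return token[1:-1]  # drop the quotes; escapes are left as written
        return token

    return read()


def command_names(node: Node) -> List[str]:
    """Collect the first word of every (command ...) node, in document order."""
    names: List[str] = []
    if isinstance(node, list):
        if node and node[0] == "command":
            words = [c for c in node[1:] if isinstance(c, list) and c and c[0] == "word"]
            if words:
                names.append(words[0][1])
        for child in node:
            names.extend(command_names(child))
    return names


# Sample output copied from the usage example in this section.
SEXP = r'''(pipe (command (word "ps") (word "aux")) (pipe (command (word "grep") (word "python")) (command (word "awk") (word "'{print $2}'"))))'''

print(command_names(read_sexp(SEXP)))
```

With Parable installed, `SEXP` could presumably be replaced by `ast[0].to_sexp()`. Working on the AST objects directly is likely the better route for real tooling, but their attributes are not documented in this README.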

Project Structure

src/
└── parable.py                   # Single-file Python parser

transpiler/                      # Python → multi-language transpiler
├── src/frontend/                # Parser and type inference
├── src/middleend/               # Analysis passes
├── src/backend/                 # Code generators
└── src/ir.py                    # Intermediate representation

tests/
├── bin/                         # Test runners + corpus utilities
├── parable/                     # Parable test cases
└── corpus/                      # Validation corpus

tools/
└── fuzzer/                      # Differential fuzzers

dist/                            # Transpiled outputs

License

MIT
