StatistEase — Neurosymbolic Statistical Analysis Assistant

Table of Contents

The Problem
The Solution
Features
Requirements
Quick Start
Architecture
License

The Problem

LLMs make statistical mistakes. They fabricate means, invent p-values, hallucinate confidence intervals, and present plausible-sounding nonsense as fact. We call these outputs mollocks — they look right, feel right, and are wrong.

StatistEase exists to stop this.

The Solution

StatistEase is a Kautz Type 1 neurosymbolic statistical analysis assistant:

Neural (LLM): Understands your question in natural language. Routes it to the correct statistical function. Explains the result in plain English.
Symbolic (Julia): Performs ALL mathematical computation. Every number comes from a verified, deterministic Julia function. Zero neural inference in the computation path.

You: "Is there a significant difference between these two groups?"
     │
     ▼ (Natural Language Understanding — neural)
LLM routes to: t_test_independent(group1, group2)
     │
     ▼ (Symbolic Computation — Julia)
Julia computes: t=2.847, df=38, p=0.007, Cohen's d=0.90
     │
     ▼ (Natural Language Generation — neural)
LLM explains: "Yes, there is a statistically significant difference
               (t(38)=2.847, p=.007) with a large effect size (d=0.90)."

Every number in that response came from Julia. The LLM computed none of them; it only repeats the values Julia returned.
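To make the symbolic step concrete, here is a minimal sketch of what a function behind the t_test_independent route could compute, using only Julia's Statistics standard library. The name t_test_independent_sketch and the returned field names are assumptions for illustration, not StatistEase's actual API. The p-value is left out because it needs a t-distribution CDF (for example from Distributions).

```julia
using Statistics

# Hypothetical sketch, not the StatistEase implementation:
# pooled-variance (Student) independent-samples t-test.
function t_test_independent_sketch(group1::AbstractVector{<:Real},
                                   group2::AbstractVector{<:Real})
    n1, n2 = length(group1), length(group2)
    (n1 > 1 && n2 > 1) || throw(ArgumentError("each group needs at least 2 observations"))

    m1, m2 = mean(group1), mean(group2)
    df = n1 + n2 - 2

    # Pooled variance built from the two unbiased sample variances.
    pooled_var = ((n1 - 1) * var(group1) + (n2 - 1) * var(group2)) / df
    pooled_var > 0 || throw(ArgumentError("pooled variance is zero; t is undefined"))

    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the mean difference
    t = (m1 - m2) / se
    d = (m1 - m2) / sqrt(pooled_var)           # Cohen's d using the pooled SD

    return (t = t, df = df, cohens_d = d)
end

# Illustrative data only.
result = t_test_independent_sketch([5.1, 4.9, 5.6, 5.8, 6.0], [4.2, 4.4, 4.1, 4.8, 4.5])
println(result)
```

The function returns a named tuple rather than formatted text, so the language model receives finished values and has nothing left to calculate.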

Important

HARD NOTICE — MOLLOCK WARNING

This software enforces a strict neural-symbolic boundary. No statistical value is ever produced by neural inference. If you see a number in a StatistEase response, it was computed by Julia. This is not a preference — it is a hard architectural invariant.

Features

Statistical Functions (17 Modules)

Module	Functions
Descriptive	Mean, median, mode, SD, skewness, kurtosis, quartiles, CI
Inferential	t-tests (independent, paired, one-sample), ANOVA, chi-square
Correlation & Regression	Pearson, Spearman, simple/multiple regression with VIF
Non-parametric	Mann-Whitney U, Wilcoxon signed-rank, Kruskal-Wallis, PERMANOVA
Effect Sizes	Cohen’s d, r, eta², Hedges' g, OR, NNT, CL effect size
Power Analysis	Power for t-tests, sample size for means/proportions/regression
Bayesian	Prior updating, Bayes factor (BIC), credible intervals (ETI + HDI)
Fuzzy Logic	Membership functions, fuzzy AND/OR/NOT, multi-rule inference
Dempster-Shafer	Evidence combination with conflict detection
Causality	Granger causality (regression-based F-test)
Estimation	James-Stein shrinkage estimator
Reliability	Cronbach’s alpha, McDonald’s omega
Validity	Content (Lawshe CVR), convergent/discriminant (AVE), criterion
Measurement	ICC (6 types), SEM, item analysis, sensitivity/specificity, PRE
Qualitative	Cohen’s/Fleiss' kappa, thematic saturation detection
Assumptions	Normality (Jarque-Bera), Levene’s test for homogeneity
Sampling	Design effect, margin of error with FPC, missing data analysis
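As one example of the kind of deterministic function the table refers to, the sketch below computes Cronbach's alpha (listed under Reliability) from a respondents × items score matrix. The function name cronbach_alpha_sketch and the matrix layout are assumptions for illustration; the actual StatistEase signature may differ.

```julia
using Statistics

# Hypothetical sketch: Cronbach's alpha for a matrix with one row per
# respondent and one column per item.
function cronbach_alpha_sketch(scores::AbstractMatrix{<:Real})
    n, k = size(scores)
    (n > 1 && k > 1) || throw(ArgumentError("need at least 2 respondents and 2 items"))

    item_vars = vec(var(scores; dims = 1))        # sample variance of each item
    total_var = var(vec(sum(scores; dims = 2)))   # variance of respondents' total scores
    total_var > 0 || throw(ArgumentError("total scores have zero variance"))

    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
end

# Illustrative data only: 4 respondents, 3 items.
scores = [3 4 3;
          2 2 3;
          5 4 5;
          1 2 1]
println(cronbach_alpha_sketch(scores))
```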

Data Quality Pathway

Raw Input → Detection → Validation → Cleansing → Normalization → Analysis → Output

Detection: Automatic data type (nominal/ordinal/interval/ratio) and file format detection
Validation: Range checks, variance verification, infinity/NaN screening
Cleansing: Outlier detection (IQR/z-score/modified z-score), missing value handling, deduplication
Normalization: Z-score, min-max, log transforms; tabular normalization (1NF→3NF) checks
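The validation, cleansing, and normalization stages can be sketched with the Statistics standard library alone. The helper names below (screen_finite, iqr_outlier_mask, zscore_normalize) and the default fence multiplier of 1.5 are assumptions for illustration, not the names or defaults StatistEase uses.

```julia
using Statistics

# Hypothetical validation helper: reject NaN and infinite values up front.
function screen_finite(x::AbstractVector{<:Real})
    all(isfinite, x) || throw(ArgumentError("input contains NaN or Inf"))
    return x
end

# Hypothetical cleansing helper: flag values outside the Tukey fences
# [Q1 - k*IQR, Q3 + k*IQR].
function iqr_outlier_mask(x::AbstractVector{<:Real}; k::Real = 1.5)
    q1, q3 = quantile(x, [0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower || v > upper for v in x]
end

# Hypothetical normalization helper: z-scores (mean 0, sample SD 1).
function zscore_normalize(x::AbstractVector{<:Real})
    s = std(x)
    (isfinite(s) && s > 0) || throw(ArgumentError("z-scores need finite, non-zero variance"))
    return (x .- mean(x)) ./ s
end

# Illustrative data only.
raw = screen_finite([10.0, 12.0, 11.0, 13.0, 12.0, 95.0])
mask = iqr_outlier_mask(raw)
clean = raw[.!mask]
println("flagged as outliers: ", raw[mask])
println("z-scores of remaining values: ", zscore_normalize(clean))
```

Returning a Boolean mask instead of deleting values keeps the cleansing step auditable: the caller can see what was flagged before deciding how to handle it.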

Output Formats

Unicode box-drawing tables (terminal)
ASCII histogram, box plot, scatter plot, bar chart
CSV and JSON export
Text reports with provenance stamps
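A terminal table of this kind can be drawn with plain string operations. The sketch below is an assumption about the general approach, not the renderer StatistEase ships; the function name box_table is hypothetical, and the sample values are the ones from the t-test example earlier in this README.

```julia
# Hypothetical renderer: two-column Unicode box-drawing table of
# label => value pairs that have already been formatted as strings.
function box_table(rows::Vector{Pair{String, String}})
    isempty(rows) && throw(ArgumentError("no rows to render"))
    w1 = maximum(textwidth(first(r)) for r in rows)  # width of the label column
    w2 = maximum(textwidth(last(r)) for r in rows)   # width of the value column

    top    = "┌" * "─"^(w1 + 2) * "┬" * "─"^(w2 + 2) * "┐"
    bottom = "└" * "─"^(w1 + 2) * "┴" * "─"^(w2 + 2) * "┘"
    # Labels are left-aligned, values right-aligned.
    body = ["│ " * rpad(first(r), w1) * " │ " * lpad(last(r), w2) * " │" for r in rows]

    return join([top; body; bottom], "\n")
end

println(box_table(["t" => "2.847", "df" => "38", "p" => "0.007", "Cohen's d" => "0.90"]))
```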

Requirements

Julia 1.10+ with packages: Statistics, StatsBase, Distributions, DataFrames, CSV, JSON3, HTTP
LM Studio running locally (default: localhost:1234) with a model that supports function calling

Quick Start

cd statistease
julia --project=. -e 'using Pkg; Pkg.instantiate()'
julia --project=. -e 'using StatistEase; main()'

Or run the offline examples, which do not need an LLM:

julia --project=. -e 'using StatistEase; run_examples()'

Architecture

This is a Kautz Type 1 neurosymbolic system — neural and symbolic components operate side-by-side with a defined, auditable interface boundary.

The boundary is src/tools/executor.jl:execute_tool(). Everything above it is neural (language understanding). Everything below it is symbolic (Julia computation). No statistical value crosses this boundary in the upward direction without having been computed by a verified Julia function.
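The boundary can be pictured as a single dispatch function: the neural side supplies only a tool name and raw data, and every number in the result is produced on the Julia side. The sketch below is illustrative and is not the contents of src/tools/executor.jl; the registry, the function name execute_tool_sketch, and the returned fields are all assumptions.

```julia
using Statistics

# Hypothetical registry: the only tool names the LLM may request,
# each mapped to a deterministic Julia function.
const TOOL_REGISTRY = Dict{String, Function}(
    "mean"   => mean,
    "median" => median,
    "sd"     => std,
)

# Hypothetical boundary function. Unknown tool names and unusable data are
# rejected instead of being passed back for the model to guess at.
function execute_tool_sketch(name::AbstractString, data::AbstractVector{<:Real})
    haskey(TOOL_REGISTRY, name) || throw(ArgumentError("unknown tool: $name"))
    isempty(data) && throw(ArgumentError("data is empty"))
    all(isfinite, data) || throw(ArgumentError("data contains NaN or Inf"))

    value = TOOL_REGISTRY[name](data)
    # The source label travels with the value so the explanation step can cite it.
    return (tool = String(name), value = value, computed_by = "julia")
end

println(execute_tool_sketch("mean", [2.0, 4.0, 6.0]))
```

Keeping the registry closed is what makes the boundary auditable in this sketch: a request for a statistic that has no registered function fails loudly, so no value can reach the explanation step without a Julia function behind it.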

License

PMPL-1.0-or-later (Palimpsest License)
