Mini-Inference Engine is a CUDA GEMM optimization learning project that packages a series of progressively optimized matrix multiplication kernels, a lightweight inference runtime, and profiling-oriented experimentation in one repository.
- CUDA kernels and runtime headers in `src/` and `include/`
- Technical docs under `docs/`
- Benchmarks and demos in `benchmarks/`
- GitHub Pages site for documentation entry, reading paths, and project updates
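To illustrate the "progressively optimized kernels" idea, the starting point of such a series is typically a naive GEMM with one thread per output element. The sketch below is illustrative only; the kernel name and memory layout are assumptions, not this repository's actual API.

```cuda
// Naive GEMM baseline: C = A * B, all matrices row-major.
// One thread computes one element of C; later stages in a progressive
// series would add tiling, shared memory, and register blocking.
// Illustrative sketch -- not the repository's actual kernel.
__global__ void gemm_naive(const float* A, const float* B, float* C,
                           int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // row of C
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // column of C
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k)
            acc += A[row * K + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}
```

Each subsequent kernel in a progression like this keeps the same interface while improving memory access patterns, which makes side-by-side benchmarking straightforward.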
```shell
cmake --preset default
cmake --build --preset default
ctest --preset default
```

For an optimized build without tests:

```shell
cmake --preset release
cmake --build --preset release
./build-release/benchmark
```

- Project docs: https://lessup.github.io/mini-inference-engine/ (the site home explains what to read first for architecture, optimization, and API details)
- See `CONTRIBUTING.md` for the contribution workflow
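Since the project is oriented toward profiling experiments, one common workflow is to run the release benchmark binary under NVIDIA Nsight Compute. The command below is a hedged example: the binary path matches the release build above, and `--set basic` is a standard Nsight Compute metric set, but the metrics worth collecting depend on the kernels under study.

```shell
# Example only: profile the release benchmark with Nsight Compute (requires a CUDA GPU).
ncu --set basic ./build-release/benchmark
```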
MIT License