
DiffuMon

Basic Denoising Diffusion image generator implemented in PyTorch.

Reproduces Denoising Diffusion Probabilistic Models (DDPM). A DDIM sampling option is also available.

Developed as an educational project, aiming for a simpler PyTorch implementation and development setup than other available DDPM implementations. Small and lean enough to train on a commodity GPU (in this case my GeForce 4070 Ti).

The basic idea is to train a model to learn how to denoise images. Images are generated by using this trained model to iteratively remove noise from a random noise image until a coherent image forms.
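That loop can be sketched in a few lines. This is an illustrative NumPy toy of DDPM ancestral sampling, not the repo's implementation; `toy_denoiser` stands in for the trained network:

```python
import numpy as np

def ddpm_sample(denoiser, betas, shape, rng):
    """Iteratively denoise pure Gaussian noise into a sample (DDPM ancestral sampling)."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)            # start from pure noise x_T
    for t in reversed(range(len(betas))):
        eps_hat = denoiser(x, t)              # model predicts the noise present in x_t
        # posterior mean of x_{t-1} given the predicted noise
        x = (x - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                             # add fresh noise at every step except the last
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 100)          # toy linear noise schedule
toy_denoiser = lambda x, t: np.zeros_like(x)  # stand-in for a trained model
sample = ddpm_sample(toy_denoiser, betas, (1, 28, 28), rng)
```

A real run swaps `toy_denoiser` for the trained U-Net and uses the schedule the model was trained with.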

Two pretrained models are provided in the checkpoints/ directory for Fashion MNIST and an 11k Pokemon dataset.

Features

  • Reproducible environment with uv. Get set up with a single command.
  • Automatic dataset download and preprocessing for certain preloaded datasets.
  • Example notebook for sampling and gif generation.
  • Train on your own dataset by pointing data.data_dir at a directory of image files.

Example Generations

Fashion MNIST sample generations

Pokemon 11k sample generations

NOTE: With small images and many training epochs, the model likely overfits and may memorize training samples.

Denoising in action

Denoising fashion MNIST

Denoising Pokemon 11k

Getting started

Setting up environment

This repo uses uv as the package/environment manager. Make sure to install it before proceeding.

Pretrained checkpoints are stored in Git LFS. Install git lfs before cloning to ensure checkpoint files are downloaded correctly.

The following command will install packages and setup a virtual environment

# Install packages
uv sync

## (Alternatively) Install all packages with added Nvidia CUDA support
uv sync --extra cuda


# Activate virtual environment

## Linux/Unix
. .venv/bin/activate

## Windows
. .venv/Scripts/activate

Access the entrypoints

Once installed, training and sampling are available as separate commands. Both use Hydra for configuration — override any config value with key=value syntax.

diffumon-train --help
diffumon-sample --help

Train a model

diffumon-train --help

Train a fashion MNIST model

diffumon-train data.preloaded=fashion_mnist num_epochs=15 learning_rate=0.001 checkpoint_path=checkpoints/fashion_mnist_15epochs.pth

Train a Pokemon model on the 11k Pokemon dataset (downscaled to 64x64 pixels)

diffumon-train data.preloaded=pokemon_11k num_epochs=448 learning_rate=0.001 data.img_dim=64 batch_size=64 checkpoint_path=checkpoints/pokemon_11k_448epochs_64dim.pth

Train a model on a dataset of your choice

diffumon-train data.preloaded=custom data.data_dir=/path/to/dataset num_epochs=15 learning_rate=0.001 checkpoint_path=checkpoints/my_dataset_15_epochs.pth

Where /path/to/dataset should have a directory structure like the following:

/path/to/dataset/
    train/
      class_0/
        img_0.png
        img_1.png
    test/
      class_0/
        img_0.png
        img_1.png
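If your images currently sit in one flat folder, a small helper can copy them into that layout. This is a hypothetical convenience script, not part of diffumon, and it assumes a flat folder of PNGs:

```python
import shutil
from pathlib import Path

def arrange_dataset(src: Path, dst: Path, test_fraction: float = 0.1) -> None:
    """Copy a flat folder of PNG images into the train/test class_0 layout above."""
    images = sorted(src.glob("*.png"))
    n_test = max(1, int(len(images) * test_fraction))
    for split, files in (("test", images[:n_test]), ("train", images[n_test:])):
        out = dst / split / "class_0"
        out.mkdir(parents=True, exist_ok=True)  # creates e.g. dst/train/class_0/
        for f in files:
            shutil.copy2(f, out / f.name)
```

Since diffumon is unconditional, a single `class_0` folder per split is enough.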

Generate samples

diffumon-sample --help

Generate samples from the trained fashion MNIST model

diffumon-sample checkpoint_path=checkpoints/fashion_mnist_100epochs.pth num_samples=32 output_dir=samples/fashion_mnist_100epochs

Generate samples from the trained Pokemon model

diffumon-sample checkpoint_path=checkpoints/pokemon_11k_448epochs_64dim.pth num_samples=32 output_dir=samples/pokemon_11k_448epochs_64dim_out1

Generate samples with DDIM Sampler

Use the sampler=ddim config group to switch to the deterministic DDIM sampler:

diffumon-sample \
  checkpoint_path=checkpoints/fashion_mnist_100epochs.pth \
  num_samples=16 \
  sampler=ddim \
  sampler.num_inference_steps=50 \
  output_dir=samples/fashion_mnist_ddim_50

Add a bit of stochasticity (non-zero eta) if you want more diverse outputs:

diffumon-sample \
  checkpoint_path=checkpoints/fashion_mnist_100epochs.pth \
  num_samples=16 \
  sampler=ddim \
  sampler.num_inference_steps=50 \
  sampler.eta=0.2 \
  output_dir=samples/fashion_mnist_ddim_eta02

Omitting sampler.num_inference_steps runs DDIM across the full training schedule.
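For the curious, a single DDIM update step looks roughly like this. This is an illustrative NumPy sketch of the standard DDIM update rule, not the repo's code; with eta=0 the step is fully deterministic:

```python
import numpy as np

def ddim_step(x_t, eps_hat, abar_t, abar_prev, eta, rng):
    """One DDIM update from timestep t to the previous (sub)step."""
    # reconstruct the predicted clean image from the predicted noise
    x0_pred = (x_t - np.sqrt(1 - abar_t) * eps_hat) / np.sqrt(abar_t)
    # eta interpolates between deterministic DDIM (0) and DDPM-like stochasticity (1)
    sigma = eta * np.sqrt((1 - abar_prev) / (1 - abar_t)) * np.sqrt(1 - abar_t / abar_prev)
    direction = np.sqrt(1 - abar_prev - sigma**2) * eps_hat  # "direction pointing to x_t"
    noise = sigma * rng.standard_normal(x_t.shape) if eta > 0 else 0.0
    return np.sqrt(abar_prev) * x0_pred + direction + noise
```

Because the step only needs the cumulative alpha-bar values at two timesteps, DDIM can skip most of the training schedule, which is why fewer inference steps still produce coherent samples.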

Developer notes

black, ruff, isort, and pre-commit are included as dev dependencies in the virtual environment.

Installing the pre-commit hooks is strongly recommended to ensure code consistency and quality; they automatically run formatters (but not linters) before each commit.

pre-commit install

Jupyter notebooks

There are also example notebook(s) in the notebooks/ directory.

The ipykernel package is included in the uv dev dependencies. Install them with:

uv sync --dev

Inside notebooks you can switch samplers programmatically:

from diffumon.diffusion.sampler import SamplerType, p_sampler_to_images

samples = p_sampler_to_images(
    model=trained_model,
    ns=noise_schedule,
    num_samples=8,
    chw_dims=[1, 28, 28],
    sampler_type=SamplerType.DDIM,
    num_inference_steps=50,
    eta=0.0,
)

Future Goals

  • Add support for more preloaded datasets
  • Add smarter periodic checkpointing
  • Add logging
  • Improve learning rate scheduling
  • Add DDIM (Denoising Diffusion Implicit Models) support
  • Add Hydra-based configuration
  • Add Flow Matching Models
  • Make saved checkpoint loadable without CUDA
