Skip to content

PennLINC/ModelArrayIO

Repository files navigation

ModelArrayIO

Latest Version PyPI - Python Version License Documentation Status GitHub Actions: Tox Codecov Code style: ruff

ModelArrayIO is a Python package that converts between neuroimaging formats (fixel .mif, voxel NIfTI, CIFTI-2 dscalar) and the HDF5 (.h5) layout used by the R package ModelArray. It can also write ModelArray statistical results back to imaging formats.

Relationship to ConFixel: The earlier project ConFixel is superseded by ModelArrayIO. The ConFixel repository is retained for history (including links from publications) and will be archived; new work should use this repository.

Documentation for installation and usage: ModelArrayIO on GitHub (this README). For conda, HDF5 libraries, and installing the ModelArray R package, see the ModelArray vignette Installation.

Overview

ModelArrayIO provides three converter areas, each with import and export commands:

Once ModelArrayIO is installed, these commands are available in your terminal:

  • Fixel-wise data (MRtrix .mif):
    • .mif.h5: modelarrayio mif-to-h5
    • .h5.mif: modelarrayio h5-to-mif
  • Voxel-wise data (NIfTI):
    • NIfTI → .h5: modelarrayio nifti-to-h5
    • .h5 → NIfTI: modelarrayio h5-to-nifti
  • Greyordinate-wise data (CIFTI-2):
    • CIFTI-2 → .h5: modelarrayio cifti-to-h5
    • .h5 → CIFTI-2: modelarrayio h5-to-cifti

Storage backends: HDF5 and TileDB

ModelArrayIO supports two on-disk backends for the subject-by-element matrix:

  • HDF5 (default), implemented in modelarrayio/h5_storage.py
  • TileDB, implemented in modelarrayio/tiledb_storage.py

Both backends expose a similar API:

  • create a dense 2D array (subjects, items) and write all values at once
  • create an empty array with the same shape and write by column stripes
  • write/read column names alongside the data

Notes and minor differences:

  • Chunking vs tiling: HDF5 uses chunks; TileDB uses tiles. We compute tile sizes analogous to chunk sizes to keep write/read patterns similar.
  • Compression: HDF5 uses gzip by default; TileDB defaults to zstd with shuffle for better speed/ratio. You can switch to gzip for parity.
  • Metadata: HDF5 stores column_names as a dataset attribute; TileDB stores names as JSON metadata on the array/group.
  • Layout: Both backends keep dimensions in the same order and use zero-based indices.

About

Companion converter software for ModelArray for converting data back and forth from the HDF5 file format.

Resources

License

Stars

Watchers

Forks

Languages