Add Latte: Latent Diffusion Transformer for Video Generation #7223

@kabachuha

Model/Pipeline/Scheduler description

Latte is a text-to-video diffusion transformer (similar to Sora) that improves on the DiT and PixArt-alpha text-to-image models.

The implementation is already based on diffusers (see latte_t2v.py), so adding it here should be straightforward.

Open source status

  • The model implementation is available.
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Official repo: https://github.com/Vchitect/Latte
Model on Hugging Face: https://huggingface.co/maxin-cn/Latte
Paper: https://arxiv.org/abs/2401.03048v1
Project page: https://maxin-cn.github.io/latte_project/

Labels: stale
