wan2.2-i2v-a14b-diffusers-8bit

This repository contains mixed q8/BF16 MLX-Gen saved weights for Wan-AI/Wan2.2-I2V-A14B-Diffusers. It is designed for local Apple Silicon inference with mlx-gen.

The weights use the mflux/MLX saved-weight layout with MLX quantization tensors, so they cannot be loaded with the Diffusers or Transformers from_pretrained() methods.

Source Model

Original model: Wan-AI/Wan2.2-I2V-A14B-Diffusers.

This quantized derivative follows the Apache 2.0 license of the source model.

Quantization

This is a mixed q8/BF16 checkpoint:

q8 for quantizable Wan transformer block attention and feed-forward linears.
BF16 for the Wan VAE.
BF16 for Wan transformer conditioning/output projection linears, the UMT5 text encoder, norms, convolutions, and other non-quantizable parameters. Scheduler metadata and tokenizer files are not weight tensors and are included unquantized.

This mixed policy is used because fully quantizing sensitive Wan A14B paths produced invalid or low-quality video in local validation.
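
The policy above can be sketched as a predicate over layer paths. This is only an illustration of the rule, not the mlx-gen implementation: the function name, the layer-kind labels, and every path fragment below (such as attn, ffn, condition_embedder, and proj_out) are hypothetical placeholders, and the real parameter names in the checkpoint may differ.

```python
# Hypothetical path fragments, chosen for illustration only.
QUANTIZABLE_FRAGMENTS = ("attn", "ffn")
SENSITIVE_FRAGMENTS = ("condition_embedder", "proj_out")


def choose_precision(path: str, layer_kind: str) -> str:
    """Return "q8" or "bf16" for one layer under the mixed policy."""
    # Norms, convolutions, and other non-linear layers are never quantized.
    if layer_kind != "linear":
        return "bf16"
    # The VAE and the text encoder stay in BF16 as a whole.
    if path.startswith(("vae.", "text_encoder.")):
        return "bf16"
    # Conditioning and output projections are kept in BF16.
    if any(fragment in path for fragment in SENSITIVE_FRAGMENTS):
        return "bf16"
    # Attention and feed-forward linears inside transformer blocks become q8.
    if ".blocks." in path and any(f in path for f in QUANTIZABLE_FRAGMENTS):
        return "q8"
    # Anything not explicitly matched falls back to BF16.
    return "bf16"
```

Defaulting to BF16 for anything unmatched mirrors the conservative intent of the policy: a layer is quantized only when it is explicitly known to tolerate it.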

Validation

Measured on 2026-06-04 with mlx-gen 0.18.9 on Apple Silicon. The upstream Diffusers source snapshot measured about 118 GiB in the local Hugging Face cache before preparing these packages. The table below reports prepared-package generation from model init through MP4 save and post-save video-health validation.

Validation profile: public spacecraft source image, 384x384, 33 frames, 12 denoising steps, guidance 3.5, guidance-2 3.5, 8 fps, seed 4242, --low-ram.

Package	Disk	Full-Process Physical Peak	Max RSS	MLX Peak	Total Time	Video Health
BF16 package	64.1 GiB	33.7 GiB	31.8 GiB	28.2 GiB	228.2 s	33/33 frames, 384x384, 8 fps, temporal delta 10.4
This mixed q8/BF16 package	39.7 GiB	21.5 GiB	19.6 GiB	15.9 GiB	242.2 s	33/33 frames, 384x384, 8 fps, temporal delta 10.5

Compared with the BF16 prepared package at the same validation profile, this mixed q8/BF16 package reduces disk usage by about 38% and full-process physical peak memory by about 36%. Total time was about 6% slower in this run.
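
The percentages above can be reproduced from the table with a few lines of arithmetic. The sketch below uses only the figures reported in the table; the dictionary keys and the helper name are illustrative.

```python
def percent_change(baseline: float, value: float) -> float:
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Figures copied from the validation table.
bf16 = {"disk_gib": 64.1, "phys_peak_gib": 33.7, "total_s": 228.2}
mixed = {"disk_gib": 39.7, "phys_peak_gib": 21.5, "total_s": 242.2}

changes = {key: round(percent_change(bf16[key], mixed[key]), 1) for key in bf16}

for key, change in changes.items():
    print(f"{key}: {change:+.1f}%")
```

Negative values are reductions relative to the BF16 package; the positive value for total time is the slowdown.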

Full-Process Physical Peak is the peak Darwin ri_phys_footprint value sampled for the whole process. The validation is intentionally small and repeatable; it does not imply that a full-size 1280x720, 81-frame, 40-step job has the same memory or timing profile.

Usage

The included public sample image is available at examples/i2v_takeoff_source.png when this repository is cloned locally. For best I2V stability, use an input image whose aspect ratio matches the requested video dimensions and keep the subject inside the frame.
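
One way to match the input aspect ratio to the requested video dimensions is a centered crop. The helper below is a minimal sketch that only computes the crop box; the function name is illustrative, and it is not part of mlx-gen.

```python
def center_crop_box(src_w: int, src_h: int, target_w: int, target_h: int):
    """Return (left, top, right, bottom) for the largest centered crop
    of the source image that has the target aspect ratio."""
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = src_h * target_w // target_h
    else:
        # Source is taller than (or equal to) the target: keep full width.
        crop_w = src_w
        crop_h = src_w * target_h // target_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# A 1920x1080 source cropped for a square 384x384 video.
print(center_crop_box(1920, 1080, 384, 384))
```

The resulting tuple has the (left, upper, right, lower) layout that Pillow's Image.crop accepts. A centered crop can cut off an off-center subject, so check the cropped image before generating.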

python -m pip install -U mlx-gen

mlxgen download --model AbstractFramework/wan2.2-i2v-a14b-diffusers-8bit

mlxgen generate \
  --model AbstractFramework/wan2.2-i2v-a14b-diffusers-8bit \
  --task image-to-video \
  --image path/to/input.png \
  --prompt "Cinematic image-to-video of the spacecraft lifting off from a snowy landing field, engines glowing, exhaust plume expanding, the full craft remains centered in frame." \
  --width 384 \
  --height 384 \
  --frames 33 \
  --steps 12 \
  --guidance 3.5 \
  --guidance-2 3.5 \
  --fps 8 \
  --seed 4242 \
  --low-ram \
  --metadata \
  --output video.mp4
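
For scripted or batch runs, the same command can be assembled in Python and passed to subprocess. This is a sketch, not an mlx-gen API: the function name is illustrative, it uses only the flags shown in the command above, and its defaults copy the validation profile.

```python
import shlex

MODEL = "AbstractFramework/wan2.2-i2v-a14b-diffusers-8bit"


def build_generate_command(image, prompt, output, *, width=384, height=384,
                           frames=33, steps=12, guidance=3.5, guidance_2=3.5,
                           fps=8, seed=4242, low_ram=True):
    """Build the argv list for the mlxgen generate command shown above."""
    argv = [
        "mlxgen", "generate",
        "--model", MODEL,
        "--task", "image-to-video",
        "--image", image,
        "--prompt", prompt,
        "--width", str(width),
        "--height", str(height),
        "--frames", str(frames),
        "--steps", str(steps),
        "--guidance", str(guidance),
        "--guidance-2", str(guidance_2),
        "--fps", str(fps),
        "--seed", str(seed),
    ]
    if low_ram:
        argv.append("--low-ram")
    argv += ["--metadata", "--output", output]
    return argv


argv = build_generate_command("path/to/input.png", "A spacecraft lifting off.", "video.mp4")
# Print a shell-safe version of the command for inspection.
print(shlex.join(argv))
```

To run the command, pass the list to subprocess.run(argv, check=True); passing a list rather than a string avoids shell-quoting problems in the prompt.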

Compatibility

Requires mlx-gen >= 0.18.9.
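
The version requirement can be checked before a long download. The sketch below reads the installed version with the standard-library importlib.metadata module, assuming the distribution is registered under the name used in the pip command above (mlx-gen). The parser is deliberately simplified: it compares only the leading numeric release components and ignores pre-release suffixes.

```python
from importlib import metadata

MIN_VERSION = (0, 18, 9)


def parse_version(text):
    """Parse leading numeric components, e.g. "0.18.9" -> (0, 18, 9)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch not in "0123456789":
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_VERSION):
    """True if the installed version string is at least the minimum."""
    version = parse_version(installed)
    width = max(len(version), len(minimum))
    # Pad with zeros so "0.19" compares as (0, 19, 0).
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


def installed_mlx_gen_version():
    """Return the installed mlx-gen version string, or None if absent."""
    try:
        return metadata.version("mlx-gen")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_mlx_gen_version()
    if found is None:
        print("mlx-gen is not installed")
    else:
        print(found, "OK" if meets_minimum(found) else "too old")
```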

Generated with mlx-gen 0.18.9.

Use the mlxgen command and Python import path for new MLX-Gen projects.

Attribution

MLX-Gen is based on mflux by Filip Strand and the original mflux contributors.

Quantized and contributed by @lpalbou.
