wan2.2-i2v-a14b-diffusers-8bit

This repository contains mixed q8/BF16 MLX-Gen saved weights for Wan-AI/Wan2.2-I2V-A14B-Diffusers. It is designed for local Apple Silicon inference with mlx-gen.

It uses the mflux/MLX saved-weight layout with MLX quantization tensors. It is not a Diffusers or Transformers from_pretrained() checkpoint.

Source Model

Original model: Wan-AI/Wan2.2-I2V-A14B-Diffusers.

This quantized derivative follows the Apache 2.0 license of the source model.

Quantization

This is a mixed q8/BF16 checkpoint:

  • q8 for quantizable Wan transformer block attention and feed-forward linears.
  • BF16 for the Wan VAE.
  • BF16 for Wan transformer conditioning/output projection linears, the UMT5 text encoder, scheduler metadata, tokenizer files, norms, convolutions, and other non-quantizable parameters.

This mixed policy is used because fully quantizing sensitive Wan A14B paths produced invalid or low-quality video in local validation.
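The policy listed above can be pictured as a per-module decision rule. The following is a minimal sketch only: the function name, the path fragments (`blocks`, `attn`, `ffn`, `vae`, `text_encoder`), and the dotted-path convention are all hypothetical and are not taken from mlx-gen, whose real module names may differ.

```python
# Hypothetical path fragments for illustration; mlx-gen's real names may differ.
QUANTIZED_BLOCK_PARTS = ("attn", "ffn")      # attention and feed-forward sublayers
BF16_COMPONENTS = ("vae", "text_encoder")    # components kept entirely in BF16


def choose_precision(path: str, is_linear: bool) -> str:
    """Return "q8" or "bf16" for one module, following the mixed policy above."""
    parts = path.split(".")
    # The VAE and the text encoder stay in BF16 regardless of layer type.
    if parts[0] in BF16_COMPONENTS:
        return "bf16"
    # Only linear layers inside transformer blocks are quantization candidates.
    inside_block = "blocks" in parts
    in_attention_or_ffn = any(part in QUANTIZED_BLOCK_PARTS for part in parts)
    if is_linear and inside_block and in_attention_or_ffn:
        return "q8"
    # Conditioning/output projections, norms, convolutions, and the rest stay BF16.
    return "bf16"


if __name__ == "__main__":
    for name, linear in [
        ("transformer.blocks.0.attn.to_q", True),
        ("transformer.blocks.0.norm1", False),
        ("transformer.proj_out", True),
        ("vae.decoder.conv_in", False),
    ]:
        print(name, "->", choose_precision(name, linear))
```

Expressing the rule as a predicate keeps the sensitive paths as an explicit default: anything not positively identified as a block attention or feed-forward linear falls back to BF16.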

Validation

Measured on 2026-06-04 with mlx-gen 0.18.9 on Apple Silicon. The upstream Diffusers source snapshot measured about 118 GiB in the local Hugging Face cache before preparing these packages. The table below reports prepared-package generation from model init through MP4 save and post-save video-health validation.

Validation profile: public spacecraft source image, 384x384, 33 frames, 12 denoising steps, guidance 3.5, guidance-2 3.5, 8 fps, seed 4242, --low-ram.

| Package | Disk | Full-Process Physical Peak | Max RSS | MLX Peak | Total Time | Video Health |
|---|---|---|---|---|---|---|
| BF16 package | 64.1 GiB | 33.7 GiB | 31.8 GiB | 28.2 GiB | 228.2 s | 33/33 frames, 384x384, 8 fps, temporal delta 10.4 |
| This mixed q8/BF16 package | 39.7 GiB | 21.5 GiB | 19.6 GiB | 15.9 GiB | 242.2 s | 33/33 frames, 384x384, 8 fps, temporal delta 10.5 |

Compared with the BF16 prepared package at the same validation profile, this mixed q8/BF16 package reduces disk usage by about 38% and full-process physical peak memory by about 36%. Total time was about 6% slower in this run.
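The percentages above follow directly from the table. As a quick check, this short sketch recomputes them; the dictionary keys and the helper name `percent_change` are illustrative, and the numbers are copied from the validation table.

```python
# Figures copied from the validation table (GiB and seconds).
bf16 = {"disk_gib": 64.1, "phys_peak_gib": 33.7, "total_s": 228.2}
mixed = {"disk_gib": 39.7, "phys_peak_gib": 21.5, "total_s": 242.2}


def percent_change(baseline: float, value: float) -> float:
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


disk_change = percent_change(bf16["disk_gib"], mixed["disk_gib"])
mem_change = percent_change(bf16["phys_peak_gib"], mixed["phys_peak_gib"])
time_change = percent_change(bf16["total_s"], mixed["total_s"])

# Negative values are reductions; positive values are increases.
print(f"disk: {disk_change:+.1f}%")
print(f"physical peak: {mem_change:+.1f}%")
print(f"total time: {time_change:+.1f}%")
```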

Physical peak is Darwin ri_phys_footprint sampled for the full process. The validation is intentionally small and repeatable; it is not a claim that every full-size 1280x720, 81-frame, 40-step job has the same memory or timing profile.

Usage

The included public sample image is available at examples/i2v_takeoff_source.png when this repository is cloned locally. For best I2V stability, use an input image whose aspect ratio matches the requested video dimensions and keep the subject inside the frame.

python -m pip install -U mlx-gen

mlxgen download --model AbstractFramework/wan2.2-i2v-a14b-diffusers-8bit

mlxgen generate \
  --model AbstractFramework/wan2.2-i2v-a14b-diffusers-8bit \
  --task image-to-video \
  --image path/to/input.png \
  --prompt "Cinematic image-to-video of the spacecraft lifting off from a snowy landing field, engines glowing, exhaust plume expanding, the full craft remains centered in frame." \
  --width 384 \
  --height 384 \
  --frames 33 \
  --steps 12 \
  --guidance 3.5 \
  --guidance-2 3.5 \
  --fps 8 \
  --seed 4242 \
  --low-ram \
  --metadata \
  --output video.mp4
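Before starting a job like the one above, the aspect-ratio advice from this section can be checked with a few lines of Python. This is a sketch, not part of mlx-gen: the helper name `aspect_matches` and the 1% tolerance are assumptions chosen for illustration.

```python
def aspect_matches(image_size, video_size, tolerance=0.01):
    """Return True when two (width, height) pairs share an aspect ratio.

    The comparison is relative: the ratios may differ by at most
    `tolerance` (1% by default, an arbitrary illustrative threshold).
    """
    image_width, image_height = image_size
    video_width, video_height = video_size
    if min(image_width, image_height, video_width, video_height) <= 0:
        raise ValueError("all dimensions must be positive")
    image_ratio = image_width / image_height
    video_ratio = video_width / video_height
    return abs(image_ratio - video_ratio) <= tolerance * video_ratio


if __name__ == "__main__":
    # A square source image suits the 384x384 validation profile.
    print(aspect_matches((1024, 1024), (384, 384)))
    # A 16:9 source image does not; crop or pad it, or request 16:9 output.
    print(aspect_matches((1920, 1080), (384, 384)))
```

If Pillow is installed, `Image.open(path).size` returns the `(width, height)` pair to pass as `image_size`.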

Compatibility

Requires mlx-gen >= 0.18.9.
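The minimum version can also be verified from Python before a long run. The sketch below uses only the standard library and assumes the installed distribution is named `mlx-gen`, as in the pip command above; the helper names are illustrative, and the parser is deliberately simplified (it ignores pre-release suffixes such as `rc1`).

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (0, 18, 9)  # from "Requires mlx-gen >= 0.18.9"


def parse_release(text):
    """Parse the leading dotted numbers of a version: "0.18.9" -> (0, 18, 9)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum=MIN_VERSION):
    """Compare release tuples; suffixes like "rc1" are ignored (simplification)."""
    return parse_release(installed) >= minimum


def check_installed(dist_name="mlx-gen"):
    """Return True/False for the installed version, or None if not installed."""
    try:
        return meets_minimum(version(dist_name))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("mlx-gen meets minimum:", check_installed())
```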

Generated with mlx-gen 0.18.9.

Use the mlxgen command and Python import path for new MLX-Gen projects.

Attribution

MLX-Gen is based on mflux by Filip Strand and the original mflux contributors.

Quantized and contributed by @lpalbou.
