How to use from the Diffusers library
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("linyq/kiwi-edit-5b-instruct-reference-diffusers", dtype=torch.bfloat16, device_map="cuda")
pipe.to("cuda")

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
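
The comment in the snippet above says to switch to "mps" on Apple devices. A minimal sketch of making that choice automatically follows. The helper name `pick_device` and the CPU fallback are illustrative assumptions, not part of the model card, and whether this model runs acceptably on "mps" or "cpu" is not confirmed here.

```python
def pick_device(cuda_available=None, mps_available=None):
    """Return "cuda", "mps", or "cpu", preferring them in that order.

    Hypothetical helper: pass explicit booleans to override detection,
    or leave them as None to query PyTorch.
    """
    if cuda_available is None or mps_available is None:
        # Imported lazily so the selection logic can be exercised without PyTorch.
        import torch

        if cuda_available is None:
            cuda_available = torch.cuda.is_available()
        if mps_available is None:
            mps_available = torch.backends.mps.is_available()

    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

The returned string can then be used wherever the snippet hard-codes "cuda", for example `device = pick_device()` followed by `pipe.to(device)`.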

Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance

Kiwi-Edit is a versatile video editing framework built on an MLLM encoder and a video Diffusion Transformer (DiT). It supports both instruction-based video editing and reference-guided editing (using a reference image and instruction).

Model Description

Kiwi-Edit introduces a unified editing architecture that combines learnable queries with latent visual features to provide semantic guidance from a reference image. To address the difficulty of precise visual control in instruction-based editing, it lets users supply a reference image that guides the transformation. According to the authors, a scalable data generation pipeline and a multi-stage training curriculum improve both instruction following and reference fidelity.

Usage

This model is compatible with the diffusers library. To run inference, follow the installation instructions in the official repository.

Quick Test with Diffusers

You can run a quick test on a demo video using the following command provided in the repository:

python diffusers_demo.py \
    --video_path ./demo_data/video/source/0005e4ad9f49814db1d3f2296b911abf.mp4 \
    --prompt "Remove the monkey." \
    --save_path output.mp4 \
    --model_path linyq/kiwi-edit-5b-instruct-only-diffusers
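
The same command can be launched from Python, which is convenient when editing several videos in a loop. The sketch below only assembles and runs the command line shown above. It assumes it is executed from the repository root, where `diffusers_demo.py` and the demo data live. The function names `build_demo_command` and `run_demo` are hypothetical, and the only flags used are the four that appear in the command.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_demo_command(video_path, prompt, save_path, model_path,
                       script="diffusers_demo.py"):
    """Return the argument list for the repository's demo script.

    Only the flags shown in the model card's command are used.
    """
    return [
        sys.executable, script,
        "--video_path", str(video_path),
        "--prompt", prompt,
        "--save_path", str(save_path),
        "--model_path", model_path,
    ]


def run_demo(video_path, prompt, save_path, model_path):
    """Run the demo script and return the path of the saved video."""
    if not Path(video_path).is_file():
        raise FileNotFoundError(f"Source video not found: {video_path}")
    cmd = build_demo_command(video_path, prompt, save_path, model_path)
    print("Running:", shlex.join(cmd))
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(cmd, check=True)
    return Path(save_path)


if __name__ == "__main__":
    run_demo(
        video_path="./demo_data/video/source/0005e4ad9f49814db1d3f2296b911abf.mp4",
        prompt="Remove the monkey.",
        save_path="output.mp4",
        model_path="linyq/kiwi-edit-5b-instruct-only-diffusers",
    )
```

Passing the arguments as a list avoids shell quoting problems when a prompt contains spaces or punctuation.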

Citation

If you find this work useful, please cite:

@misc{kiwiedit,
      title={Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance}, 
      author={Yiqi Lin and Guoqiang Liang and Ziyun Zeng and Zechen Bai and Yanzhe Chen and Mike Zheng Shou},
      year={2026},
      eprint={2603.02175},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.02175}, 
}