BiliSakura/PixNerd-diffusers
Self-contained PixNerd-XL/16 checkpoints for Hugging Face diffusers. No external code repo is required: each subfolder ships its own pipeline.py, component modules, and weights.
This repo is derived from the development bundle in Visual-Generative-Foundation-Model-Collection, but inference only needs:
- This model repo (BiliSakura/PixNerd-diffusers)
- The PyPI packages diffusers, torch, and huggingface_hub
This Hugging Face repo hosts multiple self-contained checkpoints as subfolders. Each subfolder includes its own pipeline.py, model_index.json, weights, and component code (transformer/, scheduler/).
Available checkpoints
| Subfolder | Resolution | Source checkpoint |
|---|---|---|
| PixNerd-XL-16-256/ | 256×256 | epoch%3D319-step%3D1600000_emainit.ckpt |
| PixNerd-XL-16-512/ | 512×512 | res512_ft200k_epoch%3D325-step%3D1800000_emainit.ckpt |
Both checkpoints are ImageNet class-conditional PixNerd-XL/16 exports with flow-matching sampling.
Demo
Class 207 (golden retriever), 512×512, 25 steps.
ImageNet class labels
Each variant keeps an English id2label map directly in its own model_index.json (DiT-style).
- pipe.id2label: inspect the id → English label correspondence
- pipe.labels: reverse maps (English synonym → id), sorted for browsing
- pipe.get_label_ids("golden retriever")
- pipe(class_labels="golden retriever", ...): string labels are resolved automatically
- pipe(prompt="golden retriever", ...): deprecated alias for class_labels
Chinese labels are preserved in the main source repo under src/labels/id2label_cn.json for reference.
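As a rough illustration of how a string label might resolve to an id, the sketch below builds a reverse map from a DiT-style id2label dictionary. The two-entry dictionary and the helper names build_reverse_map and resolve are hypothetical, and the assumption that synonyms are comma-separated follows the DiT convention. The shipped pipeline.py may do this differently.

```python
# Illustrative stand-in for the id2label map stored in model_index.json.
# Assumption: synonyms for one class are comma-separated, as in DiT.
ID2LABEL = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
}


def build_reverse_map(id2label):
    """Map each synonym to its class id, with keys sorted for browsing."""
    reverse = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            reverse[name.strip()] = int(class_id)
    return dict(sorted(reverse.items()))


def resolve(label, reverse):
    """Accept an integer id or an English synonym and return the class id."""
    if isinstance(label, int):
        return label
    if label not in reverse:
        raise ValueError(f"unknown ImageNet label: {label!r}")
    return reverse[label]


LABELS = build_reverse_map(ID2LABEL)
print(resolve("golden retriever", LABELS))  # 207
print(resolve(207, LABELS))                 # 207
```

Keeping the reverse map sorted mirrors the "sorted for browsing" behaviour described for pipe.labels; the lookup itself does not depend on the ordering.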
Load from Hugging Face
import torch
from diffusers import DiffusionPipeline

variant = "PixNerd-XL-16-256"  # or PixNerd-XL-16-512
resolution = 256 if variant.endswith("256") else 512

pipe = DiffusionPipeline.from_pretrained(
    f"BiliSakura/PixNerd-diffusers/{variant}",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Scheduler defaults: timeshift=3.0, order=2 (see scheduler/scheduler_config.json)
images = pipe(
    class_labels="golden retriever",
    height=resolution,
    width=resolution,
    num_inference_steps=25,
    guidance_scale=4.0,
).images

print(pipe.id2label[207])  # "golden retriever"
pipe.get_label_ids("golden retriever")  # [207]
images = pipe(class_labels="golden retriever", height=resolution, width=resolution).images
Load from a local clone
import torch
from diffusers import DiffusionPipeline

repo = "models/BiliSakura/PixNerd-diffusers"
variant = "PixNerd-XL-16-256"

pipe = DiffusionPipeline.from_pretrained(
    f"{repo}/{variant}",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

images = pipe(class_labels="golden retriever", height=256, width=256).images
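One way to obtain a local copy of a single variant, without cloning both checkpoints, is huggingface_hub's snapshot_download with allow_patterns. The sketch below is an assumption about a convenient workflow, not part of the shipped repo; the helper names variant_patterns and download_variant are hypothetical.

```python
import os


def variant_patterns(variant):
    """Glob patterns that select every file under one checkpoint subfolder."""
    return [f"{variant}/*"]


def download_variant(variant, repo_id="BiliSakura/PixNerd-diffusers"):
    """Download one subfolder and return its local path (needs network access)."""
    # Imported lazily so variant_patterns stays usable without huggingface_hub.
    from huggingface_hub import snapshot_download

    snapshot_dir = snapshot_download(
        repo_id=repo_id,
        allow_patterns=variant_patterns(variant),
    )
    return os.path.join(snapshot_dir, variant)


# Example (not run here because it downloads the weights):
# local_path = download_variant("PixNerd-XL-16-256")
# pipe = DiffusionPipeline.from_pretrained(local_path, trust_remote_code=True)
```

The returned path points at the variant folder inside the hub cache, so it can be passed to from_pretrained in the same way as the local clone path above.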
Repo layout
BiliSakura/PixNerd-diffusers/
├── README.md
├── PixNerd-XL-16-256/
│   ├── README.md
│   ├── pipeline.py
│   ├── model_index.json
│   ├── conversion_metadata.json
│   ├── transformer/
│   └── scheduler/
└── PixNerd-XL-16-512/
    ├── README.md
    ├── pipeline.py
    ├── model_index.json
    ├── conversion_metadata.json
    ├── transformer/
    └── scheduler/
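Before loading from a local copy, it can help to confirm that a variant folder contains the entries shown in the layout above. The following is a minimal sketch; the helper name missing_entries and the choice of which entries to require are assumptions, not something the pipeline itself enforces.

```python
import tempfile
from pathlib import Path

# Entries taken from the repo layout; a trailing "/" marks a directory.
EXPECTED = ["pipeline.py", "model_index.json", "transformer/", "scheduler/"]


def missing_entries(variant_dir):
    """Return the expected entries that are absent from a variant folder."""
    root = Path(variant_dir)
    missing = []
    for entry in EXPECTED:
        path = root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


# Demonstration on a deliberately incomplete temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pipeline.py").write_text("")
    (root / "transformer").mkdir()
    print(missing_entries(root))  # ['model_index.json', 'scheduler/']
```

An empty list means the folder has the files and directories that the layout lists for a self-contained checkpoint.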
Interface notes
- The pipeline uses class_labels for ImageNet class conditioning (prompt remains a deprecated alias).
- Pass integer ImageNet ids (class_labels=207) or human-readable synonyms (class_labels="golden retriever").
- height and width should match the checkpoint's intended resolution (256 or 512), but custom sizes work if both are divisible by the patch size (16).
- Architecture and conversion provenance are recorded in each checkpoint's conversion_metadata.json.
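The divisibility rule for custom sizes can be checked before calling the pipeline. The sketch below is illustrative: token_grid is a hypothetical helper, and whether the pipeline performs an equivalent check internally is not stated here.

```python
PATCH_SIZE = 16  # patch size stated in the interface notes


def token_grid(height, width, patch_size=PATCH_SIZE):
    """Return (rows, cols) of patches, or raise if a side is not divisible."""
    if height % patch_size or width % patch_size:
        raise ValueError(
            f"height={height} and width={width} must be multiples of {patch_size}"
        )
    return height // patch_size, width // patch_size


print(token_grid(256, 256))  # (16, 16)
print(token_grid(512, 512))  # (32, 32)
```

A size such as 384 by 640 passes the check, while 250 by 256 raises a ValueError because 250 is not a multiple of 16.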
Limitations
- Intended for ImageNet class-conditional generation.
- No text encoder is included.
- Output quality depends on scheduler settings and inference step count.
Citation
Source paper (ICLR 2026): PixNerd: Pixel Neural Field Diffusion (arXiv:2507.23268)
Source code:
- Original PixNerd codebase: MCG-NJU/PixNerd
- Diffusers conversion code used for this export: Bili-Sakura/PixNerd-diffusers
@article{2507.23268,
  Author = {Shuai Wang and Ziteng Gao and Chenhui Zhu and Weilin Huang and Limin Wang},
  Title = {PixNerd: Pixel Neural Field Diffusion},
  Year = {2025},
  Eprint = {arXiv:2507.23268},
}
