BiliSakura/PixNerd-diffusers

Self-contained PixNerd-XL/16 checkpoints for Hugging Face diffusers. No external code repo is required: each subfolder ships its own pipeline.py, component modules, and weights.

This repo is derived from the development bundle in Visual-Generative-Foundation-Model-Collection, but inference only needs:

  • This model repo (BiliSakura/PixNerd-diffusers)
  • PyPI diffusers, torch, huggingface_hub

The repo hosts multiple checkpoints, one per subfolder. Each subfolder contains pipeline.py, model_index.json, the weights, and the component code (transformer/, scheduler/).

Available checkpoints

Subfolder | Resolution | Source checkpoint
PixNerd-XL-16-256/ | 256×256 | epoch=319-step=1600000_emainit.ckpt
PixNerd-XL-16-512/ | 512×512 | res512_ft200k_epoch=325-step=1800000_emainit.ckpt

Both checkpoints are ImageNet class-conditional PixNerd-XL/16 exports with flow-matching sampling.

Demo

PixNerd-XL-16-512 demo

Class 207 (golden retriever), 512×512, 25 steps.

ImageNet class labels

Each variant keeps an English id2label map directly in its own model_index.json (DiT-style).

  • pipe.id2label: inspect the id → English label mapping
  • pipe.labels: reverse map (English synonym → id), sorted for browsing
  • pipe.get_label_ids("golden retriever"): returns the matching class ids, e.g. [207]
  • pipe(class_labels="golden retriever", ...): string labels are resolved automatically
  • pipe(prompt="golden retriever", ...): deprecated alias for class_labels

Chinese labels are preserved in the main source repo under src/labels/id2label_cn.json for reference.
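
The label lookups listed above can be pictured with a toy example. This is a minimal sketch, not the pipeline's own code: ID2LABEL is a three-entry stand-in for the map in model_index.json, and build_label_index and resolve_class_labels are hypothetical helpers. It assumes DiT-style entries in which synonyms are comma-separated.

```python
from typing import Dict, List, Union

# Toy stand-in for the id2label map in model_index.json (assumption:
# DiT-style entries with comma-separated English synonyms per class id).
ID2LABEL: Dict[int, str] = {
    0: "tench, Tinca tinca",
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
}


def build_label_index(id2label: Dict[int, str]) -> Dict[str, int]:
    """Build a sorted reverse map: lower-cased synonym -> class id."""
    index: Dict[str, int] = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            index[name.strip().lower()] = class_id
    return dict(sorted(index.items()))


def resolve_class_labels(
    labels: Union[int, str, List[Union[int, str]]],
    index: Dict[str, int],
) -> List[int]:
    """Accept an int, a string, or a mixed list and return integer class ids."""
    if isinstance(labels, (int, str)):
        labels = [labels]
    ids: List[int] = []
    for label in labels:
        if isinstance(label, int):
            ids.append(label)  # already a class id
            continue
        key = label.strip().lower()
        if key not in index:
            raise ValueError(f"Unknown label: {label!r}")
        ids.append(index[key])
    return ids


LABEL_INDEX = build_label_index(ID2LABEL)
print(resolve_class_labels(["golden retriever", 1], LABEL_INDEX))  # [207, 1]
```

The real pipeline may normalise or disambiguate synonyms differently; pipe.labels and pipe.get_label_ids remain the authoritative lookups.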

Load from Hugging Face

import torch
from diffusers import DiffusionPipeline

variant = "PixNerd-XL-16-256"  # or PixNerd-XL-16-512
resolution = 256 if variant.endswith("256") else 512

pipe = DiffusionPipeline.from_pretrained(
    f"BiliSakura/PixNerd-diffusers/{variant}",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Scheduler defaults: timeshift=3.0, order=2 (see scheduler/scheduler_config.json)

images = pipe(
    class_labels="golden retriever",
    height=resolution,
    width=resolution,
    num_inference_steps=25,
    guidance_scale=4.0,
).images

print(pipe.id2label[207])          # "golden retriever"
pipe.get_label_ids("golden retriever")  # [207]
images = pipe(class_labels="golden retriever", height=resolution, width=resolution).images
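
Some diffusers versions reject a model id that contains more than one slash. If the three-part path above raises an error, one workaround is to download only the wanted subfolder with huggingface_hub and load it from the local snapshot. The sketch below assumes the subfolder loads the same way as the local-clone example; variant_patterns and load_variant are hypothetical helper names, not part of this repo.

```python
from pathlib import Path
from typing import List


def variant_patterns(variant: str) -> List[str]:
    """Glob patterns that select a single checkpoint subfolder."""
    return [f"{variant}/*"]


def load_variant(variant: str, repo_id: str = "BiliSakura/PixNerd-diffusers"):
    """Download one subfolder and load the pipeline from the local snapshot."""
    # Imported lazily so variant_patterns can be used without these packages.
    import torch
    from diffusers import DiffusionPipeline
    from huggingface_hub import snapshot_download

    # Fetch only the files under the chosen subfolder.
    snapshot_dir = snapshot_download(
        repo_id=repo_id,
        allow_patterns=variant_patterns(variant),
    )
    return DiffusionPipeline.from_pretrained(
        str(Path(snapshot_dir) / variant),
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )


# Usage (needs network access and a CUDA device):
#   pipe = load_variant("PixNerd-XL-16-256").to("cuda")
```

Restricting the download with allow_patterns avoids pulling the weights of the other checkpoint.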

Load from a local clone

import torch
from diffusers import DiffusionPipeline

repo = "models/BiliSakura/PixNerd-diffusers"
variant = "PixNerd-XL-16-256"

pipe = DiffusionPipeline.from_pretrained(
    f"{repo}/{variant}",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

images = pipe(class_labels="golden retriever", height=256, width=256).images

Repo layout

BiliSakura/PixNerd-diffusers/
├── README.md
├── PixNerd-XL-16-256/
│   ├── README.md
│   ├── pipeline.py
│   ├── model_index.json
│   ├── conversion_metadata.json
│   ├── transformer/
│   └── scheduler/
└── PixNerd-XL-16-512/
    ├── README.md
    ├── pipeline.py
    ├── model_index.json
    ├── conversion_metadata.json
    ├── transformer/
    └── scheduler/
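
Before loading from a local clone, it can help to confirm that a checkpoint subfolder is complete. The sketch below checks only for the entries named in the layout above. missing_entries is a hypothetical helper, and the check does not inspect the contents of transformer/ or scheduler/.

```python
from pathlib import Path
from typing import List

# Entries the layout lists for each checkpoint subfolder (README.md omitted,
# on the assumption that it is not needed for inference).
EXPECTED_ENTRIES = (
    "pipeline.py",
    "model_index.json",
    "conversion_metadata.json",
    "transformer",
    "scheduler",
)


def missing_entries(variant_dir: str) -> List[str]:
    """Return the expected files or folders that are absent from variant_dir."""
    root = Path(variant_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


# Usage with the local-clone path from this README:
#   missing = missing_entries("models/BiliSakura/PixNerd-diffusers/PixNerd-XL-16-256")
#   if missing:
#       raise FileNotFoundError(f"Incomplete checkpoint, missing: {missing}")
```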

Interface notes

  • The pipeline uses class_labels for ImageNet class conditioning (prompt remains a deprecated alias).
  • Pass integer ImageNet ids (class_labels=207) or human-readable synonyms (class_labels="golden retriever").
  • height and width should match the checkpoint's native resolution (256 or 512); other sizes work if both are divisible by the patch size (16).
  • Architecture and conversion provenance are recorded in each checkpoint's conversion_metadata.json.
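
The divisibility rule for custom sizes can be checked before calling the pipeline. This is a minimal sketch under the stated patch size of 16. check_size and round_to_patch are hypothetical helpers, and the pipeline may perform its own validation.

```python
from typing import Tuple

PATCH_SIZE = 16  # patch size stated in the interface notes


def check_size(height: int, width: int, patch_size: int = PATCH_SIZE) -> Tuple[int, int]:
    """Validate a resolution and return the patch grid as (rows, cols)."""
    for name, value in (("height", height), ("width", width)):
        if value <= 0 or value % patch_size != 0:
            raise ValueError(
                f"{name}={value} must be a positive multiple of {patch_size}"
            )
    return height // patch_size, width // patch_size


def round_to_patch(value: int, patch_size: int = PATCH_SIZE) -> int:
    """Round a size to the nearest multiple of patch_size (at least one patch)."""
    return max(patch_size, round(value / patch_size) * patch_size)


print(check_size(512, 384))   # (32, 24)
print(round_to_patch(500))    # 496
```

A 256×256 request corresponds to a 16×16 patch grid and a 512×512 request to a 32×32 grid.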

Limitations

  • Intended for ImageNet class-conditional generation.
  • No text encoder is included.
  • Output quality depends on scheduler settings and inference step count.

Citation

Source paper (ICLR 2026): PixNerd: Pixel Neural Field Diffusion (arXiv:2507.23268)

Source code:

@article{2507.23268,
  Author = {Shuai Wang and Ziteng Gao and Chenhui Zhu and Weilin Huang and Limin Wang},
  Title = {PixNerd: Pixel Neural Field Diffusion},
  Year = {2025},
  Eprint = {arXiv:2507.23268},
}