BiliSakura/PixNerd-diffusers

Self-contained PixNerd-XL/16 checkpoints for Hugging Face diffusers. No external code repo is required — each subfolder ships its own pipeline.py, component modules, and weights.

This repo is derived from the development bundle in Visual-Generative-Foundation-Model-Collection, but inference only needs:

This model repo (BiliSakura/PixNerd-diffusers)
PyPI diffusers, torch, huggingface_hub

The checkpoints live side by side as subfolders of this one Hugging Face repo. Besides pipeline.py and the weights, each subfolder carries its own model_index.json and component code (transformer/, scheduler/).

Available checkpoints

Subfolder	Resolution	Source checkpoint
`PixNerd-XL-16-256/`	256×256	`epoch=319-step=1600000_emainit.ckpt`
`PixNerd-XL-16-512/`	512×512	`res512_ft200k_epoch=325-step=1800000_emainit.ckpt`

Both checkpoints are ImageNet class-conditional PixNerd-XL/16 exports with flow-matching sampling.
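For readers new to flow-matching sampling, the sketch below integrates a velocity field from noise (t = 0) to data (t = 1) with plain Euler steps and applies classifier-free guidance at every step. It is an illustration only: the toy velocity functions, the time convention, and the first-order solver are assumptions made for this example, and the scheduler shipped with each checkpoint may differ.

```python
from typing import Callable, List

Vector = List[float]
Velocity = Callable[[Vector, float], Vector]


def guided_velocity(v_cond: Vector, v_uncond: Vector, guidance_scale: float) -> Vector:
    """Classifier-free guidance: extrapolate away from the unconditional velocity."""
    return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(x: Vector, cond_fn: Velocity, uncond_fn: Velocity,
                 num_steps: int = 25, guidance_scale: float = 1.0) -> Vector:
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with uniform Euler steps."""
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = guided_velocity(cond_fn(x, t), uncond_fn(x, t), guidance_scale)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical stand-ins for the network: straight-line paths whose conditional
# branch heads to TARGET and whose unconditional branch heads to the origin.
TARGET = [1.0, -2.0]


def toy_cond(x: Vector, t: float) -> Vector:
    return [(m - xi) / (1.0 - t) for m, xi in zip(TARGET, x)]


def toy_uncond(x: Vector, t: float) -> Vector:
    return [(0.0 - xi) / (1.0 - t) for xi in x]


# With guidance_scale=1.0 the guided velocity equals the conditional one,
# so the trajectory ends at TARGET (up to floating-point error).
sample = euler_sample([0.3, 0.7], toy_cond, toy_uncond, num_steps=25, guidance_scale=1.0)
```

In the real pipeline the two velocity calls would be one batched forward pass of the transformer, with and without the class label.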

Demo

Class 207 — golden retriever, 512×512, 25 steps.

ImageNet class labels

Each variant keeps an English id2label map directly in its own model_index.json (DiT-style).

pipe.id2label — inspect id → English label correspondence
pipe.labels — reverse maps (English synonym → id), sorted for browsing
pipe.get_label_ids("golden retriever")
pipe(class_labels="golden retriever", ...) — string labels resolved automatically
pipe(prompt="golden retriever", ...) — deprecated alias for class_labels

Chinese labels are preserved in the main source repo under src/labels/id2label_cn.json for reference.
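The label helpers listed above can be pictured with a small standalone sketch. It assumes DiT-style entries where one id maps to a comma-separated list of English synonyms; ID2LABEL is a three-entry excerpt for illustration, and build_label_index and resolve_labels are hypothetical names, not the pipeline's own functions.

```python
from typing import Dict, List, Sequence, Union

# Tiny illustrative excerpt; the full map lives in each model_index.json.
ID2LABEL: Dict[int, str] = {
    1: "goldfish, Carassius auratus",
    207: "golden retriever",
    208: "Labrador retriever",
}


def build_label_index(id2label: Dict[int, str]) -> Dict[str, int]:
    """Map every comma-separated synonym to its class id, sorted by name."""
    index: Dict[str, int] = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            index[name.strip().lower()] = int(class_id)
    return dict(sorted(index.items()))


def resolve_labels(labels: Union[int, str, Sequence[Union[int, str]]],
                   index: Dict[str, int]) -> List[int]:
    """Accept ids or names (single value or sequence) and return integer ids."""
    if isinstance(labels, (int, str)):
        labels = [labels]
    ids: List[int] = []
    for label in labels:
        if isinstance(label, int):
            ids.append(label)
            continue
        key = label.strip().lower()
        if key not in index:
            raise ValueError(f"Unknown class label: {label!r}")
        ids.append(index[key])
    return ids


LABELS = build_label_index(ID2LABEL)
golden = resolve_labels("golden retriever", LABELS)
mixed = resolve_labels(["Goldfish", 208], LABELS)
```

Matching here is case-insensitive and exact; whether the shipped pipeline normalizes names the same way is not stated in this card.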

Load from Hugging Face

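import torch
from diffusers import DiffusionPipeline
from huggingface_hub import snapshot_download

variant = "PixNerd-XL-16-256"  # or PixNerd-XL-16-512
resolution = 256 if variant.endswith("256") else 512

# from_pretrained rejects ids with more than one "/", so fetch only the
# chosen subfolder and load it from the local snapshot path.
snapshot = snapshot_download(
    "BiliSakura/PixNerd-diffusers",
    allow_patterns=[f"{variant}/*"],
)

pipe = DiffusionPipeline.from_pretrained(
    f"{snapshot}/{variant}",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")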

# Scheduler defaults: timeshift=3.0, order=2 (see scheduler/scheduler_config.json)

images = pipe(
    class_labels="golden retriever",
    height=resolution,
    width=resolution,
    num_inference_steps=25,
    guidance_scale=4.0,
).images

print(pipe.id2label[207])          # "golden retriever"
pipe.get_label_ids("golden retriever")  # [207]
images = pipe(class_labels="golden retriever", height=resolution, width=resolution).images

Load from a local clone

import torch
from diffusers import DiffusionPipeline

repo = "models/BiliSakura/PixNerd-diffusers"
variant = "PixNerd-XL-16-256"

pipe = DiffusionPipeline.from_pretrained(
    f"{repo}/{variant}",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

images = pipe(class_labels="golden retriever", height=256, width=256).images

Repo layout

BiliSakura/PixNerd-diffusers/
├── README.md
├── PixNerd-XL-16-256/
│   ├── README.md
│   ├── pipeline.py
│   ├── model_index.json
│   ├── conversion_metadata.json
│   ├── transformer/
│   └── scheduler/
└── PixNerd-XL-16-512/
    ├── README.md
    ├── pipeline.py
    ├── model_index.json
    ├── conversion_metadata.json
    ├── transformer/
    └── scheduler/
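Before loading a local clone, it can help to confirm that a variant folder matches the layout above. The following is a minimal sketch; missing_entries is a hypothetical helper name, and treating README.md as optional is an assumption.

```python
from pathlib import Path
from typing import List, Union

# Entries taken from the layout above; README.md is treated as optional here.
REQUIRED_FILES = ("pipeline.py", "model_index.json", "conversion_metadata.json")
REQUIRED_DIRS = ("transformer", "scheduler")


def missing_entries(variant_dir: Union[str, Path]) -> List[str]:
    """Return the expected files and folders that are absent from variant_dir."""
    root = Path(variant_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list from missing_entries("models/BiliSakura/PixNerd-diffusers/PixNerd-XL-16-256") means the folder has every entry the layout lists; it does not verify the weights themselves.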

Interface notes

The pipeline uses class_labels for ImageNet class conditioning (prompt remains a deprecated alias).
Pass integer ImageNet ids (class_labels=207) or human-readable synonyms (class_labels="golden retriever").
height and width should match the resolution the checkpoint was exported for (256 or 512); other sizes work as long as both values are divisible by the patch size (16).
Architecture and conversion provenance are recorded in each checkpoint's conversion_metadata.json.
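The size rule in the notes above can be checked before calling the pipeline. This is a minimal sketch: patch_grid is a hypothetical helper, and the only fact taken from this card is the patch size of 16.

```python
from typing import Tuple

PATCH_SIZE = 16  # patch size stated in the interface notes


def patch_grid(height: int, width: int, patch_size: int = PATCH_SIZE) -> Tuple[int, int]:
    """Return (rows, cols) of patches, or raise if the size is not usable."""
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    if height % patch_size or width % patch_size:
        raise ValueError(
            f"height={height} and width={width} must both be divisible by {patch_size}"
        )
    return height // patch_size, width // patch_size
```

For example, patch_grid(512, 512) gives a 32 by 32 grid, while patch_grid(250, 256) raises a ValueError because 250 is not a multiple of 16.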

Limitations

Intended for ImageNet class-conditional generation.
No text encoder is included.
Output quality depends on scheduler settings and inference step count.

Citation

Source paper (ICLR 2026): PixNerd: Pixel Neural Field Diffusion (arXiv:2507.23268)

Source code:

Original PixNerd codebase: MCG-NJU/PixNerd
Diffusers conversion code used for this export: Bili-Sakura/PixNerd-diffusers

@article{2507.23268,
  Author = {Shuai Wang and Ziteng Gao and Chenhui Zhu and Weilin Huang and Limin Wang},
  Title = {PixNerd: Pixel Neural Field Diffusion},
  Year = {2025},
  Eprint = {arXiv:2507.23268},
}

