---
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
- diffusers
- sit
- image-generation
- class-conditional
- imagenet
license: mit
inference: true
widget:
- output:
    url: SiT-XL-2-512/demo.png
language:
- en
---
# SiT-diffusers
Diffusers-ready checkpoints for **Scalable Interpolant Transformers (SiT)**, converted for local/offline use.
This root folder is a model collection that contains:
- `SiT-S-2-256`
- `SiT-B-2-256`
- `SiT-L-2-256`
- `SiT-XL-2-256`
- `SiT-XL-2-512`
Each subfolder is a self-contained Diffusers model repo with:
- `pipeline.py`
- `transformer/transformer_sit.py`
- `scheduler/scheduler_config.json` (`FlowMatchEulerDiscreteScheduler`)
- `transformer/diffusion_pytorch_model.safetensors`
- `vae/diffusion_pytorch_model.safetensors`
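Before loading a variant, it can help to confirm that the downloaded subfolder actually contains the files listed above. The following is a minimal sketch that assumes the layout is exactly as listed; `EXPECTED_FILES` and `missing_files` are hypothetical names used for illustration and are not part of this repo.

```python
from pathlib import Path

# Files this README lists for every variant subfolder.
EXPECTED_FILES = (
    "pipeline.py",
    "transformer/transformer_sit.py",
    "scheduler/scheduler_config.json",
    "transformer/diffusion_pytorch_model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
)


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Path taken from the model list in this README; adjust to the variant in use.
    missing = missing_files("./SiT-XL-2-512")
    print("complete" if not missing else f"missing: {missing}")
```

An empty result means every listed file is present; anything else names the files to re-download.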
Each variant embeds an English `id2label` mapping directly in `model_index.json` (DiT-style), so class labels can be passed either as ImageNet class ids or as English synonym strings.
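To make the id-or-string behaviour concrete, the sketch below shows one way such a lookup can work: each comma-separated synonym in an `id2label` entry maps back to its class id, which is the approach the DiT pipeline in Diffusers takes. Whether this repo's `pipeline.py` does exactly this is an assumption. `ID2LABEL_EXCERPT`, `build_label_lookup`, and `resolve_class_labels` are illustrative names, and the three-entry mapping is a hand-written excerpt of ImageNet-1k labels rather than the contents of `model_index.json`.

```python
# Hand-written excerpt of ImageNet-1k labels (illustrative; not read from model_index.json).
ID2LABEL_EXCERPT = {
    "207": "golden retriever",
    "208": "Labrador retriever",
    "281": "tabby, tabby cat",
}


def build_label_lookup(id2label):
    """Map every comma-separated synonym to its integer class id."""
    lookup = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            lookup[name.strip()] = int(class_id)
    return lookup


def resolve_class_labels(labels, lookup):
    """Accept an id, a label string, or a list mixing both; return a list of ids."""
    if isinstance(labels, (int, str)):
        labels = [labels]
    resolved = []
    for label in labels:
        if isinstance(label, int):
            resolved.append(label)
        elif label in lookup:
            resolved.append(lookup[label])
        else:
            raise ValueError(f"unknown class label: {label!r}")
    return resolved


lookup = build_label_lookup(ID2LABEL_EXCERPT)
print(resolve_class_labels("golden retriever", lookup))   # [207]
print(resolve_class_labels(["tabby cat", 208], lookup))   # [281, 208]
```

In this sketch the match is exact and case-sensitive, so a misspelled label raises an error instead of silently selecting a different class.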
## Demo

Class-conditional sample (ImageNet class **207**, golden retriever), `SiT-XL/2` at 512×512, 250 steps, CFG 4.0, seed 0.
## Model Paths
Use paths relative to this root README:
| Model | Resolution | Local path |
| --- | ---: | --- |
| SiT-S/2 | 256x256 | `./SiT-S-2-256` |
| SiT-B/2 | 256x256 | `./SiT-B-2-256` |
| SiT-L/2 | 256x256 | `./SiT-L-2-256` |
| SiT-XL/2 | 256x256 | `./SiT-XL-2-256` |
| SiT-XL/2 | 512x512 | `./SiT-XL-2-512` |
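The folder names in the table follow a regular `SiT-<size>-2-<resolution>` pattern, so a path can be derived from the variant instead of being typed by hand. A small sketch; `AVAILABLE` and `variant_path` are hypothetical helpers for illustration, not part of this repo.

```python
# (size, resolution) pairs taken from the table above.
AVAILABLE = {
    ("S", 256),
    ("B", 256),
    ("L", 256),
    ("XL", 256),
    ("XL", 512),
}


def variant_path(size, resolution, root="."):
    """Build the local path for a variant, rejecting combinations not in the table."""
    key = (size.upper(), int(resolution))
    if key not in AVAILABLE:
        raise ValueError(f"no checkpoint for SiT-{key[0]}/2 at {key[1]}x{key[1]}")
    return f"{root}/SiT-{key[0]}-2-{key[1]}"


print(variant_path("XL", 512))  # ./SiT-XL-2-512
```

Rejecting unknown combinations early (for example `S` at 512) gives a clearer error than a failed load from a non-existent folder.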
## Inference Demo (Diffusers)
### 1) Load a local subfolder checkpoint
```python
import torch
from diffusers import DiffusionPipeline

model_path = "./SiT-XL-2-512"  # change to any path in the table above
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = DiffusionPipeline.from_pretrained(
    model_path,
    trust_remote_code=True,
).to(device)

generator = torch.Generator(device=device).manual_seed(0)

# ImageNet class example: 207 = golden retriever
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))  # [207]

result = pipe(
    class_labels="golden retriever",
    height=512,
    width=512,
    num_inference_steps=250,  # official SiT comparisons commonly use 250 steps
    guidance_scale=4.0,
    generator=generator,
)
image = result.images[0]
image.save("sit_xl_512_demo.png")
```
### 2) Quick variant switch (256 models)
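```python
import torch
from diffusers import DiffusionPipeline

model_path = "./SiT-S-2-256"
# model_path = "./SiT-B-2-256"
# model_path = "./SiT-L-2-256"
# model_path = "./SiT-XL-2-256"

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = DiffusionPipeline.from_pretrained(model_path, trust_remote_code=True).to(device)

# Re-seed so this sample is reproducible on its own.
generator = torch.Generator(device=device).manual_seed(0)

image = pipe(
    class_labels=207,  # ImageNet id for golden retriever
    height=256,
    width=256,
    num_inference_steps=250,
    guidance_scale=4.0,
    generator=generator,
).images[0]
image.save("sit_256_demo.png")
```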