JiT-diffusers

Native diffusers implementation of JiT (Just image Transformer). Each variant folder is self-contained:

  • pipeline.py: JiTPipeline
  • scheduler/scheduler_config.json: FlowMatchHeunDiscreteScheduler config (default shift=4.0)
  • transformer/jit_transformer_2d.py: JiTTransformer2DModel
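
To give a feel for what the shift=4.0 default in the scheduler config does, here is a minimal sketch of the time-shift formula that diffusers' flow-matching schedulers apply to their noise levels (sigmas). The helper name shift_sigma is illustrative only and is not part of this repo; the real scheduler operates on tensors over its full timestep grid.

```python
def shift_sigma(sigma: float, shift: float = 4.0) -> float:
    """Time-shift a flow-matching noise level sigma in [0, 1].

    Illustrative helper (not part of this repo). shift=1.0 leaves sigma
    unchanged; larger values push intermediate sigmas toward 1.0, so more
    of the sampling steps are spent at high noise levels.
    """
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Compare an evenly spaced schedule before and after shifting.
sigmas = [i / 4 for i in range(5)]
shifted = [round(shift_sigma(s, shift=4.0), 4) for s in sigmas]
print(sigmas)
print(shifted)
```

The endpoints 0.0 and 1.0 are fixed points of the formula, so only the spacing of the intermediate steps changes.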

The pipeline supports dynamic inference resolution: __call__ accepts height and width and applies positional interpolation when they differ from the checkpoint's native resolution.
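
As a rough illustration of why positional interpolation is needed, the sketch below computes the token grid for a given resolution. It assumes the "/16" and "/32" suffixes in the checkpoint names denote the patch size (the usual ViT-style naming), and that height and width should be multiples of it; neither is stated in this README, and token_grid is a hypothetical helper, not a pipeline method.

```python
from __future__ import annotations


def token_grid(height: int, width: int, patch_size: int) -> tuple[int, int]:
    """Return (rows, cols) of patch tokens for an image of the given size.

    Hypothetical helper; assumes non-overlapping square patches.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    return height // patch_size, width // patch_size


# Under the naming assumption, both native settings give the same grid.
print(token_grid(256, 256, 16))
print(token_grid(512, 512, 32))

# A non-native size yields a different grid, so positions learned on the
# native grid have to be interpolated to the new one.
print(token_grid(384, 256, 16))
```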

There is no separate jit_diffusers package to install: you only need diffusers from PyPI plus the custom code that ships in each variant directory.

Available checkpoints

Checkpoint  Path        Resolution  Recommended CFG
JiT-B/16    ./JiT-B-16  256×256     3.0
JiT-L/16    ./JiT-L-16  256×256     2.4
JiT-H/16    ./JiT-H-16  256×256     2.2
JiT-B/32    ./JiT-B-32  512×512     3.0
JiT-L/32    ./JiT-L-32  512×512     2.5
JiT-H/32    ./JiT-H-32  512×512     2.3
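
If you switch between variants in a script, it can help to keep the table above as data. The snippet below is a small convenience sketch; RECOMMENDED and recommended_settings are hypothetical names and are not shipped with the repo. The values are copied from the table.

```python
from pathlib import Path

# Native resolution and recommended CFG scale per variant folder,
# copied from the checkpoint table.
RECOMMENDED = {
    "JiT-B-16": {"resolution": 256, "guidance_scale": 3.0},
    "JiT-L-16": {"resolution": 256, "guidance_scale": 2.4},
    "JiT-H-16": {"resolution": 256, "guidance_scale": 2.2},
    "JiT-B-32": {"resolution": 512, "guidance_scale": 3.0},
    "JiT-L-32": {"resolution": 512, "guidance_scale": 2.5},
    "JiT-H-32": {"resolution": 512, "guidance_scale": 2.3},
}


def recommended_settings(variant_dir: str) -> dict:
    """Return height, width and guidance_scale for a variant folder path."""
    name = Path(variant_dir).name
    if name not in RECOMMENDED:
        raise KeyError(f"unknown variant folder: {name!r}")
    entry = RECOMMENDED[name]
    return {
        "height": entry["resolution"],
        "width": entry["resolution"],
        "guidance_scale": entry["guidance_scale"],
    }


print(recommended_settings("./JiT-H-32"))
```

The returned dict could then be unpacked into the pipeline call, for example pipe(class_labels="golden retriever", **recommended_settings("./JiT-H-32")).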

ImageNet class labels

Each variant keeps an English id2label map directly in its own model_index.json (DiT-style).

  • pipe.id2label: inspect id → English label correspondence
  • pipe.labels: reverse map (English synonym → id), sorted for browsing
  • pipe.get_label_ids("golden retriever")
  • pipe(class_labels="golden retriever", ...): string labels resolved automatically
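
The helpers above can be pictured roughly as follows. This is a minimal sketch, not the pipeline's actual code: it assumes a DiT-style id2label whose values are comma-separated English synonyms, and build_label_map and resolve_label_ids are hypothetical stand-ins for pipe.labels and pipe.get_label_ids. The real method's matching rules and return type may differ.

```python
def build_label_map(id2label: dict) -> dict:
    """Build a sorted reverse map from each English synonym to its class id."""
    labels = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            # Keep the first id seen if two classes share a synonym.
            labels.setdefault(name.strip().lower(), int(class_id))
    return dict(sorted(labels.items()))


def resolve_label_ids(labels: dict, query) -> list:
    """Resolve one label string, or a list of them, to class ids."""
    queries = [query] if isinstance(query, str) else list(query)
    return [labels[q.strip().lower()] for q in queries]


# Two-entry excerpt in the DiT-style layout (JSON keys are strings).
id2label = {
    "1": "goldfish, Carassius auratus",
    "207": "golden retriever",
}
labels = build_label_map(id2label)
print(list(labels))
print(resolve_label_ids(labels, "golden retriever"))
```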

Chinese labels are preserved in the main source repo under src/labels/id2label_cn.json for reference.

Inference

Run the bundled demo script from the repo root:

python demo_inference.py

This writes demo.png using JiT-H-32 with the settings below.

from pathlib import Path
from diffusers import DiffusionPipeline, FlowMatchHeunDiscreteScheduler
import torch

model_dir = Path("./JiT-H-32")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
)
pipe.scheduler = FlowMatchHeunDiscreteScheduler.from_config(pipe.scheduler.config, shift=4.0)
pipe.to("cuda")

# Look up a label by numeric id, or resolve an English name to its id
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    num_inference_steps=50,
    guidance_scale=2.3,
    generator=generator,
).images[0]
image.save("demo.png")

height and width default to the checkpoint's native resolution when omitted.

Load a variant subfolder (e.g. ./JiT-H-32), not the repo root.
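
A quick way to catch the wrong path before loading is to check for the files listed at the top of this README. The sketch below is an optional helper under that assumption; missing_variant_files is a hypothetical name and is not part of the repo.

```python
from pathlib import Path

# Files each self-contained variant folder is described as holding.
EXPECTED_FILES = (
    "model_index.json",
    "pipeline.py",
    "scheduler/scheduler_config.json",
    "transformer/jit_transformer_2d.py",
)


def missing_variant_files(model_dir) -> list:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# An empty list means the folder looks like a variant; pointing at the
# repo root (or a path that does not exist) reports the missing files.
print(missing_variant_files("./JiT-H-32"))
```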
