Ideogram 4 FP8 -> SDNQ UInt4
This is an experimental SDNQ UInt4 conversion of ideogram-ai/ideogram-4-fp8, intended for local research and non-commercial use under the upstream Ideogram 4 license. The conversion started from the FP8 checkpoint: FP8 linear layers were materialized back to bf16, and static SDNQ uint4 quantization was then applied component by component.
The model includes SDNQ-compressed text_encoder, transformer, unconditional_transformer, and vae components. The official ideogram4 loader does not know how to instantiate SDNQ-packed custom transformers, so this repository includes ideogram4_sdnq_pipeline.py.
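To make the uint4 step more concrete, here is a minimal sketch of generic asymmetric 4-bit quantization in plain Python. It is an illustration only and not SDNQ's actual algorithm: the function names, the min/max scaling over a single group of values, and the nibble packing layout are all assumptions made for this example.

```python
# Illustrative only: generic asymmetric uint4 quantization, NOT the SDNQ implementation.
# quantize_uint4, dequantize_uint4 and pack_nibbles are hypothetical names.

def quantize_uint4(values):
    """Map floats to integer codes in [0, 15] with one shared scale and offset."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 15 or 1.0  # fall back to 1.0 when all values are equal
    codes = [min(15, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo

def dequantize_uint4(codes, scale, lo):
    """Reconstruct approximate floats from the 4-bit codes."""
    return [lo + c * scale for c in codes]

def pack_nibbles(codes):
    """Store two 4-bit codes per byte, low nibble first (an assumed layout)."""
    padded = codes + [0] * (len(codes) % 2)
    return bytes(padded[i] | (padded[i + 1] << 4) for i in range(0, len(padded), 2))

weights = [-0.75, -0.25, 0.0, 0.125, 0.5, 0.75]
codes, scale, lo = quantize_uint4(weights)
restored = dequantize_uint4(codes, scale, lo)
packed = pack_nibbles(codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, len(packed), max_error)
```

In this toy scheme the reconstruction error per value is bounded by half the scale, and six weights fit in three bytes. Real quantizers typically work per group or per channel on tensors and store the scales alongside the packed codes.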
Usage
import torch
from ideogram4 import PRESETS
from ideogram4_sdnq_pipeline import Ideogram4SDNQPipeline

pipe = Ideogram4SDNQPipeline.from_pretrained(
    "WaveCut/ideogram-4-sdnq-uint4",
    device="cuda",
    dtype=torch.bfloat16,
)
preset = PRESETS["V4_DEFAULT_20"]
image = pipe(
    "a typographic poster reading HELLO WORLD",
    height=1024,
    width=1024,
    num_steps=preset.num_steps,
    guidance_schedule=preset.guidance_schedule,
    mu=preset.mu,
    std=preset.std,
    seed=4101,
    raise_on_caption_issues=False,
)[0]
image.save("out.png")
Install requirements:
pip install git+https://github.com/ideogram-oss/ideogram4 sdnq safetensors transformers accelerate pillow
Component Structure
Upstream FP8 structure:
- text_encoder: Qwen3-VL text path used in text-only mode. Hidden states from 13 layers are concatenated for the DiT.
- transformer: conditional 34-layer single-stream DiT.
- unconditional_transformer: image-only negative branch used for asymmetric CFG.
- vae: Flux2-style KL autoencoder decoder.
- tokenizer and scheduler: copied from upstream.
Quantization
| Component | Source materialized MB | SDNQ state MB | Quantize s | Quant peak nvidia MB |
|---|---|---|---|---|
| transformer | 17698.84 | 4979.66 | 112.64 | 36525.00 |
| unconditional_transformer | 17698.84 | 4979.66 | 108.68 | 36525.00 |
| text_encoder | 14435.59 | 4097.53 | 102.32 | 24477.00 |
| vae | 160.31 | 50.19 | 2.68 | 861.00 |
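The storage columns imply a compression ratio for each component. The short script below recomputes it from the published figures; the dictionary layout and variable names are illustrative, only the numbers come from the table.

```python
# Sizes in MB copied from the quantization table: (source materialized, SDNQ state).
sizes_mb = {
    "transformer": (17698.84, 4979.66),
    "unconditional_transformer": (17698.84, 4979.66),
    "text_encoder": (14435.59, 4097.53),
    "vae": (160.31, 50.19),
}

# Per-component ratio of bf16-materialized size to SDNQ uint4 size.
ratios = {name: src / sdnq for name, (src, sdnq) in sizes_mb.items()}

# Totals across all four components.
total_source = sum(src for src, _ in sizes_mb.values())
total_sdnq = sum(sdnq for _, sdnq in sizes_mb.values())
total_ratio = total_source / total_sdnq

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x smaller")
print(f"total: {total_source:.2f} MB -> {total_sdnq:.2f} MB ({total_ratio:.2f}x)")
```

On these figures the three large components shrink by about 3.5x and the vae by about 3.2x.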
Benchmark
Hardware: RunPod NVIDIA RTX PRO 6000 Blackwell Server Edition, single process, concurrency 1. Generation used 10 structured JSON prompts at 1024x1024 with V4_DEFAULT_20.
The FP8 baseline was loaded through the upstream ideogram4 Ideogram4Pipeline.from_pretrained recipe with weights_repo="ideogram-ai/ideogram-4-fp8"; magic-prompt expansion was disabled because the prompts are already structured captions.
| Variant | Load s | Load peak reserved MB | Load peak nvidia MB | Cold request s | Hot mean s | Gen peak reserved MB | Gen peak nvidia MB |
|---|---|---|---|---|---|---|---|
| original | 267.83 | 28198.00 | 28759.00 | 17.90 | 17.51 | 34430.00 | 35099.00 |
| sdnq | 239.46 | 14558.00 | 15109.00 | 18.56 | 16.52 | 21650.00 | 22321.00 |
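To read the table as relative differences, the snippet below computes the signed percent change of the sdnq row against the original row for a few columns. The `percent_change` helper and the key names are hypothetical; the values are copied from the table.

```python
def percent_change(baseline, candidate):
    """Signed change of candidate relative to baseline, in percent."""
    return (candidate - baseline) / baseline * 100

# Selected columns from the benchmark table (original FP8 vs SDNQ UInt4).
original = {
    "load_peak_reserved_mb": 28198.0,
    "gen_peak_reserved_mb": 34430.0,
    "cold_request_s": 17.90,
    "hot_mean_s": 17.51,
}
sdnq = {
    "load_peak_reserved_mb": 14558.0,
    "gen_peak_reserved_mb": 21650.0,
    "cold_request_s": 18.56,
    "hot_mean_s": 16.52,
}

changes = {key: percent_change(original[key], sdnq[key]) for key in original}
for key, change in changes.items():
    print(f"{key}: {change:+.1f}%")
```

In this run SDNQ UInt4 roughly halves peak reserved memory at load and lowers the generation peak by about 37%, with a slightly slower cold request and slightly faster hot requests.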
Example Matrix
The matrix below keeps the original FP8 and SDNQ UInt4 outputs side by side in narrow vertical columns. It is a WebP at quality 95.
Prompt Set
| # | id | summary |
|---|---|---|
| 1 | editorial_watch_photo | A photorealistic editorial product photograph of a transparent mechanical wristwatch resting on a wet black stone slab, with tiny engraved labels visible on the dial. |
| 2 | risograph_botanical_poster | A layered risograph botanical exhibition poster with bold overprint textures and clean typographic hierarchy. |
| 3 | cyrillic_cafe_menu | A cozy Moscow cafe menu board photographed straight-on, testing clean Cyrillic typography in chalk and printed labels. |
| 4 | brutalist_architecture | A cinematic architectural photograph of a brutalist library atrium with tiny wayfinding signs and people for scale. |
| 5 | ink_manga_rain | A dramatic black-and-white manga splash page of a courier cycling through rain, with sound effects and shop signage. |
| 6 | museum_clay_render | A polished 3D clay render of a museum diorama showing a future Arctic research station with labeled miniature modules. |
| 7 | food_packaging_label | A realistic premium chocolate bar packaging mockup with layered foil, embossed typography, and ingredient microcopy. |
| 8 | fantasy_map_typography | A hand-painted fantasy map on parchment with readable place names, compass ornament, and coastal illustrations. |
| 9 | streetwear_lookbook | A fashion lookbook cover photograph for a streetwear collection, with crisp cover typography and realistic fabric textures. |
| 10 | scientific_cutaway | A detailed scientific cutaway illustration of a compact fusion battery prototype with annotated parts and clean technical typography. |
Files
- prompts.json: the 10 structured prompts used for the comparison.
- assets/original_vs_sdnq_vertical.webp: vertical side-by-side WebP comparison matrix for original FP8 vs SDNQ UInt4, quality 95.
- assets/sdnq_vs_nf4_4090_vertical.webp: vertical side-by-side WebP comparison matrix for the RTX 4090 SDNQ vs official NF4 follow-up, quality 95.
- benchmark/: raw benchmark JSONL/CSV files and summary.json.
- quantization_manifest.json: component-level quantization timings, storage, and VRAM peaks.
- ideogram4_sdnq_pipeline.py: loader helper for the SDNQ custom transformer components.
RTX 4090 Follow-up: SDNQ UInt4 vs Official NF4
Hardware: RunPod NVIDIA GeForce RTX 4090, 24 GB VRAM, single process, concurrency 1. Both variants used the same 10 structured captions from prompts.json, 1024x1024, V4_DEFAULT_20, and no magic-prompt expansion. nf4 uses the official ideogram-ai/ideogram-4-nf4 checkpoint through the upstream ideogram4 loader.
| Variant | Cases | Load s | Load peak reserved MB | Load peak nvidia MB | Cold request s | Hot mean s | Hot max s | Gen peak reserved MB | Gen peak nvidia MB |
|---|---|---|---|---|---|---|---|---|---|
| sdnq | 10.00 | 211.61 | 14124.00 | 14466.00 | 59.65 | 37.05 | 37.57 | 19768.00 | 20521.00 |
| nf4 | 10.00 | 269.31 | 15370.00 | 15766.00 | 36.57 | 36.31 | 36.77 | 21012.00 | 21801.00 |
Raw follow-up metrics are in benchmark/summary_4090_sdnq_vs_nf4.json, benchmark/sdnq_4090_metrics.*, and benchmark/nf4_4090_metrics.*. The exact runner used for the follow-up is benchmark/followup_runner.py.
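For readers who want to collect comparable cold and hot timings on their own hardware, the sketch below shows one way to separate the first request from the later ones. It is not the repository's followup_runner.py. The `time_requests` helper and the `fake_generate` stand-in are hypothetical, and treating the first call as "cold" and all later calls as "hot" is an assumption about how the columns above are defined.

```python
import statistics
import time

def time_requests(generate, prompts):
    """Time each call; report the first as cold and the rest as hot."""
    durations = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)
        durations.append(time.perf_counter() - start)
    cold, hot = durations[0], durations[1:]
    return {
        "cases": len(durations),
        "cold_request_s": cold,
        "hot_mean_s": statistics.mean(hot) if hot else None,
        "hot_max_s": max(hot) if hot else None,
    }

def fake_generate(prompt):
    # Hypothetical stand-in for a real pipeline call such as pipe(prompt, ...).
    time.sleep(0.001)

summary = time_requests(fake_generate, ["prompt a", "prompt b", "prompt c"])
print(summary)
```

When timing a real GPU pipeline, the callable should block until the image is fully produced (for example by saving or converting it) so that asynchronous CUDA work is included in the measurement.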
License
This checkpoint is derived from ideogram-ai/ideogram-4-fp8 and follows the upstream Ideogram 4 non-commercial license. See LICENSE.md.
