Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,57 @@
---
license: mit
tags:
- vector-quantization
- image-tokenizer
- codebook-regularization
- icml2026
datasets:
- imagenet-1k
---

# DimVQ: Unveiling And Addressing Dimensional Collapse In Vector Quantization Models Via Codebook Regularization

Official pre-trained checkpoints for the **ICML 2026** paper.

## Model Description

DimVQ identifies **dimensional collapse** in vector quantization models and proposes a simple **codebook regularization** to restore suppressed low-variance components. This regularization bridges the spectral gap between discrete codebook spaces and continuous representations.

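
This README does not say how the paper measures dimensional collapse, so the following is only a rough, axis-aligned illustration of the idea, not the paper's diagnostic. It builds two toy codebooks with made-up sizes and compares how many embedding dimensions carry meaningful variance, using a participation ratio. The function names and the toy data are assumptions introduced for this sketch; a fuller analysis would typically inspect the singular-value spectrum of the codebook matrix rather than per-axis variances.

```python
import random
import statistics


def per_dimension_variance(codebook):
    """Population variance of each embedding dimension across all codes."""
    return [statistics.pvariance(column) for column in zip(*codebook)]


def effective_dimensions(variances):
    """Participation ratio (sum v)^2 / sum(v^2).

    Equals D when all D variances are equal and approaches 1 when a
    single dimension dominates.
    """
    total = sum(variances)
    if total == 0.0:
        return 0.0
    return total * total / sum(v * v for v in variances)


rng = random.Random(0)
K, D = 512, 8  # toy sizes, far smaller than the released codebooks

# Healthy toy codebook: every dimension has a comparable spread.
healthy = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(K)]

# Collapsed toy codebook: only the first two dimensions carry real variance.
scales = [1.0, 1.0] + [1e-3] * (D - 2)
collapsed = [[rng.gauss(0.0, s) for s in scales] for _ in range(K)]

healthy_dims = effective_dimensions(per_dimension_variance(healthy))
collapsed_dims = effective_dimensions(per_dimension_variance(collapsed))
print(f"healthy: {healthy_dims:.2f} of {D}, collapsed: {collapsed_dims:.2f} of {D}")
```

In this toy setup the collapsed codebook reports roughly two effective dimensions out of eight, which is the kind of suppressed low-variance structure the regularization is described as restoring.
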
## Available Checkpoints

| File | Model | Resolution | Codebook Size (K) | Embedding Dim (D) |
|------|-------|-----------|-------------------|-------------------|
| `simvq_K65536/65536.ckpt` | SimVQ + Codebook Reg. | 128x128 | 65,536 | 128 |
| `simvq_K65536/65536.yaml` | Config for above | - | - | - |
| `simvq_K262144/262144.ckpt` | SimVQ + Codebook Reg. | 128x128 | 262,144 | 128 |
| `simvq_K262144/262144.yaml` | Config for above | - | - | - |

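
The file layout in the table follows a regular pattern, so the checkpoint and config paths can be derived from the codebook size. The sketch below is one hedged way to do that and to fetch both files with `hf_hub_download` from `huggingface_hub`. The helper names are hypothetical, and the Hub repository id is not stated in this README, so it is left as a parameter the caller must supply.

```python
RELEASED_CODEBOOK_SIZES = (65536, 262144)  # sizes listed in the table above


def checkpoint_files(codebook_size):
    """Return (checkpoint path, config path) inside the repo for a released K."""
    if codebook_size not in RELEASED_CODEBOOK_SIZES:
        raise ValueError(
            f"No released checkpoint for K={codebook_size}; "
            f"choose one of {RELEASED_CODEBOOK_SIZES}."
        )
    folder = f"simvq_K{codebook_size}"
    return f"{folder}/{codebook_size}.ckpt", f"{folder}/{codebook_size}.yaml"


def download_checkpoint_pair(repo_id, codebook_size):
    """Download the checkpoint and its YAML config; returns two local paths.

    `repo_id` is an assumption: this README does not state it.
    """
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    ckpt_name, yaml_name = checkpoint_files(codebook_size)
    ckpt_path = hf_hub_download(repo_id=repo_id, filename=ckpt_name)
    yaml_path = hf_hub_download(repo_id=repo_id, filename=yaml_name)
    return ckpt_path, yaml_path


print(checkpoint_files(262144))
```

Keeping the path logic separate from the download call means the naming pattern can be checked offline before any network request is made.
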
## Usage

```python
import torch

# `model` must already be instantiated from the DimVQ code (see Links),
# using the YAML config that matches the checkpoint.
checkpoint = torch.load("262144.ckpt", map_location="cpu")  # local path to the downloaded file
model.load_state_dict(checkpoint["state_dict"])
```

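
If `load_state_dict` reports mismatched keys, it can help to compare the parameter names in the checkpoint against those the model expects before loading. The helper below is a minimal sketch of that check. Its name and the example key names are made up for illustration; the real names depend on the DimVQ code.

```python
def compare_state_dict_keys(checkpoint_keys, model_keys):
    """Report parameter names that appear on only one side."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "missing_from_checkpoint": sorted(model - ckpt),
        "unexpected_in_checkpoint": sorted(ckpt - model),
    }


# Made-up key names, for illustration only.
example_checkpoint_keys = [
    "encoder.weight",
    "quantize.embedding.weight",
    "loss.logit_scale",
]
example_model_keys = [
    "encoder.weight",
    "quantize.embedding.weight",
    "decoder.weight",
]

report = compare_state_dict_keys(example_checkpoint_keys, example_model_keys)
print(report)
```

With real objects, the call would be `compare_state_dict_keys(checkpoint["state_dict"].keys(), model.state_dict().keys())`. If the only differences are keys that are not needed for inference, PyTorch's `load_state_dict(..., strict=False)` tolerates them, but whether that is appropriate here is not stated in this README.
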
## TODO

- [ ] IBQ checkpoints (K=16384, K=262144, 256x256)
- [ ] Downstream autoregressive generation models (IBQ-B, IBQ-L, IBQ-XXL)

## Citation

```bibtex
@inproceedings{zhang2026dimvq,
title={Unveiling And Addressing Dimensional Collapse In Vector Quantization Models Via Codebook Regularization},
author={Zhang, Fang and Zhu, Yongxin and Liu, Yihao and Fu, Bin and Xu, Linli},
booktitle={International Conference on Machine Learning (ICML)},
year={2026}
}
```

## Links

- [Paper (arXiv)](https://arxiv.org/abs/TODO)
- [Code (GitHub)](https://github.com/ksblk2116/dimvq)