| --- |
| title: CurvOpt SmarterModels |
| emoji: 📊 |
| colorFrom: red |
| colorTo: red |
| sdk: gradio |
| sdk_version: 6.6.0 |
| app_file: app.py |
| pinned: false |
| license: apache-2.0 |
| short_description: Smarter Models, Smaller Footprint |
| --- |
| # CurvOpt-LLM — Realtime Optimizer |
|
|
| **Curvature-guided mixed-precision optimization for LLMs. No retraining required.** |
|
|
| ## What This Does |
| - Loads any Hugging Face causal LM
| - Computes a diagonal Fisher curvature estimate for each layer from real gradients on calibration samples
| - Assigns FP32, FP16, or BF16 to each layer according to its curvature sensitivity
| - Rewrites and saves a deployable optimized model (downloadable ZIP) |
| - Reports electricity, CO₂, and water footprint savings |
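
The curvature and precision-assignment steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Space's actual implementation: the function names `layer_fisher_scores` and `assign_precisions`, the toy gradient values, the layer names, and the fraction-based tiering (`frac_fp32`, `frac_fp16`) are all hypothetical. The tier order (FP32, then FP16, then BF16) is also an assumption, chosen here only because FP16 keeps more mantissa bits than BF16.

```python
import math
from statistics import fmean


def layer_fisher_scores(per_sample_grads):
    """Per-layer diagonal Fisher estimate: mean squared gradient.

    per_sample_grads: one dict per calibration sample, mapping a layer
    name to a flat list of gradient values for that layer.
    """
    per_layer = {}
    for grads in per_sample_grads:
        for layer, values in grads.items():
            mean_sq = fmean([g * g for g in values])
            per_layer.setdefault(layer, []).append(mean_sq)
    # Average the per-sample estimates over all calibration samples.
    return {layer: fmean(vals) for layer, vals in per_layer.items()}


def assign_precisions(scores, frac_fp32=0.25, frac_fp16=0.25):
    """Rank layers by score; the most sensitive keep the highest precision.

    The fractions are hypothetical knobs, not parameters of the Space.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_fp32 = math.ceil(frac_fp32 * len(ranked))
    n_fp16 = math.ceil(frac_fp16 * len(ranked))
    plan = {}
    for rank, layer in enumerate(ranked):
        if rank < n_fp32:
            plan[layer] = "fp32"
        elif rank < n_fp32 + n_fp16:
            plan[layer] = "fp16"
        else:
            plan[layer] = "bf16"
    return plan


# Toy gradients for two calibration samples (made-up numbers).
toy_grads = [
    {"embed": [0.5, -0.5], "attn": [0.1, 0.1], "mlp": [0.01, 0.02], "head": [1.0, -1.0]},
    {"embed": [0.3, 0.4], "attn": [0.2, -0.1], "mlp": [0.02, 0.01], "head": [0.8, 0.6]},
]
scores = layer_fisher_scores(toy_grads)
plan = assign_precisions(scores)
print(plan)
```

In a real run the gradients would come from backpropagating the language-modeling loss on each calibration sample, and the tier boundaries would presumably be tuned against the perplexity tolerance rather than fixed fractions.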
|
|
| ## How to Use |
| 1. Select a model from the dropdown (or enter a custom Hugging Face model ID)
| 2. Set the number of calibration samples (1–32) and the perplexity (PPL) tolerance
| 3. Click **Run Optimization** |
| 4. Download the optimized model ZIP when done |
|
|
| ## Supported Models |
| OPT family · GPT-2 family · Pythia · Phi · BLOOM · Mistral · Llama-2 · Qwen · Falcon · and any other `AutoModelForCausalLM`-compatible model.
|
|
| ## Research |
| Based on curvature analysis in the style of Optimal Brain Damage, using the diagonal of the Fisher information as the per-layer sensitivity estimate.
| Novel contribution: per-request, curvature-gated mixed precision with user-intent feedback.
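
For reference, the standard quantities behind this kind of analysis can be written as follows. This is the textbook form, given as a sketch; the symbols ($N$ calibration samples $x_n$, parameters $\theta_i$, rounding error $\delta\theta_i$, layer index $\ell$) are notation introduced here, and the exact estimator used by this Space is not stated in this README.

```latex
% Empirical diagonal Fisher information over N calibration samples
F_{ii} \approx \frac{1}{N} \sum_{n=1}^{N}
  \left( \frac{\partial \log p_\theta(x_n)}{\partial \theta_i} \right)^{2}

% Second-order estimate of the loss increase caused by rounding errors
% \delta\theta_i, with the Hessian diagonal approximated by F_{ii}
\Delta L \approx \frac{1}{2} \sum_{i} F_{ii} \, (\delta\theta_i)^{2}

% One possible per-layer sensitivity score: the mean of F_{ii} over the
% parameters of layer \ell
s_\ell = \frac{1}{|\ell|} \sum_{i \in \ell} F_{ii}
```

Under this approximation, layers with a large $s_\ell$ are the ones where reduced precision is expected to cost the most, which is the intuition behind keeping them at higher precision.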
|
|
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |