| title: README | |
| emoji: π | |
| colorFrom: red | |
| colorTo: yellow | |
| sdk: static | |
| pinned: false | |
| We are the InternSVG team from the Shanghai AI Laboratory, dedicated to empowering the InternVL series models with unified capabilities for SVG vector graphic understanding, editing, and generation. | |
| Current Work: | |
| ## InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models | |
| The InternSVG Family is a comprehensive suite that unifies data, benchmarks, and models for SVG understanding, editing, and generation. It consists of: | |
| 🧩 SAgoge: the largest and most diverse multimodal SVG dataset, covering icons, illustrations, chemistry diagrams, and dynamic animations; | |
| SArena: a companion benchmark offering unified task definitions and standardized evaluation protocols across SVG domains; | |
| InternSVG Models: multimodal large language models trained for SVG understanding, editing, and generation. | |
| Project Links | |
| Project Page: https://hmwang2002.github.io/release/internsvg/ | |
| arXiv Paper: https://arxiv.org/abs/2510.11341 | |
| 💻 GitHub Repository: https://github.com/hmwang2002/InternSVG | |
| SArena Benchmark: https://huggingface.co/datasets/InternSVG/SArena | |
| 🧩 SAgoge Dataset: https://huggingface.co/datasets/InternSVG/SAgoge | |
| InternSVG-8B Model: https://huggingface.co/InternSVG/InternSVG-8B | |
| ## Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning | |
| In this work, we present CTRL-S (Chain-of-Thought Reinforcement Learning for SVG), a unified framework that introduces a chain-of-thought mechanism to explicitly expose the model's reasoning process during SVG generation. To support this structured reasoning, we construct SVG-Sophia, a high-quality dataset of 145K samples spanning SVG code refinement, Text-to-SVG, and Image-to-SVG tasks. We further design a multi-reward reinforcement learning scheme based on the GRPO algorithm. By jointly optimizing DINO, image-text similarity, format, and code efficiency rewards in a multi-task setting, our approach improves both structural coherence and generation quality. Extensive experiments show that CTRL-S outperforms existing methods, achieving higher task success rates, better code quality, and higher visual fidelity. | |
| arXiv Paper: https://arxiv.org/abs/2603.16189 | |
| 💻 GitHub Repository: https://github.com/hmwang2002/CTRL-S | |
| 🧩 SVG-Sophia Dataset: https://huggingface.co/datasets/InternSVG/SVG-Sophia | |
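
To make the multi-reward GRPO idea in the CTRL-S description more concrete, here is a minimal Python sketch of how several reward signals might be combined and turned into group-relative advantages. It is an illustration under stated assumptions, not the CTRL-S implementation: the weights, the function names, the well-formedness check used as a format reward, and the length-based efficiency proxy are all hypothetical, and the DINO and image-text similarity scores are assumed to be computed elsewhere and passed in as plain floats.

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical weights: the source does not state how the rewards are combined.
WEIGHTS = {"dino": 1.0, "image_text": 1.0, "format": 1.0, "efficiency": 0.5}


def format_reward(svg_code: str) -> float:
    """Return 1.0 if the text parses as XML with an <svg> root, else 0.0."""
    try:
        root = ET.fromstring(svg_code)
    except ET.ParseError:
        return 0.0
    # Drop a namespace prefix such as {http://www.w3.org/2000/svg}svg.
    return 1.0 if root.tag.split("}")[-1] == "svg" else 0.0


def efficiency_reward(svg_code: str, budget: int = 2000) -> float:
    """Illustrative proxy only: shorter code scores higher, clipped to [0, 1]."""
    return max(0.0, 1.0 - len(svg_code) / budget)


def total_reward(svg_code: str, dino_sim: float, image_text_sim: float) -> float:
    """Weighted sum of the four reward terms named in the text."""
    parts = {
        "dino": dino_sim,              # assumed precomputed visual similarity
        "image_text": image_text_sim,  # assumed precomputed image-text score
        "format": format_reward(svg_code),
        "efficiency": efficiency_reward(svg_code),
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # A group of candidate outputs sampled for the same prompt (toy data).
    group = [
        ('<svg xmlns="http://www.w3.org/2000/svg"><circle r="4"/></svg>', 0.8, 0.7),
        ("<svg><rect width='4' height='4'/></svg>", 0.6, 0.6),
        ("<svg><circle r=", 0.0, 0.0),  # malformed output, format reward is 0
    ]
    rewards = [total_reward(code, d, s) for code, d, s in group]
    print(group_relative_advantages(rewards))
```

Standardizing within the group means each sample is scored relative to its siblings for the same prompt, so no separate value model is needed; in a multi-task setting, the same routine would simply be applied per prompt regardless of which task the prompt belongs to.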