	---
	title: README
	emoji: 🏃
	colorFrom: red
	colorTo: yellow
	sdk: static
	pinned: false
	---

	We are the InternSVG team from the Shanghai AI Laboratory, dedicated to empowering the InternVL series models with unified capabilities for SVG vector graphic understanding, editing, and generation.

	Current Work:

	## InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

	The InternSVG family is a comprehensive suite that unifies data, benchmarks, and models for SVG understanding, editing, and generation. It consists of:

	🧩 SAgoge — the largest and most diverse multimodal SVG dataset, covering icons, illustrations, chemistry diagrams, and dynamic animations;

	🏆 SArena — a companion benchmark offering unified task definitions and standardized evaluation protocols across SVG domains;

	🤖 InternSVG Models — multimodal large language models trained for SVG understanding, editing, and generation.

	Project Links

	🌐 Project Page: https://hmwang2002.github.io/release/internsvg/

	📄 ArXiv Paper: https://arxiv.org/abs/2510.11341

	💻 GitHub Repository: https://github.com/hmwang2002/InternSVG

	📊 SArena Benchmark: https://huggingface.co/datasets/InternSVG/SArena

	🧩 SAgoge Dataset: https://huggingface.co/datasets/InternSVG/SAgoge

	🤖 InternSVG-8B Model: https://huggingface.co/InternSVG/InternSVG-8B

	## Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning

	In this work, we present CTRL-S (Chain-of-Thought Reinforcement Learning for SVG), a unified framework that introduces a chain-of-thought mechanism to explicitly expose the model’s reasoning process during SVG generation. To support this structured reasoning, we construct SVG-Sophia, a high-quality dataset of 145K samples across SVG code refinement, Text-to-SVG, and Image-to-SVG tasks. Furthermore, we design a robust multi-reward reinforcement learning scheme powered by the GRPO algorithm. By jointly optimizing across DINO, image-text similarity, format, and code efficiency rewards in a multi-task setting, our approach systematically boosts structural coherence and generation capabilities. Extensive experiments show that CTRL-S outperforms existing methods, achieving higher task success rates, superior code quality, and exceptional visual fidelity.
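As a rough illustration of how several rewards can feed a GRPO-style update, the sketch below combines four per-sample scores into one scalar and then normalizes the scores within a group of sampled completions. It is not the CTRL-S implementation: the weights, the character budget, the rule used for the format reward, and all function names are hypothetical, and the DINO and image-text similarity scores are assumed to be precomputed floats in [0, 1].

```python
import statistics
import xml.etree.ElementTree as ET

# Hypothetical weights; the source does not say how the rewards are weighted.
WEIGHTS = {"dino": 0.4, "image_text": 0.3, "format": 0.2, "efficiency": 0.1}


def format_reward(svg_code: str) -> float:
    """Return 1.0 if the string is well-formed XML with an <svg> root (assumed rule)."""
    try:
        root = ET.fromstring(svg_code)
    except ET.ParseError:
        return 0.0
    # Drop an optional namespace prefix such as {http://www.w3.org/2000/svg}.
    return 1.0 if root.tag.split("}")[-1] == "svg" else 0.0


def efficiency_reward(svg_code: str, budget: int = 2000) -> float:
    """Penalize code length linearly up to a hypothetical character budget."""
    return max(0.0, 1.0 - len(svg_code) / budget)


def total_reward(svg_code: str, dino_sim: float, image_text_sim: float) -> float:
    """Weighted sum of the four rewards; the similarity scores are supplied by the caller."""
    parts = {
        "dino": dino_sim,
        "image_text": image_text_sim,
        "format": format_reward(svg_code),
        "efficiency": efficiency_reward(svg_code),
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantage: center and scale each reward within its sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; a sample std is an equally valid choice
    return [(r - mean) / (std + eps) for r in rewards]
```

In use, one would score every completion sampled for the same prompt with `total_reward` and pass the resulting list to `group_relative_advantages`, so that completions better than the group average receive positive advantages and worse ones receive negative advantages.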

	📄 ArXiv Paper: https://arxiv.org/abs/2603.16189

	💻 GitHub Repository: https://github.com/hmwang2002/CTRL-S

	🧩 SVG-Sophia Dataset: https://huggingface.co/datasets/InternSVG/SVG-Sophia