Heterogeneous Agent Collaborative Reinforcement Learning Paper • 2603.02604 • Published Mar 3 • 194
LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models Paper • 2603.01563 • Published Mar 2 • 2
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Paper • 2510.04212 • Published Oct 5, 2025 • 26
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models Paper • 2601.07372 • Published Jan 12 • 47
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published Jan 9 • 86
Step-Audio-R1 Collection Step-Audio-R1 is the first audio language model to successfully unlock test-time compute scaling. • 4 items • Updated Jan 14 • 18
TiDAR: Think in Diffusion, Talk in Autoregression Paper • 2511.08923 • Published Nov 12, 2025 • 128
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4, 2025 • 103
Matrix-Game: Interactive World Foundation Model Paper • 2506.18701 • Published Jun 23, 2025 • 72
OpenMathReasoning Collection Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 5 days ago • 47
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8, 2025 • 186