MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences Paper • 2601.06789 • Published 5 days ago • 73
Controlled Self-Evolution for Algorithmic Code Optimization Paper • 2601.07348 • Published 3 days ago • 94
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests Paper • 2601.06953 • Published 4 days ago • 39
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors Paper • 2601.07226 • Published 4 days ago • 25
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published 7 days ago • 72
One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling Paper • 2601.03111 • Published 9 days ago • 8
Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models Paper • 2512.21815 • Published 21 days ago • 20
DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation Paper • 2601.04823 • Published 7 days ago • 5
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published 6 days ago • 38
Can We Predict Before Executing Machine Learning Agents? Paper • 2601.05930 • Published 6 days ago • 25
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning Paper • 2601.06002 • Published 6 days ago • 48
MMFormalizer: Multimodal Autoformalization in the Wild Paper • 2601.03017 • Published 9 days ago • 101
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 7 days ago • 185
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs Paper • 2601.01046 • Published 13 days ago • 11
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Paper • 2512.20578 • Published 23 days ago • 75
E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models Paper • 2601.00423 • Published 14 days ago • 8
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published 10 days ago • 95