FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning Paper • 2601.18116 • Published 2 days ago • 9
TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment Paper • 2601.18292 • Published 2 days ago • 9
Router Upcycling: Leveraging Mixture-of-Routers in Mixture-of-Experts Upcycling Paper • 2509.00679 • Published Aug 31, 2025
TextlessRAG: End-to-End Visual Document RAG by Speech Without Text Paper • 2509.07538 • Published Sep 9, 2025
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 186
Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training Paper • 2508.14904 • Published Aug 12, 2025 • 2
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation Paper • 2503.04872 • Published Mar 6, 2025 • 15
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design Paper • 2506.04734 • Published Jun 5, 2025 • 21
Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance Paper • 2502.12459 • Published Feb 18, 2025 • 3
LongAttn: Selecting Long-context Training Data via Token-level Attention Paper • 2502.16860 • Published Feb 24, 2025 • 1
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision Paper • 2502.20790 • Published Feb 28, 2025