M Saad Salman

MSS444

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 9 hours ago

MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences

Paper • 2601.06789 • Published 5 days ago • 73

upvoted a paper about 10 hours ago

Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published 3 days ago • 94

upvoted 3 papers 3 days ago

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

Paper • 2601.06953 • Published 4 days ago • 39

Lost in the Noise: How Reasoning Models Fail with Contextual Distractors

Paper • 2601.07226 • Published 4 days ago • 25

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Paper • 2601.05593 • Published 7 days ago • 72

upvoted 8 papers 4 days ago

One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

Paper • 2601.03111 • Published 9 days ago • 8

Agent-as-a-Judge

Paper • 2601.05111 • Published 7 days ago • 16

Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models

Paper • 2512.21815 • Published 21 days ago • 20

DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation

Paper • 2601.04823 • Published 7 days ago • 5

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published 6 days ago • 38

Can We Predict Before Executing Machine Learning Agents?

Paper • 2601.05930 • Published 6 days ago • 25

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published 6 days ago • 48

MMFormalizer: Multimodal Autoformalization in the Wild

Paper • 2601.03017 • Published 9 days ago • 101

upvoted a paper 7 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 7 days ago • 185

upvoted 5 papers 8 days ago

KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

Paper • 2601.01046 • Published 13 days ago • 11

Recursive Language Models

Paper • 2512.24601 • Published 16 days ago • 62

Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Paper • 2512.20578 • Published 23 days ago • 75

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

Paper • 2601.00423 • Published 14 days ago • 8

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 10 days ago • 95

upvoted a paper 14 days ago

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published 15 days ago • 251