Model of MetaAPO https://arxiv.org/abs/2509.23371
junmingyang
jmyang
AI & ML interests
LLM Alignment, VLM
Recent Activity
upvoted a paper about 13 hours ago
SAAS: Self-Aware Reinforcement Learning for Over-Search Mitigation in Agentic Search
upvoted a paper 6 days ago
CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents
liked a model about 1 month ago
deepseek-ai/DeepSeek-V4-Pro
Organizations
None yet