Olaf-World: Orienting Latent Actions for Video World Modeling Paper • 2602.10104 • Published 2 days ago • 25
Context Forcing: Consistent Autoregressive Video Generation with Long Context Paper • 2602.06028 • Published 7 days ago • 35
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published 9 days ago • 56
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals Paper • 2601.05848 • Published Jan 9 • 16
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control Paper • 2601.05138 • Published Jan 8 • 18
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 179
GigaBrain-0: A World Model-Powered Vision-Language-Action Model Paper • 2510.19430 • Published Oct 22, 2025 • 51
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published Oct 17, 2025 • 51
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 82
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction Paper • 2509.19297 • Published Sep 23, 2025 • 25
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8, 2025 • 206
IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering Paper • 2506.23329 • Published Jun 29, 2025 • 8
IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering Paper • 2506.23329 • Published Jun 29, 2025 • 8