Ola
Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment
  THUdyh/Ola-7b • Any-to-Any • 9B • Updated Jun 23, 2025 • 66 • 45
  THUdyh/Ola-Data • Viewer • Updated Feb 24, 2025 • 363k • 4.26k • 9
  THUdyh/Ola-Image • 8B • Updated Jun 23, 2025 • 5 • 3
  THUdyh/Ola-Video • Video-Text-to-Text • 8B • Updated Feb 25, 2025 • 6 • 1
Oryx-1.5
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
  THUdyh/Oryx-1.5-7B • Video-Text-to-Text • 8B • Updated Mar 1, 2025 • 8 • 8
  THUdyh/Oryx-1.5-32B • Text Generation • 33B • Updated Oct 22, 2024 • 14 • 1
  THUdyh/Oryx-1.5-32B-Image • Text Generation • 33B • Updated Jan 15, 2025 • 1
  THUdyh/Oryx-1.5-Image • Text Generation • 8B • Updated Jan 15, 2025
Insight-V
Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
  THUdyh/Insight-V-Summary • Text Generation • 8B • Updated Nov 22, 2024 • 14 • 9
  THUdyh/Insight-V-Reason • Text Generation • 8B • Updated Nov 22, 2024 • 19 • 10
  THUdyh/Insight-V-Reason-LLaMA3 • Text Generation • 8B • Updated Jul 4, 2025 • 3 • 4
  THUdyh/Insight-V-Summary-LLaMA3 • Text Generation • 8B • Updated Jul 4, 2025 • 2 • 4
Oryx
Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding
  THUdyh/Oryx-7B • Text Generation • 8B • Updated Jul 30, 2025 • 15 • 12
  THUdyh/Oryx-34B • Video-Text-to-Text • 35B • Updated Mar 1, 2025 • 8 • 3
  THUdyh/Oryx-7B-Image • Text Generation • 8B • Updated Sep 23, 2024 • 2 • 3
  THUdyh/Oryx-34B-Image • Text Generation • 35B • Updated Sep 23, 2024 • 2