Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published Dec 15, 2025 • 106
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published Dec 29, 2025 • 98
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published Dec 29, 2025 • 65
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published Dec 31, 2025 • 151
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation Paper • 2512.24551 • Published Dec 31, 2025 • 21
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published Dec 31, 2025 • 105
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling Paper • 2512.23959 • Published Dec 30, 2025 • 112
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published Jan 5 • 109
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published Jan 8 • 167
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published Dec 9, 2025 • 132
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 509
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 22 days ago • 339
ASA: Training-Free Representation Engineering for Tool-Calling Agents Paper • 2602.04935 • Published 23 days ago • 41
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 16 days ago • 185
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics Paper • 2602.12617 • Published 14 days ago • 20
MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs Paper • 2602.12705 • Published 14 days ago • 62
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published 16 days ago • 52
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 14 days ago • 53
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published 17 days ago • 194
Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents Paper • 2602.16855 • Published 13 days ago • 46
World Craft: Agentic Framework to Create Visualizable Worlds via Text Paper • 2601.09150 • Published Jan 14 • 20
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training Paper • 2602.10693 • Published 16 days ago • 185
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published 18 days ago • 211
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 3 days ago • 87
Query-focused and Memory-aware Reranker for Long Context Processing Paper • 2602.12192 • Published 15 days ago • 49
HyTRec: A Hybrid Temporal-Aware Attention Architecture for Long Behavior Sequential Recommendation Paper • 2602.18283 • Published 7 days ago • 51