pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 2 days ago • 72
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 19 days ago • 67
gpt-oss-safeguard Collection gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built upon gpt-oss • 2 items • Updated Oct 29, 2025 • 64
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published 18 days ago • 51
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published 22 days ago • 185
Self-Improving World Modelling with Latent Actions Paper • 2602.06130 • Published 23 days ago • 30
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models Paper • 2602.04804 • Published 24 days ago • 46
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published 25 days ago • 45
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers Paper • 2602.03510 • Published 25 days ago • 27
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 24 days ago • 35
Training Data Efficiency in Multimodal Process Reward Models Paper • 2602.04145 • Published 25 days ago • 76
ProRAG Collection Models from the paper "ProRAG: Process-Supervised Reinforcement Learning for Retrieval-Augmented Generation" • 2 items • Updated 24 days ago • 2
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published Jan 28 • 21
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning Paper • 2601.20209 • Published Jan 28 • 22
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published Jan 27 • 79