Article Atlaset Dataset for Moroccan Darija: From Data Collection, Analysis, to Model Trainings By atlasia and 1 other • 7 days ago • 18
Gemstone Models Collection Our 22 open-source Gemstone models for scaling laws range from 50M to 2B parameters, spanning 11 widths from 256 to 3072 and 18 depths from 3 to 80. • 59 items • Updated 15 days ago • 5
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 24 days ago • 55
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 21 days ago • 45
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 17 days ago • 396
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 345
Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation By imomayiz and 4 others • Jan 10 • 30
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 262
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 50
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 93
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 94