Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 1 day ago • 73
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 97
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario Paper • 2501.10132 • Published 10 days ago • 14
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated 4 days ago • 19
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published 6 days ago • 45
Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 7 days ago • 34
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 27 items • Updated about 16 hours ago • 68
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 6 items • Updated Dec 13, 2024 • 10
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Dec 3, 2024 • 15
C4AI Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 3 items • Updated Dec 16, 2024 • 30
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 13 days ago • 268
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 20 days ago • 81
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 59
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • Dec 4, 2024 • 76