papers - a kziemski Collection

updated about 10 hours ago

SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights

Paper • 2410.09008 • Published Oct 11, 2024 • 17
answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 6.09M • 784
answerdotai/ModernBERT-large

Fill-Mask • Updated Jan 15 • 273k • 360
microsoft/phi-4

Text Generation • Updated 14 days ago • 563k • 1.88k
deepseek-ai/DeepSeek-R1

Text Generation • Updated 14 days ago • 3.64M • 11.1k
deepseek-ai/DeepSeek-R1-Zero

Text Generation • Updated 14 days ago • 11.6k • 860
Qwen/QwQ-32B-Preview

Text Generation • Updated Jan 12 • 258k • 1.7k
microsoft/Phi-3.5-MoE-instruct

Text Generation • Updated 2 days ago • 38.9k • 554
ibm-granite/granite-3.2-8b-instruct-preview

Text Generation • Updated 12 days ago • 9.48k • 68
LDJnr/Capybara

Viewer • Updated Jun 7, 2024 • 16k • 461 • 237
agentica-org/DeepScaleR-1.5B-Preview

Text Generation • Updated 15 days ago • 63.4k • 514
Open FinLLM Leaderboard

Space • Running • 68 • Browse and submit large language model evaluations
onnx-community/Kokoro-82M-ONNX

Text-to-Speech • Updated about 1 month ago • 21k • 127
HuggingFaceTB/SmolVLM2-256M-Video-Instruct

Image-Text-to-Text • Updated 4 days ago • 4.16k • 38
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 18 days ago • 44
homebrewltd/AlphaMaze-v0.2-1.5B

Text Generation • Updated 14 days ago • 1.86k • 89
qihoo360/TinyR1-32B-Preview

Text Generation • Updated about 2 hours ago • 4.86k • 314
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published 14 days ago • 26
microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • Updated 2 days ago • 231k • 1.05k
microsoft/Magma-8B

Image-Text-to-Text • Updated 5 days ago • 10.5k • 324
Physics of Language Models: Part 1, Context-Free Grammar

Paper • 2305.13673 • Published May 23, 2023 • 7
LoRA: Low-Rank Adaptation of Large Language Models

Paper • 2106.09685 • Published Jun 17, 2021 • 35
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

Paper • 2408.16293 • Published Aug 29, 2024 • 26
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Paper • 2407.20311 • Published Jul 29, 2024 • 5
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

Paper • 2404.05405 • Published Apr 8, 2024 • 10
Reverse Training to Nurse the Reversal Curse

Paper • 2403.13799 • Published Mar 20, 2024 • 13
Physics of Language Models: Part 3.2, Knowledge Manipulation

Paper • 2309.14402 • Published Sep 25, 2023 • 7
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

Paper • 2309.14316 • Published Sep 25, 2023 • 8
yale-nlp/FOLIO

Viewer • Updated Dec 21, 2023 • 1.2k • 1.47k • 35
Qwen/QwQ-32B

Text Generation • Updated 3 days ago • 103k • 1.76k
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks

Paper • 2502.17157 • Published 14 days ago • 51
secemp9/TraceBack-12b

Text Generation • Updated about 19 hours ago • 17