- Slamming: Training a Speech Language Model on One GPU in a Day
  Paper • 2502.15814 • Published • 66
- Small Models Struggle to Learn from Strong Reasoners
  Paper • 2502.12143 • Published • 28
- HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
  Paper • 2502.12574 • Published • 11
- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 99
Shiwon Jeong
sebastianrcnt
AI & ML interests
None yet
Recent Activity
updated a collection 8 days ago: interesting
liked a model 8 days ago: GSAI-ML/LLaDA-8B-Instruct
Organizations
None yet
Collections
1
models
None public yet
datasets
None public yet