metadata
license: apache-2.0
datasets:
- HuggingFaceTB/smollm-corpus
base_model:
- HuggingFaceTB/SmolLM-135M
pipeline_tag: text-generation
Research paper: "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs"