MHA2MLA
Collection
The MHA2MLA models published with the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs".
17 items
Base model: HuggingFaceTB/SmolLM-360M