LongHeads: Multi-Head Attention is Secretly a Long Context Processor Paper • 2402.10685 • Published Feb 16, 2024
Length Generalization of Causal Transformers without Position Encoding Paper • 2404.12224 • Published Apr 18, 2024
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs Paper • 2502.14837 • Published Feb 2025
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs Paper • 2410.11302 • Published Oct 15, 2024
Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models Paper • 2410.03176 • Published Oct 4, 2024