SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters Paper • 2210.04284 • Published Oct 9, 2022
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer Paper • 2402.02464 • Published Feb 4, 2024
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training Paper • 2406.16554 • Published Jun 24, 2024