Bowen232 committed
Commit 4385b31 · verified · 1 Parent(s): 762668d

Update README.md

Files changed (1):
  1. README.md +41 -3
README.md CHANGED
@@ -1,12 +1,50 @@
---
license: mit
---
+ <div align="center">
+
+ <!-- <img src="title.png" alt="LoRA-Flow" width="200"> -->
+ <!-- ***LoRA-Flow*** -->
+ <b><i style="font-size: 24px;">LoRA-Flow</i></b>
+
+ LoRAs and fusion gates for our paper
+
+ <p align="center">
+ <a href="https://aclanthology.org/2024.acl-long.695/">Paper</a> •
+ <a href="https://github.com/pingbowen23/LoRA-Flow">GitHub</a>
+ </p>
+ </div>
+
We have released all of the checkpoints used in [LoRA-Flow](https://aclanthology.org/2024.acl-long.695.pdf), which has been accepted to the ACL 2024 main conference.
# Summary
- In this repo, we release the LoRAs and fusion gates of the 7B models trained in our paper, in Hugging Face format.
- # Method
+ > In this repo, we release the LoRAs and fusion gates of the 7B models trained in our paper, in Hugging Face format.
+ # Introduction
The figure below illustrates our proposed method: we use layer-wise fusion gates to enable dynamic LoRA fusion, projecting the input hidden states of each layer into fusion weights.
![1.jpg](https://cdn-uploads.huggingface.co/production/uploads/64d99f6cd7e30889c6c477b4/ifiu1FTHilrmUkD4FKkgV.jpeg)
# Citation
If you find our repo helpful, please cite the following:
- [LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks](https://aclanthology.org/2024.acl-long.695)
+ ```bibtex
+ @inproceedings{wang-etal-2024-lora-flow,
+     title = "LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks",
+     author = "Wang, Hanqing and Ping, Bowen and Wang, Shuo and Han, Xu and Chen, Yun and Liu, Zhiyuan and Sun, Maosong",
+     editor = "Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek",
+     booktitle = "Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
+     month = aug,
+     year = "2024",
+     address = "Bangkok, Thailand",
+     publisher = "Association for Computational Linguistics",
+     url = "https://aclanthology.org/2024.acl-long.695",
+     doi = "10.18653/v1/2024.acl-long.695",
+     pages = "12871--12882",
+     abstract = "LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain, where different learned additional modules represent diverse skills. Combining existing LoRAs to address new tasks can enhance the reusability of learned LoRAs, particularly beneficial for tasks with limited annotated data. Most prior works on LoRA combination primarily rely on task-level weights for each involved LoRA, making different examples and tokens share the same LoRA weights. However, in generative tasks, different tokens may necessitate diverse skills to manage. Taking the Chinese math task as an example, understanding the problem description may depend more on the Chinese LoRA, while the calculation part may rely more on the math LoRA. To this end, we propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs. The weights at each step are determined by a fusion gate with extremely few parameters, which can be learned with only 200 training examples. Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights. This underscores the necessity of introducing dynamic fusion weights for LoRA combination.",
+ }
+ ```
+ <!-- [LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks](https://aclanthology.org/2024.acl-long.695) -->
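The method summary in the updated README (layer-wise fusion gates that project each layer's input hidden states into fusion weights over several LoRAs) can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch mock-up, not code from the LoRA-Flow repository: the names `LoRAModule`, `LayerwiseFusionGate`, and `fuse_lora_outputs`, the softmax normalization, and the rank/dimension defaults are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the official LoRA-Flow implementation.
# It mirrors the README's description: a per-layer gate projects the layer's
# input hidden states into fusion weights over k LoRA modules, and the LoRA
# outputs are combined token by token with those weights.
import torch
import torch.nn as nn


class LoRAModule(nn.Module):
    """A single low-rank adapter: x -> B(A(x))."""

    def __init__(self, d_model: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(d_model, rank, bias=False)  # A
        self.up = nn.Linear(rank, d_model, bias=False)    # B

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))


class LayerwiseFusionGate(nn.Module):
    """Maps hidden states (batch, seq, d_model) to fusion weights (batch, seq, k)."""

    def __init__(self, d_model: int, num_loras: int):
        super().__init__()
        # A single d_model -> k projection per layer, so the gate stays tiny.
        self.proj = nn.Linear(d_model, num_loras)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.proj(hidden), dim=-1)


def fuse_lora_outputs(hidden: torch.Tensor,
                      loras: nn.ModuleList,
                      gate: LayerwiseFusionGate) -> torch.Tensor:
    """Weight each LoRA's output with the gate's per-token fusion weights."""
    weights = gate(hidden)                                        # (B, T, k)
    outputs = torch.stack([lora(hidden) for lora in loras], -1)   # (B, T, d, k)
    return (outputs * weights.unsqueeze(2)).sum(dim=-1)           # (B, T, d)


if __name__ == "__main__":
    d_model, k = 4096, 2  # e.g. a language LoRA plus a math LoRA
    loras = nn.ModuleList([LoRAModule(d_model) for _ in range(k)])
    gate = LayerwiseFusionGate(d_model, k)
    hidden = torch.randn(1, 16, d_model)
    print(fuse_lora_outputs(hidden, loras, gate).shape)  # torch.Size([1, 16, 4096])
```

Under these assumptions, each gate adds only a d_model × k linear projection per layer, consistent with the paper's description of a fusion gate with very few parameters.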