Update README.md
language:
- en
---

# Llama3-8B Lora adapter for Galician language

This repository houses a specialized LoRA (Low-Rank Adaptation) adapter designed to fine-tune Meta's LLaMA 3 8B Instruct model for applications involving the Galician language. The adapter efficiently adapts the pre-trained model, which was initially trained on a broad range of data and languages, to better understand and generate text in Galician.
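The low-rank idea behind this efficiency can be sketched numerically: instead of updating a layer's full weight matrix `W`, LoRA trains two small matrices `A` and `B` and adds their scaled product to the frozen `W`. The sizes below are illustrative only, not the actual LLaMA 3 dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes (NOT the real LLaMA 3 dimensions): r << d.
d_out, d_in, r = 64, 64, 8

W = rng.standard_normal((d_out, d_in))   # frozen pre-trained weight

# LoRA trains only A and B; their product is the weight update.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))                 # B starts at zero: training begins at W
alpha = 16                               # LoRA scaling hyperparameter

W_adapted = W + (alpha / r) * (B @ A)

full_params = W.size                     # 4096 values to train in full fine-tuning
lora_params = A.size + B.size            # 1024 values to train with LoRA
```

With rank 8, the adapter trains 1024 values per layer instead of 4096, which is why LoRA fine-tuning fits on far smaller hardware than full fine-tuning.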
## Adapter Description

This LoRA adapter has been fine-tuned specifically to understand and generate text in Galician. It was refined using a modified version of the [irlab-udc/alpaca_data_galician](https://huggingface.co/datasets/irlab-udc/alpaca_data_galician) dataset, enriched with synthetic data to enhance its text generation and comprehension capabilities in specific contexts.
- **Base Model**: [Unsloth's 4-bit build of Meta's LLaMA 3 8B Instruct](https://huggingface.co/unsloth/llama-3-8b-Instruct-bnb-4bit)
- **Fine-Tuning Platform**: LLaMA Factory
- **Infrastructure**: Finisterrae III supercomputer, CESGA (Galicia, Spain)
- **Dataset**: [irlab-udc/alpaca_data_galician](https://huggingface.co/datasets/irlab-udc/alpaca_data_galician) (with modifications)
- **Fine-Tuning Objective**: To improve text comprehension and generation in Galician.
```
User: Cantos habitantes ten Galicia?
Assistant: Segundo as últimas estimacións, Galicia ten uns 2,8 millóns de habitantes.
```

## How to Use the Adapter
To use this adapter, follow the example code provided below. Ensure you have the necessary libraries installed (e.g., Hugging Face's `transformers`).

### Installation

Download the adapter from Hugging Face:

```bash
git clone https://huggingface.co/abrahammg/Llama3-8B-Galician-Chat-Lora
```
Install dependencies:

```bash
pip install transformers bitsandbytes "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git" llmtuner xformers
```
### Run the adapter

Create a Python script (e.g. `run_model.py`):

```python
from llmtuner import ChatModel
from llmtuner.extras.misc import torch_gc

# ... (the ChatModel initialization with the base model and this adapter
# is not shown in this diff excerpt) ...

messages = []
while True:
    query = input("\nUser: ")
    if query.strip() == "exit":
        break
    if query.strip() == "clear":
        messages = []
        torch_gc()
        print("History has been removed.")
        continue

    messages.append({"role": "user", "content": query})
    print("Assistant: ", end="", flush=True)
    response = ""
    for new_text in chat_model.stream_chat(messages):
        print(new_text, end="", flush=True)
        response += new_text
    print()
    messages.append({"role": "assistant", "content": response})

torch_gc()
```
and run it:

```bash
python run_model.py
```
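The loop's history handling is independent of the model: `stream_chat` only needs to yield text chunks. A stub in place of llmtuner's `ChatModel` (the `FakeChatModel` below is purely illustrative, not part of the library) shows how the `messages` list grows by one user turn and one assistant turn per exchange:

```python
from typing import Dict, Iterator, List

class FakeChatModel:
    """Stand-in for llmtuner's ChatModel: streams a canned reply in chunks."""
    def stream_chat(self, messages: List[Dict[str, str]]) -> Iterator[str]:
        reply = f"echo: {messages[-1]['content']}"
        for i in range(0, len(reply), 4):     # yield 4-character chunks
            yield reply[i:i + 4]

chat_model = FakeChatModel()
messages = []

# Same pattern as run_model.py: append the user turn, accumulate the
# streamed answer, then append the assistant turn.
for query in ["ola", "que tal"]:
    messages.append({"role": "user", "content": query})
    response = ""
    for new_text in chat_model.stream_chat(messages):
        response += new_text
    messages.append({"role": "assistant", "content": response})

print(len(messages))           # 4: two user turns and two assistant turns
print(messages[1]["content"])  # echo: ola
```

Because each assistant turn is appended back into `messages`, later calls to `stream_chat` see the whole conversation, which is what makes the `clear` command in the script above necessary for resetting context.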
## Citation

```markdown
…
```

- [meta-llama/llama3](https://github.com/meta-llama/llama3)
- [hiyouga/LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- [irlab-udc/alpaca_data_galician](https://huggingface.co/datasets/irlab-udc/alpaca_data_galician)