---
pipeline_tag: text-generation
---
Simbolo's Myanmar SAR GPT is pre-trained with the GPT-2 architecture on a dataset of 1 million Burmese text samples. It is intended as a foundational pre-trained model for the Burmese language, facilitating fine-tuning for downstream tasks such as creative writing, chatbots, and machine translation.

### How to use
```python
# Install the dependency first: pip install transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Simbolo-Servicio/myanmar-sar-gpt")
model = AutoModelForCausalLM.from_pretrained("Simbolo-Servicio/myanmar-sar-gpt")
input_text = "မင်္ဂလာပါ"  # example prompt; Burmese for "hello"
input_ids = tokenizer.encode(input_text, return_tensors='pt')
output = model.generate(input_ids, max_length=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

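The `generate` call above uses greedy decoding by default; passing `do_sample=True` together with `top_k` and `temperature` usually yields more varied text. As a rough standalone illustration of what top-k/temperature sampling does at each decoding step (a toy sketch in pure Python, not the `transformers` internals — `sample_top_k` and the 4-token logit vector are invented for this example):

```python
import math
import random

def sample_top_k(logits, k=2, temperature=0.8, rng=None):
    """Pick one index from `logits` via top-k + temperature sampling.

    Toy sketch of the strategy behind
    model.generate(..., do_sample=True, top_k=k, temperature=temperature);
    real implementations operate on full-vocabulary tensors.
    """
    rng = rng or random.Random(0)
    # Keep only the k highest-scoring candidate tokens.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    scaled = [logits[i] / temperature for i in top]
    # Numerically stable softmax over the surviving candidates.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Sample an index in proportion to its softmax probability.
    r, acc = rng.random(), 0.0
    for idx, p in zip(top, probs):
        acc += p
        if r < acc:
            return idx
    return top[-1]

logits = [1.0, 3.5, 0.2, 2.9]  # toy scores for a 4-token vocabulary
print(sample_top_k(logits, k=2))  # always one of the top-k indices, here 1 or 3
```

With `k=1` this degenerates to greedy decoding (always the argmax token), which is what the plain `generate(input_ids, max_length=100)` call does.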
### Limitations and bias