Update README.md
README.md CHANGED
@@ -7,7 +7,7 @@ license: openrail
LegalBert-pt is a language model for the legal domain in the Portuguese language. The model was pre-trained to acquire domain specialization, and it can later be fine-tuned for specific tasks. Two versions of the model were created: one as a complement to the BERTimbau model, and the other from scratch. The effectiveness of the BERTimbau-based model was evident when analyzing the models' perplexity. Experiments were also carried out on the tasks of identifying legal entities and classifying legal petitions. The results show that the domain-specific language models outperform the generic language model on all tasks, suggesting that specializing the language model for the legal domain is an important factor in improving the accuracy of learning algorithms.
- Keywords: Language model, Legal Bert pt
+ Keywords: Language model, Legal Bert pt br, Legal domain, Portuguese Language Model
## Available models
|Model|Initial model|#Layers|#Params|
@@ -44,4 +44,22 @@ from transformers import AutoModel # or BertModel, for BERT without pretraining
model = AutoModelForPreTraining.from_pretrained('raquelsilveira/legalbertpt_fp')
tokenizer = AutoTokenizer.from_pretrained('raquelsilveira/legalbertpt_fp')
```
+
+ ## Cite as
+
+ @inproceedings{10.1007/978-3-031-45392-2_18,
+   author = {Silveira, Raquel and Ponte, Caio and Almeida, Vitor and Pinheiro, Vl\'{a}dia and Furtado, Vasco},
+   title = {LegalBert-pt: A Pretrained Language Model for the Brazilian Portuguese Legal Domain},
+   year = {2023},
+   isbn = {978-3-031-45391-5},
+   publisher = {Springer-Verlag},
+   address = {Berlin, Heidelberg},
+   url = {https://doi.org/10.1007/978-3-031-45392-2_18},
+   doi = {10.1007/978-3-031-45392-2_18},
+   booktitle = {Intelligent Systems: 12th Brazilian Conference, BRACIS 2023, Belo Horizonte, Brazil, September 25–29, 2023, Proceedings, Part III},
+   pages = {268–282},
+   numpages = {15},
+   keywords = {BERTimbau, BERT, Legal Texts, Language Models},
+   location = {Belo Horizonte, Brazil}
+ }
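For quick reference outside the diff: a minimal usage sketch of the checkpoint loaded above, assuming the published weights include BERT's masked-language-modeling head. The example sentence and `top_k` value are illustrative, not from the model card:

```python
from transformers import pipeline

# Build a fill-mask pipeline directly from the checkpoint name;
# the pipeline resolves the tokenizer and MLM head from the repo config.
fill_mask = pipeline("fill-mask", model="raquelsilveira/legalbertpt_fp")

# Illustrative Portuguese legal sentence with a masked token.
for pred in fill_mask("O réu foi condenado ao pagamento de [MASK].", top_k=5):
    print(pred["token_str"], round(pred["score"], 4))
```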
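The abstract above says the pre-trained model can later be adjusted for specific tasks such as legal petition classification. A minimal fine-tuning sketch with the Transformers `Trainer` is shown below; the toy dataset, label count, and hyperparameters are placeholders, not the paper's experimental setup:

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Placeholder data: two toy petition snippets with made-up labels.
# The paper's experiments use a real petition corpus instead.
data = Dataset.from_dict({
    "text": ["Requer a concessão de liminar.", "Solicita a penhora de bens."],
    "label": [0, 1],
})

tokenizer = AutoTokenizer.from_pretrained("raquelsilveira/legalbertpt_fp")
# num_labels=2 is a placeholder; set it to the number of petition classes.
model = AutoModelForSequenceClassification.from_pretrained(
    "raquelsilveira/legalbertpt_fp", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="legalbertpt-petitions",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=data,
)
trainer.train()
```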