raquelsilveira committed · verified
Commit 6474251 · 1 Parent(s): f86d6c4

Update README.md

Files changed (1): README.md (+20 -2)
README.md CHANGED
@@ -7,7 +7,7 @@ license: openrail

LegalBert-pt is a language model for the legal domain in the Portuguese language. The model was pre-trained to acquire domain specialization, and it can later be fine-tuned for specific tasks. Two versions of the model were created: one as a complement to the BERTimbau model, and the other trained from scratch. The effectiveness of the BERTimbau-based model was evident when analyzing the perplexity of the models. Experiments were also carried out on the tasks of identifying legal entities and classifying legal petitions. The results obtained with the domain-specific language models outperform those obtained with the generic language model in all tasks, suggesting that specializing the language model for the legal domain is an important factor for improving the accuracy of learning algorithms.

Keywords: Language model, Legal Bert pt br, Legal domain, Portuguese Language Model

## Available models

|Model|Initial model|#Layers|#Params|
@@ -44,4 +44,22 @@

```python
# AutoModelForPreTraining loads BERT with its pretraining heads;
# use AutoModel (or BertModel) instead for the bare encoder.
from transformers import AutoModelForPreTraining, AutoTokenizer

model = AutoModelForPreTraining.from_pretrained('raquelsilveira/legalbertpt_fp')
tokenizer = AutoTokenizer.from_pretrained('raquelsilveira/legalbertpt_fp')
```
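After loading, the checkpoint can be probed through its masked-language-modeling head. The sketch below is a minimal usage example, not from the model card: the legal sentence is a hypothetical placeholder, and it assumes the `BertForPreTraining`-style output, which exposes `prediction_logits`.

```python
import torch
from transformers import AutoModelForPreTraining, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('raquelsilveira/legalbertpt_fp')
model = AutoModelForPreTraining.from_pretrained('raquelsilveira/legalbertpt_fp')
model.eval()

# Hypothetical legal-domain sentence with one masked token.
text = f"O réu foi condenado ao pagamento de {tokenizer.mask_token} por danos morais."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# For BERT pretraining models, the masked-LM scores are in `prediction_logits`.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1][0]
top_ids = outputs.prediction_logits[0, mask_pos].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```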
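The experiments described above (legal named-entity recognition and petition classification) require task-specific heads that are not bundled with this checkpoint. Below is a minimal fine-tuning sketch under stated assumptions: the label set, example texts, and hyperparameters are hypothetical placeholders, not the paper's actual setup.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

num_labels = 3  # hypothetical number of petition classes
tokenizer = AutoTokenizer.from_pretrained('raquelsilveira/legalbertpt_fp')
# A fresh classification head is initialized on top of the pretrained encoder.
model = AutoModelForSequenceClassification.from_pretrained(
    'raquelsilveira/legalbertpt_fp', num_labels=num_labels
)

# Toy batch standing in for real petitions and their labels.
texts = ["Petição inicial de cobrança de aluguéis.", "Recurso de apelação cível."]
labels = torch.tensor([0, 1])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # the head computes cross-entropy loss
outputs.loss.backward()
optimizer.step()
```

For the entity-recognition task, `AutoModelForTokenClassification` plays the same role, with per-token labels.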

## Cite as

```bibtex
@inproceedings{10.1007/978-3-031-45392-2_18,
  author    = {Silveira, Raquel and Ponte, Caio and Almeida, Vitor and Pinheiro, Vl\'{a}dia and Furtado, Vasco},
  title     = {LegalBert-pt: A Pretrained Language Model for the Brazilian Portuguese Legal Domain},
  year      = {2023},
  isbn      = {978-3-031-45391-5},
  publisher = {Springer-Verlag},
  address   = {Berlin, Heidelberg},
  url       = {https://doi.org/10.1007/978-3-031-45392-2_18},
  doi       = {10.1007/978-3-031-45392-2_18},
  booktitle = {Intelligent Systems: 12th Brazilian Conference, BRACIS 2023, Belo Horizonte, Brazil, September 25--29, 2023, Proceedings, Part III},
  pages     = {268--282},
  numpages  = {15},
  keywords  = {BERTimbau, BERT, Legal Texts, Language Models},
  location  = {Belo Horizonte, Brazil}
}
```