Commit e80f685 by SJ-Ray · 1 Parent(s): c215d2a

Update README.md

  1. README.md +25 -0
README.md CHANGED
@@ -1,3 +1,28 @@
  ---
  license: apache-2.0
  ---
+ <h2>Re-Punctuate:</h2>
+
+ Re-Punctuate is a T5 model that attempts to correct capitalization and punctuation in sentences.
+
+ <h3>Dataset:</h3>
+
+ The DialogSum dataset (115,056 records) was used to fine-tune the model for punctuation and capitalization correction.
+
+ <h3>Usage:</h3>
+
+ <pre>
+
+ from transformers import T5Tokenizer, TFT5ForConditionalGeneration
+
+ # Load the tokenizer and the TensorFlow model from the Hugging Face Hub
+ tokenizer = T5Tokenizer.from_pretrained('SJ-Ray/Re-Punctuate')
+ model = TFT5ForConditionalGeneration.from_pretrained('SJ-Ray/Re-Punctuate')
+
+ input_text = 'the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination'
+ # Prepend the "punctuate: " task prefix and encode as TensorFlow tensors
+ inputs = tokenizer.encode("punctuate: " + input_text, return_tensors="tf")
+ # Generate the corrected token sequence
+ result = model.generate(inputs)
+ # Decode the first (and only) sequence, dropping special tokens
+ decoded_output = tokenizer.decode(result[0], skip_special_tokens=True)
+ print(decoded_output)
+
+ </pre>
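The usage snippet handles one sentence at a time. A minimal sketch of a batched variant follows, assuming the same `punctuate: ` prefix applies to every input. The helper names `build_prompts` and `repunctuate` and the `max_length=128` default are hypothetical choices for illustration, not part of the README.

```python
PREFIX = "punctuate: "  # task prefix taken from the README usage snippet


def build_prompts(sentences):
    """Prepend the task prefix to each non-empty sentence (hypothetical helper)."""
    return [PREFIX + s.strip() for s in sentences if s.strip()]


def repunctuate(sentences, model_id="SJ-Ray/Re-Punctuate", max_length=128):
    """Run the model on a batch of sentences (hypothetical helper).

    max_length=128 is an assumed cap on generated tokens, not a value
    stated in the README.
    """
    # Imported lazily so build_prompts works without TensorFlow installed.
    from transformers import T5Tokenizer, TFT5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = TFT5ForConditionalGeneration.from_pretrained(model_id)

    # Pad to the longest prompt so the batch forms a rectangular tensor.
    batch = tokenizer(build_prompts(sentences), return_tensors="tf", padding=True)
    outputs = model.generate(
        batch["input_ids"],
        attention_mask=batch["attention_mask"],
        max_length=max_length,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `repunctuate(["first sentence", "second sentence"])` would download the model on first use. Passing an explicit `max_length` may help if longer sentences come back truncated, though whether that happens depends on the model's generation config.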
+ <h4> Example: </h4>
+ <b>Input:</b> the story of this brave brilliant athlete whose very being was questioned so publicly is one that still captures the imagination <br>
+ <b>Output:</b> The story of this brave, brilliant athlete, whose very being was questioned so publicly, is one that still captures the imagination.
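The input/output pair above hints at how training pairs for this kind of model could be derived from punctuated text. The README does not describe the actual preprocessing, so the following is only a sketch under that assumption. The function names `strip_punctuation_and_case` and `make_pair` are hypothetical.

```python
import string


def strip_punctuation_and_case(text: str) -> str:
    """Lowercase text and drop ASCII punctuation (hypothetical helper).

    Note: this also removes apostrophes and hyphens, so "don't" becomes
    "dont". A real pipeline might want to keep those characters.
    """
    table = str.maketrans("", "", string.punctuation)
    # split/join collapses any runs of whitespace left behind.
    return " ".join(text.translate(table).lower().split())


def make_pair(punctuated: str):
    """Return an assumed (model_input, target) pair for fine-tuning."""
    return ("punctuate: " + strip_punctuation_and_case(punctuated), punctuated)


target = (
    "The story of this brave, brilliant athlete, whose very being was "
    "questioned so publicly, is one that still captures the imagination."
)
source, _ = make_pair(target)
print(source)
```

Applied to the example output, this reproduces the example input with the `punctuate: ` prefix prepended.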