KomeijiForce committed on
Commit df8dc88 · 1 Parent(s): ad00394

Update README.md

Files changed (1):
  1. README.md +57 -48
README.md CHANGED
@@ -1,51 +1,60 @@
 ---
-tags:
-- generated_from_trainer
-model-index:
-- name: bart-base-emolm-translate-rev
-  results: []
+datasets:
+- KomeijiForce/Text2Emoji
+language:
+- en
+metrics:
+- bertscore
+pipeline_tag: text2text-generation
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# bart-base-emolm-translate-rev
-
-This model is a fine-tuned version of [./saved_models/bart-base](https://huggingface.co/./saved_models/bart-base) on an unknown dataset.
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 3e-05
-- train_batch_size: 16
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 2000
-- num_epochs: 2.0
-
-### Training results
-
-
-
-### Framework versions
-
-- Transformers 4.29.2
-- Pytorch 2.0.0+cu117
-- Datasets 2.12.0
-- Tokenizers 0.12.1
+# EmojiLM
+
+This is a [BART](https://huggingface.co/facebook/bart-base) model pre-trained on the [Text2Emoji](https://huggingface.co/datasets/KomeijiForce/Text2Emoji) dataset to translate emojis into text.
+
+For instance, "🍕😍" will be translated into "I love pizza".
+
+An example implementation for translation:
+
+```python
+from transformers import BartTokenizer, BartForConditionalGeneration
+
+def translate(sentence, **argv):
+    inputs = tokenizer(sentence, return_tensors="pt")
+    generated_ids = generator.generate(inputs["input_ids"], **argv)
+    decoded = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
+    return decoded
+
+path = "KomeijiForce/bart-base-emojilm-e2t"
+tokenizer = BartTokenizer.from_pretrained(path)
+generator = BartForConditionalGeneration.from_pretrained(path)
+
+sentence = "🍣🍱😋"
+decoded = translate(sentence, num_beams=4, do_sample=True, max_length=100)
+print(decoded)
+```
+
+You will probably get some output like "Sushi is my go-to comfort food."
+
+If you find this model & dataset resource useful, please consider citing our paper:
+
+```
+@article{DBLP:journals/corr/abs-2311-01751,
+  author     = {Letian Peng and
+                Zilong Wang and
+                Hang Liu and
+                Zihan Wang and
+                Jingbo Shang},
+  title      = {EmojiLM: Modeling the New Emoji Language},
+  journal    = {CoRR},
+  volume     = {abs/2311.01751},
+  year       = {2023},
+  url        = {https://doi.org/10.48550/arXiv.2311.01751},
+  doi        = {10.48550/ARXIV.2311.01751},
+  eprinttype = {arXiv},
+  eprint     = {2311.01751},
+  timestamp  = {Tue, 07 Nov 2023 18:17:14 +0100},
+  biburl     = {https://dblp.org/rec/journals/corr/abs-2311-01751.bib},
+  bibsource  = {dblp computer science bibliography, https://dblp.org}
+}
+```
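
Since the updated card declares `pipeline_tag: text2text-generation`, the same checkpoint can also be driven through the higher-level `pipeline` API instead of loading the tokenizer and model separately. The sketch below is not part of the commit; it assumes the `KomeijiForce/bart-base-emojilm-e2t` repository named in the card is reachable (the weights are downloaded on first use):

```python
from transformers import pipeline

# Load the card's checkpoint as a text2text-generation pipeline;
# the pipeline bundles the tokenizer and the BART model together.
translator = pipeline("text2text-generation",
                      model="KomeijiForce/bart-base-emojilm-e2t")

# Translate an emoji sequence into English text, mirroring the card's example.
outputs = translator("🍕😍", num_beams=4, max_length=100)
print(outputs[0]["generated_text"])
```

Generation keyword arguments such as `num_beams` and `max_length` are forwarded to `generate`, so they behave the same as in the `translate` helper shown in the diff above.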