arxiv:2401.06620

TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models

Published on Jan 12, 2024

Abstract

The world's more than 7000 languages are written in at least 293 scripts. For various reasons, many closely related languages use different scripts, which makes it difficult for multilingual pretrained language models (mPLMs) to learn crosslingual knowledge through lexical overlap. As a consequence, mPLMs face a script barrier: representations from different scripts lie in different subspaces, which can make crosslingual transfer between languages written in different scripts suboptimal. To address this problem, we propose TransliCo, a framework that optimizes the Transliteration Contrastive Modeling (TCM) objective to fine-tune an mPLM by contrasting sentences in its training data with their transliterations in a unified script (in our case Latin), which enhances uniformity in the representation space across scripts. Using Glot500-m, an mPLM pretrained on over 500 languages, as our source model, we fine-tune it on a small portion (5%) of its training data and refer to the resulting model as Furina. We show that Furina not only better aligns representations from distinct scripts but also outperforms the original Glot500-m on various zero-shot crosslingual transfer tasks. Additionally, we achieve consistent improvements in a case study on the Indic group, where the languages exhibit areal features but use different scripts. We make our code and models publicly available.
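
To make the TCM idea concrete, the sketch below shows one plausible formulation: a symmetric in-batch InfoNCE loss that pulls each sentence's representation toward that of its Latin transliteration. The mean pooling, the temperature value, and the Hub model ID "cis-lmu/glot500-base" are assumptions for illustration, not details taken from the paper's released code.

```python
# Minimal sketch of a transliteration-based contrastive (InfoNCE-style) objective
# in the spirit of TCM. Assumptions for illustration (not from the paper's code):
# mean pooling, in-batch negatives, temperature 0.05, and the Hub model ID
# "cis-lmu/glot500-base" for Glot500-m.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

model_name = "cis-lmu/glot500-base"  # assumed Hub ID for Glot500-m
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def embed(sentences):
    """Mean-pool the last hidden states over non-padding tokens."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state               # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()    # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)     # (B, H)

def tcm_loss(originals, transliterations, temperature=0.05):
    """Pull each sentence toward its Latin transliteration; the other pairs
    in the batch serve as negatives (symmetric InfoNCE)."""
    z_src = F.normalize(embed(originals), dim=-1)
    z_lat = F.normalize(embed(transliterations), dim=-1)
    logits = z_src @ z_lat.T / temperature                   # (B, B) similarity matrix
    labels = torch.arange(logits.size(0))                    # matching pairs lie on the diagonal
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2

# Toy batch: original-script sentences paired with their romanizations.
loss = tcm_loss(
    ["नमस्ते दुनिया", "Привет, мир"],
    ["namaste duniya", "privet, mir"],
)
loss.backward()  # gradients flow into the mPLM during fine-tuning
```

Per the abstract, this contrastive fine-tuning is applied to only about 5% of Glot500-m's training data to obtain Furina; the exact training recipe is described in the paper.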

