Efficient LLM for Taiwan

Taiwan ELM

Taiwan ELM is a family of efficient LLMs for Taiwan based on apple/OpenELM. The project aims to provide an efficient model for researchers who lack access to large-scale computing resources.

The models are trained with a custom fork of LLaMA-Factory on 2B Traditional Chinese tokens and 500K instruction samples. We will extend training to larger datasets and different base models if there is sufficient demand.

What is being released?

We release both pre-trained base models and instruction-tuned variants with 270M and 1.1B parameters. The datasets used to train the base and instruction-tuned models are released alongside the models.

List of released models:

List of released datasets:

Usage Examples

We adopt the LLaMA2 template:

<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]
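As an illustration, the template above can be filled in with plain string formatting. This is a minimal sketch: `build_prompt` is a hypothetical helper name, not part of the released code, and it assumes the blank line between `<</SYS>>` and the user message is significant, as in the template shown.

```python
def build_prompt(user_message: str, system_prompt: str) -> str:
    """Fill the LLaMA2-style template shown above.

    Hypothetical helper for illustration; not part of the released code.
    """
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n"
        "\n"
        f"{user_message} [/INST]"
    )


if __name__ == "__main__":
    # Example: a Traditional Chinese question with an English system prompt.
    print(build_prompt("台灣最高的山是哪一座？", "You are a helpful assistant."))
```

The returned string already contains the literal `<s>` marker, so a tokenizer that also prepends a beginning-of-sequence token may need its automatic special tokens disabled to avoid a duplicate.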

The model can be loaded via AutoModelForCausalLM with trust_remote_code=True:

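from transformers import AutoModelForCausalLM

# trust_remote_code=True is required because the repository ships custom model code
taiwanelm_270m = AutoModelForCausalLM.from_pretrained("liswei/Taiwan-ELM-270M", trust_remote_code=True)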
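Once the model is loaded, a generation call might look like the sketch below. It rests on several assumptions: that a tokenizer can be loaded from the same repository with `AutoTokenizer`, that the prompt already follows the template above (including the literal `<s>`, hence `add_special_tokens=False`), and that greedy decoding with the default settings is acceptable. `generate_reply` is a hypothetical helper name.

```python
def generate_reply(
    prompt: str,
    model_id: str = "liswei/Taiwan-ELM-270M-Instruct",
    max_new_tokens: int = 128,
) -> str:
    """Generate a completion for an already formatted prompt.

    Hypothetical helper; assumes the repo also hosts a compatible tokenizer.
    """
    # Imported lazily so that defining this function does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    # The prompt is assumed to contain "<s>" already, so skip automatic special tokens.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply(...)` downloads the weights on first use, so it needs network access and the `transformers` and `torch` packages installed.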

We also support additional generation methods and speculative generation. For reference, see OpenELM#usage.
