metadata
quantized_by: LLMJapan
pipeline_tag: text-generation
license: cc-by-nc-4.0
language:
  - en
tags:
  - nvidia
  - AceInstruct
  - code
  - math
  - general_domain
  - instruct_model
base_model: nvidia/AceInstruct-72B

ExLlamaV2 quantizations of AceInstruct-72B by NVIDIA

Quantized with turboderp's ExLlamaV2 v0.2.8.

Original model: https://huggingface.co/nvidia/AceInstruct-72B

Example quantization command for creating other bpw quantizations

cd {your git clone directory}
python convert.py -i {path to}/AceInstruct-72B -o {path to}/AceInstruct-72B/workingdir -cf {path to}/AceInstruct-72B/AceInstruct-72B-4bpw -b 4.0
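
To produce a different bpw, only the `-b` value and the output folder name need to change. The following is a minimal sketch that assembles the same command for an arbitrary bpw value. The helper name `build_convert_command`, the `/models/...` path, and the `{model name}-{bpw}bpw` folder naming are illustrative assumptions, not part of ExLlamaV2. The script only prints the command; it does not run the conversion.

```python
import shlex
from pathlib import PurePosixPath


def build_convert_command(model_dir: str, bpw: float) -> list[str]:
    """Build the convert.py argument list for one target bits-per-weight value."""
    model = PurePosixPath(model_dir)
    # Assumed naming scheme: 4.0 -> "AceInstruct-72B-4bpw", 4.5 -> "AceInstruct-72B-4.5bpw".
    label = f"{bpw:g}bpw"
    return [
        "python", "convert.py",
        "-i", str(model),                              # input: original model directory
        "-o", str(model / "workingdir"),               # working directory, as in the command above
        "-cf", str(model / f"{model.name}-{label}"),   # folder for the finished quantization
        "-b", str(bpw),                                # target bits per weight
    ]


if __name__ == "__main__":
    # Hypothetical model location; replace with your own path.
    command = build_convert_command("/models/AceInstruct-72B", 5.0)
    # Run the printed line from inside your ExLlamaV2 clone directory.
    print(shlex.join(command))
```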

Prompt format

<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
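
When calling the model without a chat template, the prompt string has to be assembled by hand. A minimal sketch of filling in the format above follows; the function name `build_prompt` and the example strings are illustrative assumptions.

```python
def build_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the prompt format shown above with a system prompt and a user prompt."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"  # left open so the model writes the reply
    )


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "What is 12 * 7?"))
```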

How to add your system prompt

Copy the following JSON entry into tokenizer_config.json and replace the sentence "You are AceInstruct developed by NVIDIA. You are helpful assistant." with your own system prompt. The default tokenizer_config.json does not include a system prompt.

tokenizer_config.json

"chat_template": "{{- '<|im_start|>system\\nYou are AceInstruct developed by NVIDIA. You are helpful assistant.<|im_end|>\\n' }}\n    {%- for message in messages %}\n{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n{{- '<|im_start|>assistant\n' }}\n{%- endif %}\n",
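
Editing the escaped JSON string by hand is error-prone, so the replacement can also be scripted. The sketch below rebuilds the template above around a custom system prompt and writes it into tokenizer_config.json. The function names and the file path in the usage example are assumptions for illustration; the escaping step is there because the prompt ends up inside a single-quoted Jinja string literal.

```python
import json
from pathlib import Path


def make_chat_template(system_prompt: str) -> str:
    """Return the chat template shown above with a custom system prompt."""
    # Escape backslashes, single quotes, and newlines so the prompt stays
    # a valid single-quoted Jinja string literal.
    escaped = (
        system_prompt.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\n", "\\n")
    )
    return (
        "{{- '<|im_start|>system\\n" + escaped + "<|im_end|>\\n' }}\n"
        "    {%- for message in messages %}\n"
        "{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n"
        "{%- endfor %}\n"
        "{%- if add_generation_prompt %}\n"
        "{{- '<|im_start|>assistant\n' }}\n"
        "{%- endif %}\n"
    )


def set_system_prompt(config_path: str, system_prompt: str) -> None:
    """Overwrite the chat_template entry of a tokenizer_config.json file."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["chat_template"] = make_chat_template(system_prompt)
    # json.dumps takes care of the JSON-level escaping of the template string.
    path.write_text(
        json.dumps(config, ensure_ascii=False, indent=2) + "\n", encoding="utf-8"
    )


if __name__ == "__main__":
    # Hypothetical path; point this at the tokenizer_config.json of your copy.
    set_system_prompt("tokenizer_config.json", "You are a concise coding assistant.")
```

Keep a backup of the original file before running this, since the script rewrites the whole JSON document.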

File information

| Quantization type | File size |
| --- | --- |
| 4.0bpw | 35.9 GiB |

Benchmark Results

| Benchmark | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B |
| --- | --- | --- | --- | --- | --- | --- |
| HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 |
| MBPP | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 |
| GSM8K | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 |
| MATH | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 |
| MMLU | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 |
| MMLU Pro | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 |
| Average | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 |
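
The Average row appears to be the unweighted mean of the six benchmark scores in each column. A short sketch that recomputes it from the figures above (the variable names are illustrative):

```python
# Scores per model in the order: HumanEval, MBPP, GSM8K, MATH, MMLU, MMLU Pro.
scores = {
    "Qwen2.5-1.5B-Instruct": [61.60, 63.20, 73.20, 55.20, 58.37, 32.40],
    "AceInstruct-1.5B": [73.17, 65.76, 80.44, 60.34, 58.17, 33.78],
    "Qwen2.5-7B-Instruct": [84.80, 79.20, 91.60, 75.50, 74.51, 56.30],
    "AceInstruct-7B": [85.37, 74.32, 93.10, 76.40, 74.68, 54.50],
    "Qwen2.5-72B-Instruct": [86.60, 88.20, 95.80, 83.10, 84.67, 71.10],
    "AceInstruct-72B": [89.63, 83.66, 96.36, 84.50, 83.88, 66.10],
}

# Values from the Average row, for comparison.
reported = {
    "Qwen2.5-1.5B-Instruct": 57.33,
    "AceInstruct-1.5B": 61.94,
    "Qwen2.5-7B-Instruct": 76.99,
    "AceInstruct-7B": 76.40,
    "Qwen2.5-72B-Instruct": 84.91,
    "AceInstruct-72B": 84.02,
}


def average(values: list[float]) -> float:
    """Unweighted mean of the benchmark scores."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for model, values in scores.items():
        print(f"{model}: computed {average(values):.2f}, reported {reported[model]:.2f}")
```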

Credits

Thanks to the NVIDIA team.

