metadata
quantized_by: LLMJapan
pipeline_tag: text-generation
license: cc-by-nc-4.0
language:
- en
tags:
- nvidia
- AceInstruct
- code
- math
- general_domain
- instruct_model
base_model: nvidia/AceInstruct-72B
ExLlamaV2 Quantizations of AceInstruct-72B by nvidia
Using turboderp's ExLlamaV2 v0.2.8 for quantization.
Original model: https://huggingface.co/nvidia/AceInstruct-72B
Example quantization command for creating other bpw quantizations
cd {your git clone directory}
python convert.py -i {path to}/AceInstruct-72B -o {path to}/AceInstruct-72B/workingdir -cf {path to}/AceInstruct-72B/AceInstruct-72B-4bpw -b 4.0
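To produce several bpw variants in one pass, the command above can be generated per target. The following is a minimal sketch that only builds the argument list and does not run anything. It uses the same `-i`, `-o`, `-cf`, and `-b` flags as the example. The helper name `build_convert_command`, the `/models/...` path, and the choice of a separate working directory per target are assumptions made for illustration.

```python
from pathlib import Path


def build_convert_command(model_dir, bpw, convert_script="convert.py"):
    """Return the argv list for one convert.py run at the given bpw.

    Hypothetical helper: output directory names follow the
    "<model name>-<bpw>bpw" pattern used in the example command.
    """
    model_dir = Path(model_dir)
    label = f"{bpw:g}bpw"  # 4.0 -> "4bpw", 4.5 -> "4.5bpw"
    return [
        "python", convert_script,
        "-i", str(model_dir),                               # source model
        "-o", str(model_dir / f"workingdir-{label}"),       # assumed: one working dir per target
        "-cf", str(model_dir / f"{model_dir.name}-{label}"),  # compiled output folder
        "-b", str(float(bpw)),                              # target bits per weight
    ]


if __name__ == "__main__":
    # Placeholder path; replace with your own model location.
    for bpw in (4.0, 5.0, 6.0):
        print(" ".join(build_convert_command("/models/AceInstruct-72B", bpw)))
```

Each printed line can be run from the ExLlamaV2 clone directory, or the list can be passed to `subprocess.run` once the paths point at real locations.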
Prompt format
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
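When a client does not apply the chat template itself, the prompt string can be assembled by hand. This is a minimal sketch of the format above. The function name `build_prompt` is hypothetical, and treating the system block as optional is an assumption based on the default template having no system prompt.

```python
def build_prompt(user_prompt, system_prompt=None):
    """Assemble a single-turn prompt in the format shown above."""
    parts = []
    if system_prompt is not None:
        # Optional system block (assumption: omitted when no prompt is given).
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_prompt}<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Write a function that reverses a string.",
                       "You are a helpful assistant."))
```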
How to add your system prompt
Copy the following JSON and replace the sentence "You are AceInstruct developed by NVIDIA. You are helpful assistant." with your own system prompt. The default tokenizer_config.json does not include a system prompt.
tokenizer_config.json
"chat_template": "{{- '<|im_start|>system\\nYou are AceInstruct developed by NVIDIA. You are helpful assistant.<|im_end|>\\n' }}\n {%- for message in messages %}\n{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n{{- '<|im_start|>assistant\n' }}\n{%- endif %}\n",
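The replacement can also be scripted. The sketch below assumes tokenizer_config.json already contains the `chat_template` entry shown above, placeholder sentence included. The names `set_system_prompt` and `escape_for_jinja` are hypothetical, and the escaping step is a precaution for prompts that contain quotes, backslashes, or line breaks, since the text ends up inside a single-quoted Jinja string.

```python
import json
from pathlib import Path

# Placeholder sentence from the template above, copied verbatim.
DEFAULT_SENTENCE = "You are AceInstruct developed by NVIDIA. You are helpful assistant."


def escape_for_jinja(text):
    """Escape text so it can sit inside a single-quoted Jinja string literal."""
    return text.replace("\\", "\\\\").replace("'", "\\'").replace("\n", "\\n")


def set_system_prompt(config_path, system_prompt):
    """Replace the placeholder sentence in chat_template and save the file."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    template = config["chat_template"]
    if DEFAULT_SENTENCE not in template:
        raise ValueError("placeholder sentence not found in chat_template")
    config["chat_template"] = template.replace(
        DEFAULT_SENTENCE, escape_for_jinja(system_prompt)
    )
    path.write_text(json.dumps(config, ensure_ascii=False, indent=2),
                    encoding="utf-8")
    return config["chat_template"]
```

Calling `set_system_prompt("tokenizer_config.json", "You are a concise coding assistant.")` rewrites the file in place, so keeping a backup copy first is advisable.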
File information
quantization type | file size |
---|---|
4.0bpw | 35.9 GiB |
Benchmark Results
Benchmark | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B |
---|---|---|---|---|---|---|
HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 |
MBPP | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 |
GSM8K | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 |
MATH | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 |
MMLU | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 |
MMLU Pro | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 |
Average | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 |
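The Average row appears to be the unweighted mean of the six benchmark scores in each column. The sketch below recomputes it from the values in the table. Rounding half up to two decimals is an assumption, and the scores are kept as strings so that `Decimal` arithmetic stays exact.

```python
from decimal import Decimal, ROUND_HALF_UP

# Column order: Qwen2.5-1.5B-Instruct, AceInstruct-1.5B, Qwen2.5-7B-Instruct,
# AceInstruct-7B, Qwen2.5-72B-Instruct, AceInstruct-72B
SCORES = {
    "HumanEval": ["61.60", "73.17", "84.80", "85.37", "86.60", "89.63"],
    "MBPP":      ["63.20", "65.76", "79.20", "74.32", "88.20", "83.66"],
    "GSM8K":     ["73.20", "80.44", "91.60", "93.10", "95.80", "96.36"],
    "MATH":      ["55.20", "60.34", "75.50", "76.40", "83.10", "84.50"],
    "MMLU":      ["58.37", "58.17", "74.51", "74.68", "84.67", "83.88"],
    "MMLU Pro":  ["32.40", "33.78", "56.30", "54.50", "71.10", "66.10"],
}
REPORTED_AVERAGE = ["57.33", "61.94", "76.99", "76.40", "84.91", "84.02"]


def column_averages(scores):
    """Mean of each model column, rounded half up to two decimals (assumed)."""
    rows = [[Decimal(value) for value in row] for row in scores.values()]
    averages = []
    for column in zip(*rows):
        mean = sum(column) / len(column)
        averages.append(mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
    return averages


if __name__ == "__main__":
    for computed, reported in zip(column_averages(SCORES), REPORTED_AVERAGE):
        print(computed, reported)
```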
Credits
Thanks to the NVIDIA team.