---
quantized_by: LLMJapan
pipeline_tag: text-generation
license: cc-by-nc-4.0
language:
- en
tags:
- nvidia
- AceInstruct
- code
- math
- general_domain
- instruct_model
base_model: nvidia/AceInstruct-72B
---

## ExLlamaV2 Quantizations of AceInstruct-72B by nvidia

Quantized using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.2.8">turboderp's ExLlamaV2 v0.2.8</a>.

Original model: https://huggingface.co/nvidia/AceInstruct-72B

Example command for creating quantizations at other bpw (bits per weight) values:
```
cd {your git clone directory}
python convert.py -i {path to}/AceInstruct-72B -o {path to}/AceInstruct-72B/workingdir -cf {path to}/AceInstruct-72B/AceInstruct-72B-3bpw -b 3.0
```

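To produce several bpw variants, the command above can be generated programmatically. The following is a minimal sketch, not part of ExLlamaV2: `build_convert_command` is a hypothetical helper, and the output folder naming for non-integer bpw values (for example `4.25bpw`) is an assumption that extends the `-3bpw` pattern from the example.

```python
import shlex
from pathlib import Path


def build_convert_command(model_dir: str, bpw: float) -> list[str]:
    """Return the convert.py argument list for one target bpw."""
    model = Path(model_dir)
    # Hypothetical naming rule: 3.0 -> "3bpw", 4.25 -> "4.25bpw".
    label = f"{bpw:g}bpw"
    return [
        "python", "convert.py",
        "-i", str(model),                             # unquantized model directory
        "-o", str(model / "workingdir"),              # working directory
        "-cf", str(model / f"{model.name}-{label}"),  # compiled output directory
        "-b", str(float(bpw)),                        # target bits per weight
    ]


if __name__ == "__main__":
    # Placeholder path; replace it with your own model location.
    for bpw in (3.0, 4.0):
        print(shlex.join(build_convert_command("{path to}/AceInstruct-72B", bpw)))
```

Each returned list can be passed to `subprocess.run(cmd, cwd="{your git clone directory}", check=True)`. The sketch keeps the single working directory from the example; whether it has to be cleared between runs is not covered here.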
## Prompt format

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

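If you assemble the prompt string yourself instead of relying on the chat template, the format above can be sketched as follows. `build_prompt` is a hypothetical helper, and dropping the system block when no system prompt is given is an assumption.

```python
def build_prompt(user_prompt: str, system_prompt: str = "") -> str:
    """Assemble a single-turn prompt in the format shown above."""
    parts = []
    if system_prompt:
        # Assumption: the system block is omitted when no system prompt is given.
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_prompt}<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Write a haiku about GPUs.", "You are a concise assistant."))
```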
## How to add your system prompt

The default tokenizer_config.json does not include a system prompt.
To add one, copy the following JSON entry into tokenizer_config.json and replace the sentence "You are AceInstruct developed by NVIDIA. You are helpful assistant." with your own system prompt.

tokenizer_config.json
```
"chat_template": "{{- '<|im_start|>system\\nYou are AceInstruct developed by NVIDIA. You are helpful assistant.<|im_end|>\\n' }}\n {%- for message in messages %}\n{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n{{- '<|im_start|>assistant\n' }}\n{%- endif %}\n",
```

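The same change can be scripted rather than made by hand. The sketch below is one possible approach using only the Python standard library: `make_chat_template` and `set_system_prompt` are hypothetical helpers, the template mirrors the entry above, and escaping backslashes and single quotes in the prompt is an added precaution so that it stays a valid Jinja string literal.

```python
import json
from pathlib import Path


def make_chat_template(system_prompt: str) -> str:
    """Build a chat template with a fixed system prompt, mirroring the entry above."""
    # Escape backslashes and single quotes for use inside a Jinja string literal.
    escaped = system_prompt.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "{{- '<|im_start|>system\\n" + escaped + "<|im_end|>\\n' }}\n"
        "{%- for message in messages %}\n"
        "{{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n"
        "{%- endfor %}\n"
        "{%- if add_generation_prompt %}\n"
        "{{- '<|im_start|>assistant\\n' }}\n"
        "{%- endif %}\n"
    )


def set_system_prompt(config_path: str, system_prompt: str) -> None:
    """Overwrite the chat_template entry in tokenizer_config.json in place."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["chat_template"] = make_chat_template(system_prompt)
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )


# Example call (placeholder path; back up the file first):
# set_system_prompt("{path to}/tokenizer_config.json", "You are a concise coding assistant.")
```

Writing the file through `json.dumps` takes care of the JSON-level escaping, so only the Jinja-level escaping has to be handled explicitly.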
## File information

| quantization type       | file size   |
| ----------------------- | ----------: |
| 3.0bpw                  | 27.8 GiB    |

## Benchmark Results

|           | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B |
| --------- |:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|
| HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 |
| MBPP      | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 |
| GSM8K     | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 |
| MATH      | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 |
| MMLU      | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 |
| MMLU Pro  | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 |
| Average   | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 |

## Credits

Thanks to the NVIDIA team.