LLMJapan/nvidia_AceInstruct-72B-exl2-4.0bpw


LLMJapan committed 11 days ago

Commit 9f8a75a · verified · 1 Parent(s): 6d2aa7b

Update README.md

Files changed (1)

README.md +72 -3

@@ -1,3 +1,72 @@
----
-license: cc-by-nc-4.0
----

+---
+quantized_by: LLMJapan
+pipeline_tag: text-generation
+license: cc-by-nc-4.0
+language:
+- en
+tags:
+- nvidia
+- AceInstruct
+- code
+- math
+- general_domain
+- instruct_model
+base_model: nvidia/AceInstruct-72B
+---
+## ExLlamaV2 Quantizations of AceInstruct-72B by NVIDIA
+Quantized with <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.2.8">turboderp's ExLlamaV2 v0.2.8</a>.
+
+Original model: https://huggingface.co/nvidia/AceInstruct-72B
+
+Example command for creating quantizations at other bpw values:
+```
+cd {your git clone directory}
+python convert.py -i {path to}/AceInstruct-72B -o {path to}/AceInstruct-72B/workingdir -cf {path to}/AceInstruct-72B/AceInstruct-72B-4bpw -b 4.0
+```
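To produce several bitrates, the command above can be generated once per target. The following is a minimal sketch, not part of the original README: it only builds the argument lists, using the same `-i`, `-o`, `-cf` and `-b` flags shown in the command. The directory names and the bpw values are hypothetical, and giving each target its own working directory is an assumption of this sketch, chosen so that runs do not share intermediate files.

```python
from pathlib import Path


def build_convert_command(model_dir, work_dir, out_dir, bpw):
    """Return the argument list for one convert.py run (flags as in the README)."""
    return [
        "python", "convert.py",
        "-i", str(model_dir),   # unquantized source model
        "-o", str(work_dir),    # working directory for intermediate files
        "-cf", str(out_dir),    # directory for the finished quantized model
        "-b", str(bpw),         # target bits per weight
    ]


# Hypothetical paths and target bitrates; adjust to your own layout.
model_dir = Path("models/AceInstruct-72B")
commands = [
    build_convert_command(
        model_dir,
        model_dir / f"workingdir-{bpw}bpw",
        model_dir / f"AceInstruct-72B-{bpw}bpw",
        bpw,
    )
    for bpw in (4.0, 6.0, 8.0)
]

for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from inside the ExLlamaV2 clone directory.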
+## Prompt format
+```
+<|im_start|>system
+{system_prompt}<|im_end|>
+<|im_start|>user
+{prompt}<|im_end|>
+<|im_start|>assistant
+```
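For callers that assemble the prompt by hand instead of relying on the tokenizer's chat template, the format above could be built as in the following sketch. The function name `build_prompt` is hypothetical, and omitting the system block when no system prompt is given is an assumption of this sketch.

```python
def build_prompt(user_prompt, system_prompt=None):
    """Assemble a single-turn prompt in the format shown above."""
    parts = []
    if system_prompt:
        # Optional system block; skipped when no system prompt is supplied.
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_prompt}<|im_end|>\n")
    # End with the opening of the assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(build_prompt("Write a function that reverses a string.", "You are a helpful assistant."))
```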
+## How to add your system prompt
+The default tokenizer_config.json does not include a system prompt. To add one, copy the following JSON into tokenizer_config.json and replace the sentence "You are AceInstruct developed by NVIDIA. You are helpful assistant." with your own system prompt.
+```
+"chat_template": "{{- '<|im_start|>system\\nYou are AceInstruct developed by NVIDIA. You are helpful assistant.<|im_end|>\\n' }}\n    {%- for message in messages %}\n{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n{{- '<|im_start|>assistant\n' }}\n{%- endif %}\n",
+```
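The replacement can also be scripted. The sketch below is one possible approach, not part of the original README: it assumes `tokenizer_config.json` already contains the `chat_template` entry shown above, and the helper name `set_system_prompt` is hypothetical. Because the sentence sits inside a single-quoted string in the template, the sketch refuses prompts that contain a single quote or a backslash instead of trying to escape them.

```python
import json
from pathlib import Path

# The placeholder sentence from the README's chat_template, copied verbatim.
DEFAULT_SENTENCE = "You are AceInstruct developed by NVIDIA. You are helpful assistant."


def set_system_prompt(config_path, system_prompt):
    """Swap the placeholder sentence in chat_template for a custom system prompt."""
    if "'" in system_prompt or "\\" in system_prompt:
        raise ValueError("system prompt must not contain single quotes or backslashes")

    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    template = config["chat_template"]
    if DEFAULT_SENTENCE not in template:
        raise ValueError("placeholder sentence not found in chat_template")

    config["chat_template"] = template.replace(DEFAULT_SENTENCE, system_prompt)
    path.write_text(json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8")


# Example call (the path is hypothetical):
# set_system_prompt("AceInstruct-72B-4bpw/tokenizer_config.json", "You are a concise coding assistant.")
```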
+## File information
+| quantization type       |   file size |
+| ----------------------- | ----------: |
+| 4.0bpw                  |   35.9 GiB  |
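As a rough cross-check when planning other bitrates, weight storage scales with parameters times bits per weight. The sketch below is a back-of-the-envelope estimate only: the parameter count of about 72.7 billion is an assumption not stated in this README, and the function name is hypothetical. The estimate comes out somewhat below the 35.9 GiB listed above, so treat it as a lower bound rather than a prediction of the final file size.

```python
def estimate_weight_size_gib(num_params, bits_per_weight):
    """Estimate weight storage in GiB as params * bits / 8 bytes."""
    return num_params * bits_per_weight / 8 / 2**30


# Assumed parameter count (not taken from this README).
ASSUMED_PARAMS = 72.7e9

for bpw in (4.0, 6.0, 8.0):
    print(f"{bpw} bpw: about {estimate_weight_size_gib(ASSUMED_PARAMS, bpw):.1f} GiB")
```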
+## Benchmark Results
+| | Qwen2.5-1.5B-Instruct | AceInstruct-1.5B | Qwen2.5-7B-Instruct | AceInstruct-7B | Qwen2.5-72B-Instruct | AceInstruct-72B |
+| --------- |:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|
+| HumanEval | 61.60 | 73.17 | 84.80 | 85.37 | 86.60 | 89.63 |
+| MBPP      | 63.20 | 65.76 | 79.20 | 74.32 | 88.20 | 83.66 |
+| GSM8K     | 73.20 | 80.44 | 91.60 | 93.10 | 95.80 | 96.36 |
+| MATH      | 55.20 | 60.34 | 75.50 | 76.40 | 83.10 | 84.50 |
+| MMLU      | 58.37 | 58.17 | 74.51 | 74.68 | 84.67 | 83.88 |
+| MMLU Pro  | 32.40 | 33.78 | 56.30 | 54.50 | 71.10 | 66.10 |
+| Average   | 57.33 | 61.94 | 76.99 | 76.40 | 84.91 | 84.02 |
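The Average row appears to be the unweighted mean of the six benchmark scores in each column. The short check below, using the two 72B columns copied from the table, illustrates that reading; it is a sketch and not part of the original README.

```python
from statistics import mean

# Scores copied from the table, in row order:
# HumanEval, MBPP, GSM8K, MATH, MMLU, MMLU Pro.
scores = {
    "Qwen2.5-72B-Instruct": [86.60, 88.20, 95.80, 83.10, 84.67, 71.10],
    "AceInstruct-72B": [89.63, 83.66, 96.36, 84.50, 83.88, 66.10],
}

averages = {name: round(mean(values), 2) for name, values in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```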
+## Credits
+Thanks to the NVIDIA team.