LLMJapan
/

nvidia_AceInstruct-72B-exl2-3.0bpw

Text Generation

Model card Files Files and versions Community

nvidia_AceInstruct-72B-exl2-3.0bpw / README.md

LLMJapan's picture

Update README.md

0cec7e2 verified 11 days ago

|

history blame contribute delete

2.33 kB

	---
	quantized_by: LLMJapan
	pipeline_tag: text-generation
	license: cc-by-nc-4.0
	language:
	- en
	tags:
	- nvidia
	- AceInstruct
	- code
	- math
	- general_domain
	- instruct_model
	base_model: nvidia/AceInstruct-72B
	---
	## Exllama v2 Quantizations of AceInstruct-72B by nvidia

	Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.2.8">turboderp's ExLlamaV2 v0.2.8</a> for quantization.

	Original model: https://huggingface.co/nvidia/AceInstruct-72B

	Quantization Command Example for creating other bpw quantization
	```
	cd {your git clone directory}
	python convert.py -i {path to}/AceInstruct-72B -o {path to}/AceInstruct-72B/workingdir -cf {path to}/AceInstruct-72B/AceInstruct-72B-3bpw -b 3.0
	```

	## Prompt format

	```
	<\|im_start\|>system
	{system_prompt}<\|im_end\|>
	<\|im_start\|>user
	{prompt}<\|im_end\|>
	<\|im_start\|>assistant
	```

	## How to add your system prompt

	Copy the following json and replace the "You are AceInstruct developed by NVIDIA. You are helpful assistant." sentence with your original system prompt.
	The default tokenizer_config.json does not have system prompt.

	tokenizer_config.json
	```
	"chat_template": "{{- '<\|im_start\|>system\\nYou are AceInstruct developed by NVIDIA. You are helpful assistant.<\|im_end\|>\\n' }}\n {%- for message in messages %}\n{{- '<\|im_start\|>' + message.role + '\n' + message.content + '<\|im_end\|>' + '\n' }}\n{%- endfor %}\n{%- if add_generation_prompt %}\n{{- '<\|im_start\|>assistant\n' }}\n{%- endif %}\n",
	```

	## File information

	\| quantization type \| file size \|
	\| ----------------------- \| ----------: \|
	\| 3.0bpw \| 27.8 GiB \|

	## Benchmark Results

	\| \| Qwen2.5-1.5B-Instruct \| AceInstruct-1.5B \| Qwen2.5-7B-Instruct \| AceInstruct-7B \| Qwen2.5-72B-Instruct \| AceInstruct-72B \|
	\| --------- \|:-----:\|:-----:\|:-----:\|:-----:\|:-----:\|:-----:\|
	\| HumanEval \| 61.60 \| 73.17 \| 84.80 \| 85.37 \| 86.60 \| 89.63 \|
	\| MBPP \| 63.20 \| 65.76 \| 79.20 \| 74.32 \| 88.20 \| 83.66 \|
	\| GSM8K \| 73.20 \| 80.44 \| 91.60 \| 93.10 \| 95.80 \| 96.36 \|
	\| MATH \| 55.20 \| 60.34 \| 75.50 \| 76.40 \| 83.10 \| 84.50 \|
	\| MMLU \| 58.37 \| 58.17 \| 74.51 \| 74.68 \| 84.67 \| 83.88 \|
	\| MMLU Pro \| 32.40 \| 33.78 \| 56.30 \| 54.50 \| 71.10 \| 66.10 \|
	\| Average \| 57.33 \| 61.94 \| 76.99 \| 76.40 \| 84.91 \| 84.02 \|

	## Credits

	Thanks to NVIDIA team.

	---
	license: cc-by-nc-4.0
	---