Remove fork

- README.md +0 -248
- added_tokens.json +0 -24
- config.json +0 -34
- generation_config.json +0 -14
- gguf/Hammer2.1-7b-BF16.gguf +0 -3
- gguf/Hammer2.1-7b-Q4_K.gguf +0 -3
- gguf/Hammer2.1-7b-Q8_0.gguf +0 -3
- merges.txt +0 -0
- model-00001-of-00004.safetensors +0 -3
- model-00002-of-00004.safetensors +0 -3
- model-00003-of-00004.safetensors +0 -3
- model-00004-of-00004.safetensors +0 -3
- model.safetensors.index.json +0 -346
- special_tokens_map.json +0 -31
- tokenizer.json +0 -3
- tokenizer_config.json +0 -209
- v2_figures/bfcl.png +0 -0
- v2_figures/others-v2.png +0 -0
- vocab.json +0 -0
README.md
DELETED
@@ -1,248 +0,0 @@

---
license: cc-by-nc-4.0
datasets:
- Salesforce/xlam-function-calling-60k
- MadeAgents/xlam-irrelevance-7.5k
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
---

# GGUF and Quantized versions of MadeAgents/Hammer2.1-7b

This is a fork of [MadeAgents/Hammer2.1-7b](https://huggingface.co/MadeAgents/Hammer2.1-7b) in which the safetensors weights have been converted to GGUF (BF16) and quantized to Q8_0 and Q4_K.

For its size, this model performs very well on simple, multiple, and parallel function-call execution.
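For quick local inference against one of these GGUF files, any llama.cpp-based runtime should work. Below is a minimal sketch using the `llama-cpp-python` bindings; the local path, context size, and generation settings are illustrative assumptions, not part of the original card:
~~~
from llama_cpp import Llama

# Minimal sketch: run the Q4_K quantization locally with llama-cpp-python
# (`pip install llama-cpp-python`). Path and settings are illustrative.
llm = Llama(
    model_path="gguf/Hammer2.1-7b-Q4_K.gguf",  # local copy of the quantized file
    n_ctx=8192,       # context window for this session
    n_gpu_layers=-1,  # offload all layers to GPU when one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's the weather like in New York?"}],
    max_tokens=128,
    temperature=0.0,
)
print(out["choices"][0]["message"]["content"])
~~~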

From the original repo:

# Hammer2.1-7b Function Calling Model

## Introduction

Hammer refers to a series of lightweight Large Action Models. Currently, we are releasing Hammer 2.1 models ([0.5B](https://huggingface.co/MadeAgents/Hammer2.1-0.5b), [1.5B](https://huggingface.co/MadeAgents/Hammer2.1-1.5b), [3B](https://huggingface.co/MadeAgents/Hammer2.1-3b), and [7B](https://huggingface.co/MadeAgents/Hammer2.1-7b)) with strong function calling capability. These models are based on the Qwen 2.5 coder series and utilize [function masking techniques](https://arxiv.org/abs/2410.04587) and other advanced technologies. The Hammer 2.1 series brings significant enhancements while maintaining the basic single-turn interaction functionality of Hammer 2.0 and further strengthening other capabilities.

## Model Details
The Hammer 2.1 models, fine-tuned from the Qwen 2.5 coder series, inherit Hammer 2.0's advantages and are enhanced as follows:
- <span style="color: red;">Multi-Step Function Calling:</span> The assistant can perform multiple internal function calls to handle a single user request, actively planning and gathering information to fulfill complex tasks.
- <span style="color: red;">Multi-Turn Function Calling:</span> Enables continuous and context-aware interactions over multiple exchanges, with each turn potentially containing multiple steps, for a more natural conversation experience.
- Enhanced Irrelevant Information Inspection: Better at identifying when the provided functions are irrelevant to a user query, in which case the model returns a non-function-call response.

## Evaluation
The evaluation results of the Hammer 2.1 models on the Berkeley Function-Calling Leaderboard (BFCL-v3) are presented in the following figure:
<div style="text-align: center;">
<img src="v2_figures/bfcl.png" alt="overview" width="1000" style="margin: auto;">
</div>

Our Hammer 2.1 series consistently achieves the best performance at comparable scales. The 7B/3B/1.5B models outperform most function-calling-enhanced models.

In addition, we evaluated the Hammer 2.1 models on other academic benchmarks to further demonstrate their generalization ability.

<div style="text-align: center;">
<img src="v2_figures/others-v2.png" alt="overview" width="1000" style="margin: auto;">
</div>

Hammer 2.1 models showcase highly stable performance, suggesting the robustness of the series. In contrast, the baseline approaches display varying levels of effectiveness.

## Requirements
Support for the Hammer 2.1 models is included in recent Hugging Face Transformers releases; we advise you to install `transformers>=4.47.0`.

## How to Use
Hammer models offer flexibility in deployment and usage, fully supporting both **vLLM** deployment and **Hugging Face Transformers** tool calling. Below are the specifics on how to make use of these features:

### Using vLLM

#### Option 1: Using the Hammer client (Recommended)

Before using vLLM, first clone the Hammer code repository and change directory to 'Hammer':
```
git clone https://github.com/MadeAgents/Hammer.git
cd Hammer
```

vLLM offers efficient serving with lower latency. To serve the model with vLLM:
```
vllm serve MadeAgents/Hammer2.1-7b --host 0.0.0.0 --port 8000 --tensor-parallel-size 1
```
Once the model is served, you can use the following Hammer client to interact with it for function calling:
~~~
from client import HammerChatCompletion, HammerConfig

config = HammerConfig(base_url="http://localhost:8000/v1/", model="MadeAgents/Hammer2.1-7b")
llm = HammerChatCompletion.from_config(config)

# Example conversation
messages = [
    {"role": "user", "content": "What's the weather like in New York?"},
    {"role": "assistant", "content": '```\n{"name": "get_weather", "arguments": {"location": "New York, NY", "unit": "celsius"}}\n```'},
    {"role": "tool", "name": "get_weather", "content": '{"temperature": 72, "description": "Partly cloudy"}'},
    {"role": "user", "content": "Now, search for the weather in San Francisco."}
]

# Example function definition (optional)
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature to return"}
            },
            "required": ["location"]
        }
    },
    {
        "name": "respond",
        "description": "When you are ready to respond, use this function. This function allows the assistant to formulate and deliver appropriate replies based on the input message and the context of the conversation. Generate a concise response for simple questions, and a more detailed response for complex questions.",
        "parameters": {
            "type": "object",
            "properties": {
                "message": {"type": "string", "description": "The content of the message to respond to."}
            },
            "required": ["message"]
        }
    }
]

response = llm.completion(messages, tools=tools)
print(response)
~~~

#### Option 2: Using vLLM's built-in tool calling
Hammer2.1 supports vLLM's built-in tool calling. This functionality requires vllm>=0.6. To enable it, start vLLM's OpenAI-compatible service with:
~~~
vllm serve MadeAgents/Hammer2.1-7b --enable-auto-tool-choice --tool-call-parser hermes
~~~
And then use it in the same way you use GPT's tool calling:
~~~
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the user's location.",
                        "default": "celsius"
                    },
                },
                "required": ["location", "format"],
            },
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_n_day_weather_forecast",
            "description": "Get an N-day weather forecast",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the user's location.",
                        "default": "celsius"
                    },
                    "num_days": {
                        "type": "integer",
                        "description": "The number of days to forecast",
                        "default": 1
                    }
                },
                "required": ["location", "format", "num_days"]
            },
        }
    },
]


from openai import OpenAI

openai_api_key = "None"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

query = """What's the weather like today in San Francisco"""

chat_response = client.chat.completions.create(
    model="MadeAgents/Hammer2.1-7b",
    messages=[
        {"role": "user", "content": query},
    ],
    tools=tools,
    temperature=0
)
print(chat_response.choices[0].message.content)
~~~
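With `--enable-auto-tool-choice`, the server returns parsed calls in the message's `tool_calls` field, while plain-text answers arrive in `content`. A minimal sketch of reading them, assuming the standard OpenAI Python client response shape:
~~~
import json

# Sketch: inspect parsed tool calls from the OpenAI-compatible response.
# message.tool_calls may be None when the model answers in plain text.
message = chat_response.choices[0].message
for call in message.tool_calls or []:
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    print(call.function.name, args)
~~~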

### Using Hugging Face Transformers
Hammer2.1's chat template also includes a tool calling template, meaning that you can use Hugging Face transformers' tool calling support. This is a simple example of how to use our model with Transformers:
~~~
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


tokenizer = AutoTokenizer.from_pretrained("MadeAgents/Hammer2.1-7b")
model = AutoModelForCausalLM.from_pretrained("MadeAgents/Hammer2.1-7b", torch_dtype=torch.bfloat16, device_map="auto")

# Example conversation
messages = [
    {"role": "user", "content": "What's the weather like in New York?"},
    {"role": "assistant", "content": '```\n{"name": "get_weather", "arguments": {"location": "New York, NY", "unit": "celsius"}}\n```'},
    {"role": "tool", "name": "get_weather", "content": '{"temperature": 72, "description": "Partly cloudy"}'},
    {"role": "user", "content": "Now, search for the weather in San Francisco."}
]

# Example function definition (optional)
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature to return"}
            },
            "required": ["location"]
        }
    },
    {
        "name": "respond",
        "description": "When you are ready to respond, use this function. This function allows the assistant to formulate and deliver appropriate replies based on the input message and the context of the conversation. Generate a concise response for simple questions, and a more detailed response for complex questions.",
        "parameters": {
            "type": "object",
            "properties": {
                "message": {"type": "string", "description": "The content of the message to respond to."}
            },
            "required": ["message"]
        }
    }
]

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True))
~~~
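Per the chat template shipped in `tokenizer_config.json`, the generated text is a JSON list of `{"name": ..., "arguments": ...}` objects, optionally wrapped in a code fence, or `[]` when no call is needed. A minimal sketch that continues the example above and parses that output; the helper function is an illustrative assumption, not part of the original card:
~~~
import json

def parse_tool_calls(generated: str):
    # Illustrative helper (not from the original card): the model emits a JSON
    # list of calls, optionally inside a ``` fence, or [] when none is needed.
    text = generated.strip()
    if text.startswith("```"):
        text = text.strip("`").strip()  # drop the surrounding fence markers
    return json.loads(text)

generated = tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)
for call in parse_tool_calls(generated):
    print(call["name"], call["arguments"])
~~~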
added_tokens.json
DELETED
@@ -1,24 +0,0 @@
{
  "</tool_call>": 151658,
  "<tool_call>": 151657,
  "<|box_end|>": 151649,
  "<|box_start|>": 151648,
  "<|endoftext|>": 151643,
  "<|file_sep|>": 151664,
  "<|fim_middle|>": 151660,
  "<|fim_pad|>": 151662,
  "<|fim_prefix|>": 151659,
  "<|fim_suffix|>": 151661,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644,
  "<|image_pad|>": 151655,
  "<|object_ref_end|>": 151647,
  "<|object_ref_start|>": 151646,
  "<|quad_end|>": 151651,
  "<|quad_start|>": 151650,
  "<|repo_name|>": 151663,
  "<|video_pad|>": 151656,
  "<|vision_end|>": 151653,
  "<|vision_pad|>": 151654,
  "<|vision_start|>": 151652
}
config.json
DELETED
@@ -1,34 +0,0 @@
{
  "_name_or_path": "/home/notebook/data/group/ComplexTaskDecision/Hammer/ckpt/select_caller/hammer2.1/hammer2.1-7b",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 3584,
  "initializer_range": 0.02,
  "intermediate_size": 18944,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 28,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "rms_norm_eps": 1e-06,
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "rope_type": "yarn",
    "type": "yarn"
  },
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.47.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151665
}
generation_config.json
DELETED
@@ -1,14 +0,0 @@
{
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.1,
  "temperature": 0.7,
  "top_k": 20,
  "top_p": 0.8,
  "transformers_version": "4.47.0"
}
gguf/Hammer2.1-7b-BF16.gguf
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:dfdee52012809f6ff5d5bfa714a8d79152c412486741f25c433526a9ce064bc7
size 15232124640
gguf/Hammer2.1-7b-Q4_K.gguf
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e2c42b22cc96c0e9e48a70ab2b78a9c8c0d675058d1047c59768c216049f59b4
size 4681087616
gguf/Hammer2.1-7b-Q8_0.gguf
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6bc90427d5a7ad7c7f973dd6ef1b65e9c578f8597295051d05edb9fd0cbbde18
size 8095477920
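These three entries are git-lfs pointer files; the GGUF blobs themselves live in LFS storage. While the fork was live, a file could be fetched with the standard `huggingface_hub` client — a minimal sketch, with the repo id left as a placeholder since this commit removes the fork:
```
from huggingface_hub import hf_hub_download

# Sketch: download one quantization from the Hub while a repo hosting it exists.
# "<user>/Hammer2.1-7b-GGUF" is a placeholder; substitute the actual repo id.
path = hf_hub_download(
    repo_id="<user>/Hammer2.1-7b-GGUF",
    filename="gguf/Hammer2.1-7b-Q4_K.gguf",
)
print(path)  # local cache path of the ~4.7 GB Q4_K file
```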
merges.txt
DELETED
The diff for this file is too large to render.
model-00001-of-00004.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26bf1ddd22082a8115e3a2196af9ff854907533748da8e89bfa4971b0aca6623
size 4874800744
model-00002-of-00004.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc7791b36f34cdb1ad94583bc7e00a49c33f4c655520e6cc87400cf75d96e3f7
size 4932751008
model-00003-of-00004.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:421136e2c37dd29190483b54148ff77a060c4527276e414fa04ca1729f7395ce
size 4330865200
model-00004-of-00004.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5bd8b021f2bc6717d961ff3df97fa989c2ef6d8ea34738168cd1278b38a0a672
size 1087134848
model.safetensors.index.json
DELETED
@@ -1,346 +0,0 @@
{
  "metadata": {
    "total_size": 15225512960
  },
  "weight_map": {
    "lm_head.weight": "model-00004-of-00004.safetensors",
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.18.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.18.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.19.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.2.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.20.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.20.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.21.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.22.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.input_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
    "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
    "model.layers.3.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.input_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.8.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.8.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.8.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.8.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.8.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.8.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.8.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.8.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
    "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.9.input_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
    "model.layers.9.self_attn.v_proj.weight": "model-00002-of-00004.safetensors",
    "model.norm.weight": "model-00003-of-00004.safetensors"
  }
}
special_tokens_map.json
DELETED
@@ -1,31 +0,0 @@
{
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
size 11421896
tokenizer_config.json
DELETED
@@ -1,209 +0,0 @@
{
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151646": {
      "content": "<|object_ref_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|object_ref_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151648": {
      "content": "<|box_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151649": {
      "content": "<|box_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "bos_token": null,
  "chat_template": "{%- set system_message = 'You are a helpful assistant.' %}\n{%- if messages[0]['role'] == 'system' %}\n    {%- set system_message = messages[0]['content'] %}\n    {%- if messages[1]['role'] == 'system' %}\n        {%- set format_message = messages[1]['content'] %}\n        {%- set loop_messages = messages[2:] %}\n    {%- else %}\n        {%- set loop_messages = messages[1:] %}\n    {%- endif %}\n{%- else %}\n    {%- set loop_messages = messages %}\n{%- endif %}\n{%- if not tools is defined %}\n    {%- set tools = none %}\n{%- endif %}\n{%- if system_message is defined %}\n{{- '<|im_start|>system\n' + system_message + '<|im_end|>\n' }}\n{%- endif %}\n\n\n{%- if tools is not none %}\n{% set task_instruction %}You are a tool calling assistant. In order to complete the user's request, you need to select one or more appropriate tools from the following tools and fill in the correct values for the tool parameters. Your specific tasks are:\n1. Make one or more function/tool calls to meet the request based on the question.\n2. If none of the function can be used, point it out and refuse to answer.\n3. If the given question lacks the parameters required by the function, also point it out.\n\nThe following are characters that may interact with you\n1. user: Provides query or additional information.\n2. tool: Returns the results of the tool calling.\n{% endset %}\n\n{% set format_instruction %}\nThe output MUST strictly adhere to the following JSON format, and NO other text MUST be included.\nThe example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please directly output an empty list '[]'\n```\n[\n    {\"name\": \"func_name1\", \"arguments\": {\"argument1\": \"value1\", \"argument2\": \"value2\"}},\n    ... (more tool calls as required)\n]\n```\n{% endset %}\n{{- '<|im_start|>user\n[BEGIN OF TASK INSTRUCTION]\n' + task_instruction + '\n[END OF TASK INSTRUCTION]\n\n'}}\n    {{- '[BEGIN OF AVAILABLE_TOOLS]\n' }}\n    {{- tools|string }}\n    {{- '\n[END OF AVAILABLE_TOOLS]\n\n' }}\n    {{- '\n[BEGIN OF TASK INSTRUCTION]\n' + format_instruction + '\n[END OF TASK INSTRUCTION]\n\n<|im_end|>\n' }}\n{%- endif %}\n\n{%- for message in loop_messages %}\n    {%- set role = message['role'] %}\n    {%- set content = message['content'] %}\n    {{- '<|im_start|>'+ role +'\n' + content + '<|im_end|>\n'}}\n{%- endfor %}\n{{- '<|im_start|>assistant\n' }}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "extra_special_tokens": {},
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
  "padding_side": "right",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
}
v2_figures/bfcl.png
DELETED
Binary file (315 kB)
v2_figures/others-v2.png
DELETED
Binary file (122 kB)
vocab.json
DELETED
The diff for this file is too large to render.