nomadlx commited on
Commit
818fdbe
·
1 Parent(s): 219ab05

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +31 -3
README.md CHANGED
@@ -11,7 +11,7 @@ library_name: transformers
11
  # Confucius-o1-14B
12
 
13
  ## Introduction
14
- **Confucius-o1-14B** is a o1-like reasoning model developed by the NETEASE Youdao Team, it can be easily deployed on a single GPU without quantization. This model is based on the Qwen2.5-14B-Instruct model and adopts a two-stage learning strategy, enabling the lightweight 14B model to possess thinking abilities similar to those of o1. What sets it apart is that after generating the chain of thought, it can summarize a step-by-step problem-solving process from the chain of thought on its own. This can prevent users from getting bogged down in the complex chain of thought and allows them to easily obtain the correct problem-solving ideas and answers.
15
 
16
  However, there are some limitations that must be stated in advance:
17
  1. **Scenario Limitations**: Our optimization is only carried out on data from the K12 mathematics scenario, and the effectiveness has only been verified in math-related benchmark tests. The performance of the model in non-mathematical scenarios has not been tested, so we cannot guarantee its quality and effectiveness in other fields.
@@ -64,10 +64,38 @@ USER_PROMPT_TEMPLATE = """现在,让我们开始吧!
64
 
65
  Then you can create your `messages` as follows and use them to request model results. You just need to fill in your instructions in the "question" field.
66
  ```python
 
 
 
 
 
 
 
 
 
 
 
67
  messages = [
68
  {'role': 'system', 'content': SYSTEM_PROMPT_TEMPLATE},
69
  {'role': 'user', 'content': USER_PROMPT_TEMPLATE.format(question=question)},
70
  ]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
71
  ```
72
 
73
  After obtaining the model results, you can parse out the "thinking" and "summary" parts as follows.
@@ -86,7 +114,7 @@ def parse_result_nostep(result):
86
  summary = summary_list[0].strip()
87
  return thinking, summary
88
 
89
- thinking, summary = parse_result_nostep(result)
90
  ```
91
 
92
  ## Citation
@@ -94,7 +122,7 @@ thinking, summary = parse_result_nostep(result)
94
  If you find our work helpful, feel free to give us a cite.
95
  ```
96
  @misc{confucius-o1,
97
- author = {NETEASE Youdao Team},
98
  title = {Confucius-o1: Open-Source Lightweight Large Models to Achieve Excellent Chain-of-Thought Reasoning on Consumer-Grade Graphics Cards.},
99
  url = {},
100
  month = {January},
 
11
  # Confucius-o1-14B
12
 
13
  ## Introduction
14
+ **Confucius-o1-14B** is a o1-like reasoning model developed by the NetEase Youdao Team, it can be easily deployed on a single GPU without quantization. This model is based on the Qwen2.5-14B-Instruct model and adopts a two-stage learning strategy, enabling the lightweight 14B model to possess thinking abilities similar to those of o1. What sets it apart is that after generating the chain of thought, it can summarize a step-by-step problem-solving process from the chain of thought on its own. This can prevent users from getting bogged down in the complex chain of thought and allows them to easily obtain the correct problem-solving ideas and answers.
15
 
16
  However, there are some limitations that must be stated in advance:
17
  1. **Scenario Limitations**: Our optimization is only carried out on data from the K12 mathematics scenario, and the effectiveness has only been verified in math-related benchmark tests. The performance of the model in non-mathematical scenarios has not been tested, so we cannot guarantee its quality and effectiveness in other fields.
 
64
 
65
  Then you can create your `messages` as follows and use them to request model results. You just need to fill in your instructions in the "question" field.
66
  ```python
67
+ from transformers import AutoModelForCausalLM, AutoTokenizer
68
+
69
+ model_name = "netease-youdao/Confucius-o1-14B"
70
+
71
+ model = AutoModelForCausalLM.from_pretrained(
72
+ model_name,
73
+ torch_dtype="auto",
74
+ device_map="auto"
75
+ )
76
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
77
+
78
  messages = [
79
  {'role': 'system', 'content': SYSTEM_PROMPT_TEMPLATE},
80
  {'role': 'user', 'content': USER_PROMPT_TEMPLATE.format(question=question)},
81
  ]
82
+
83
+ text = tokenizer.apply_chat_template(
84
+ messages,
85
+ tokenize=False,
86
+ add_generation_prompt=True
87
+ )
88
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
89
+
90
+ generated_ids = model.generate(
91
+ **model_inputs,
92
+ max_new_tokens=16384
93
+ )
94
+ generated_ids = [
95
+ output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
96
+ ]
97
+
98
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
99
  ```
100
 
101
  After obtaining the model results, you can parse out the "thinking" and "summary" parts as follows.
 
114
  summary = summary_list[0].strip()
115
  return thinking, summary
116
 
117
+ thinking, summary = parse_result_nostep(response)
118
  ```
119
 
120
  ## Citation
 
122
  If you find our work helpful, feel free to give us a cite.
123
  ```
124
  @misc{confucius-o1,
125
+ author = {NetEase Youdao Team},
126
  title = {Confucius-o1: Open-Source Lightweight Large Models to Achieve Excellent Chain-of-Thought Reasoning on Consumer-Grade Graphics Cards.},
127
  url = {},
128
  month = {January},