alphaaico committed
Commit d4b0b57 · verified · 1 Parent(s): ab7a928

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -41,7 +41,7 @@ Think about it: most AI models blindly generate reasoning even when unnecessary,
  - Reasoning & Self-Reflection: The model first decides if reasoning is necessary and then either provides step-by-step logic or directly answers the question.
  - Structured Output: Responses follow a strict format with `<think>`, `<reflection>`, and `<answer>` sections, ensuring clarity and interpretability.
  - Optimized Training: Trained using GRPO (Guided Reward Policy Optimization) to enforce structured responses and improve decision-making.
- - Efficient Inference: Fine-tuned with Unsloth & Hugging Faces TRL, ensuring faster inference speeds and optimized resource utilization.
+ - Efficient Inference: Fine-tuned with Unsloth & Hugging Face's TRL, ensuring faster inference speeds and optimized resource utilization.
 
  ## Prompt Structure
 
@@ -97,7 +97,7 @@ This model is released under the Apache-2.0 license.
 
  ## Acknowledgments
 
- Special thanks to the Unsloth team for optimizing the fine-tuning pipeline and to Hugging Face's TRL for enabling advanced fine-tuning techniques.
+ Special thanks to the Unsloth team for optimizing the fine-tuning pipeline and to Hugging Face's TRL for enabling advanced fine-tuning techniques.
 
  ## Security & Format Considerations
 
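
The "Structured Output" bullet quoted in the diff says responses contain `<think>`, `<reflection>`, and `<answer>` sections. As a rough illustration of how a consumer might split such a response, here is a minimal sketch. It assumes each section is closed with a matching `</tag>` (the README text shown here does not confirm this), and the function name `parse_structured_response` is hypothetical, not part of the model's tooling. Because the model may skip reasoning entirely, a missing section is returned as `None`.

```python
import re
from typing import Dict, Optional

# Tag names come from the README text in the diff; the closing-tag
# convention (</think>, etc.) is an assumption made for this sketch.
SECTION_TAGS = ("think", "reflection", "answer")


def parse_structured_response(text: str) -> Dict[str, Optional[str]]:
    """Return the content of each tagged section, or None when it is absent."""
    sections: Dict[str, Optional[str]] = {}
    for tag in SECTION_TAGS:
        # DOTALL lets a section span multiple lines; the non-greedy match
        # stops at the first closing tag.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


# Hypothetical sample response, written by hand for illustration.
sample = (
    "<think>2 + 2 is basic arithmetic.</think>"
    "<reflection>No further checks needed.</reflection>"
    "<answer>4</answer>"
)
parsed = parse_structured_response(sample)
print(parsed["answer"])
```

A caller can then check `parsed["think"] is None` to detect responses where the model answered directly without a reasoning section.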