Whisprell-DeepSeek-R1-Enhanced-1.5B

NexThinkLabs

Introduction

Whisprell-DeepSeek-R1-Enhanced-1.5B is a Chain-of-Thought (CoT) reasoning-focused model developed by NexThinkLabs. The model is based on DeepSeek's DeepSeek-R1-Distill-Qwen-1.5B and has been further fine-tuned to enhance reasoning capabilities while maintaining computational efficiency.

Model Details

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("NexThinkLabsAI/Whisprell-DeepSeek-R1-Enhanced-1.5B")
tokenizer = AutoTokenizer.from_pretrained("NexThinkLabsAI/Whisprell-DeepSeek-R1-Enhanced-1.5B")

Usage Recommendations

  1. Temperature: 0.5-0.7 (0.6 recommended).
  2. Avoid system prompts; include all instructions in the user prompt.
  3. For math problems, include: "Please reason step by step, and put your final answer within \boxed{}"
  4. Enforce the thinking pattern by starting the response with "<think>\n".
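The recommendations above can be sketched as a small helper. This is an illustrative sketch, not an official script: it assumes the tokenizer ships a chat template usable via `tokenizer.apply_chat_template` (standard in recent `transformers` releases), and the generation settings beyond temperature are defaults, not model requirements.

```python
MODEL_ID = "NexThinkLabsAI/Whisprell-DeepSeek-R1-Enhanced-1.5B"


def build_user_prompt(question: str) -> str:
    """Apply recommendations 2 and 3: put everything in the user prompt
    (no system prompt) and, for math, ask for a boxed final answer."""
    return (
        f"{question}\n"
        "Please reason step by step, and put your final answer within \\boxed{}"
    )


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Illustrative generation loop applying recommendations 1 and 4."""
    # Imports are deferred so build_user_prompt() works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    messages = [{"role": "user", "content": build_user_prompt(question)}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    text += "<think>\n"  # Recommendation 4: enforce the thinking pattern.

    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.6,  # Recommendation 1: 0.5-0.7, 0.6 recommended.
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

`build_user_prompt` can be reused as-is for batch evaluation; `generate_answer` loads the model on every call for clarity and should be refactored to reuse the loaded model in real use.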

License

This model is released under a Personal Proprietary License. The base model (deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) is under the MIT License.

Acknowledgments

We thank DeepSeek AI for their DeepSeek-R1-Distill-Qwen-1.5B model, which served as the foundation for this work.

Contact

For questions and support, please:

Model size: 1.78B parameters (BF16, Safetensors).