---
base_model: FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- reason
- Chain-of-Thought
- deep thinking
license: apache-2.0
language:
- en
datasets:
- bespokelabs/Bespoke-Stratos-17k
- Daemontatox/Deepthinking-COT
- Daemontatox/Qwqloncotam
- Daemontatox/Reasoning_am
library_name: transformers
---

# **FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview (Fine-Tuned)**

## **Model Overview**

This model is a fine-tuned version of **FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview**, which is built on the **Qwen2** architecture. Training used **Unsloth**, which made fine-tuning roughly **2x faster** than a conventional setup while maintaining performance across NLP benchmarks.

Fine-tuning was performed with **Hugging Face’s TRL (Transformer Reinforcement Learning) library**, targeting **complex reasoning, natural language generation (NLG), and conversational AI** tasks.

## **Model Details**

- **Developed by:** Daemontatox
- **Base Model:** [FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview)
- **License:** Apache-2.0
- **Model Type:** Qwen2-based large-scale transformer
- **Optimization Framework:** [Unsloth](https://github.com/unslothai/unsloth)
- **Fine-tuning Methodology:** LoRA (Low-Rank Adaptation) & Full Fine-Tuning
- **Quantization Support:** 4-bit and 8-bit for deployment on resource-constrained devices
- **Training Library:** [Hugging Face TRL](https://huggingface.co/docs/trl/)

---

## **Training & Fine-Tuning Details**

### **Optimization with Unsloth**

Unsloth accelerates fine-tuning by reducing memory overhead and improving hardware utilization. This model was fine-tuned roughly **twice as fast** as with a conventional setup, leveraging **Flash Attention 2** and **PagedAttention**.

### **Fine-Tuning Method**

The model was fine-tuned using a combination of **parameter-efficient** and full fine-tuning techniques:

- **QLoRA (Quantized LoRA)**: LoRA adapters trained on top of a quantized base model, for reduced memory usage.
- **Full fine-tuning** on select layers to maintain original capabilities while improving specific tasks.
- **RLHF (Reinforcement Learning from Human Feedback)** for improved alignment with human preferences.
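
To make the memory argument for LoRA-style methods concrete, the sketch below counts trainable parameters for a single weight matrix. It assumes the standard LoRA factorization (a frozen `d_out x d_in` weight plus a trainable low-rank product `B @ A`). The hidden size of 5120 and the rank of 16 are hypothetical values chosen for illustration, not settings reported for this model.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Count LoRA adapter parameters for one weight matrix.

    LoRA adds two trainable matrices: A with shape (rank, d_in)
    and B with shape (d_out, rank). The original weight stays frozen.
    """
    return rank * (d_in + d_out)


# Hypothetical dimensions for illustration only (not taken from this model card).
hidden_size = 5120
rank = 16

full_params = hidden_size * hidden_size  # parameters updated by full fine-tuning
lora_params = lora_trainable_params(hidden_size, hidden_size, rank)

print(f"full fine-tuning: {full_params:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora_params:,} trainable parameters")
print(f"ratio: {lora_params / full_params:.3%}")
```

Under these assumed dimensions the adapter trains well under 1% of the parameters of the matrix it wraps, which is why LoRA and QLoRA fit on much smaller GPUs than full fine-tuning.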

---

## **Intended Use & Applications**

### **Primary Use Cases**

- **Conversational AI**: Enhances chatbot interactions with **better contextual awareness** and logical coherence.
- **Text Generation & Completion**: Ideal for **content creation**, **report writing**, and **creative writing**.
- **Mathematical & Logical Reasoning**: Can assist in **education**, **problem-solving**, and **automated theorem proving**.
- **Research & Development**: Useful for **scientific research**, **data analysis**, and **language modeling experiments**.

### **Deployment**

The model supports **4-bit and 8-bit quantization**, which lowers memory requirements and makes it **deployable on resource-constrained devices** while maintaining high performance.
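
As a rough guide to what quantization buys at deployment time, the following sketch estimates weight-only memory at different precisions. The parameter count of 32 billion is an assumption taken from the "32B" in the model name. The estimate ignores the KV cache, activations, and quantization metadata, so real usage will be higher.

```python
# Assumption: ~32 billion parameters, inferred from "32B" in the model name.
PARAM_COUNT = 32e9

# Approximate storage cost per parameter at each precision.
BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "8-bit": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(param_count: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return param_count * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    print(f"{precision}: ~{weight_memory_gb(PARAM_COUNT, nbytes):.0f} GB")
```

By this estimate, 4-bit weights need about a quarter of the memory of 16-bit weights, which is the difference between a multi-GPU setup and a single high-memory GPU.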

---

## **Limitations & Ethical Considerations**

### **Limitations**

- **Bias & Hallucination**: The model may still **generate biased or hallucinated outputs**, especially in **highly subjective** or **low-resource** domains.
- **Computation Requirements**: While optimized, the model **still requires significant GPU resources** for inference at full precision.
- **Context Length Constraints**: Long-context understanding is improved, but **performance may degrade** on extremely long prompts.

### **Ethical Considerations**

- **Use responsibly**: The model should not be used for **misinformation**, **deepfake generation**, or **harmful AI applications**.
- **Bias Mitigation**: Efforts have been made to **reduce bias**, but users should **validate outputs** in sensitive applications.

---

## **How to Use the Model**

### **Example Code for Inference**

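```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Daemontatox/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype="auto" keeps the checkpoint's native precision instead of float32;
# device_map="auto" (requires the accelerate package) places weights on available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

input_text = "Explain the significance of reinforcement learning in AI."
# Move the tokenized prompt to the same device as the model.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_new_tokens limits only the generated text; max_length would also count the prompt.
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

### **Using with Unsloth (Optimized LoRA Inference)**

```python
from unsloth import FastLanguageModel

# from_pretrained returns both the model and its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Daemontatox/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview",
    load_in_4bit=True,  # Efficient deployment
)
FastLanguageModel.for_inference(model)  # switch to Unsloth's inference mode
```

---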

## **Acknowledgments**

Special thanks to:

- Unsloth AI for their efficient fine-tuning framework.
- The open-source AI community for continuous innovation.

---