---
library_name: transformers
tags:
- trl
- grpo
- rl
- superthoughts
- reasoning
- cot
license: apache-2.0
datasets:
- openai/gsm8k
- Pinkstack/intructions-sft-sharegpt
language:
- en
base_model:
- HuggingFaceTB/SmolLM2-1.7B-Instruct
pipeline_tag: text-generation
widget:
- messages:
  - role: user
    content: "You must act in a conversational matter and always include at the start <think> ... </think> <output> ... </output> tokens.\n\nAre cats cool?"
- messages:
  - role: user
    content: "You must act in a conversational matter and always include at the start <think> ... </think> <output> ... </output> tokens.\n\nHello!"
- messages:
  - role: user
    content: "You must act in a conversational matter and always include at the start <think> ... </think> <output> ... </output> tokens.\n\n2x-2=6, how much is X?"
---
Free online chat with web search capabilities (demo): https://huggingface.co/spaces/Pinkstack/Chat-with-superthoughts-lite
![superthoughts lite](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/K5kYIHYj2aX2kB6MlcM9O.png)

# Information
Advanced, high-quality and **lite** reasoning at a tiny size that you can run on your phone.

At full precision, it runs at roughly 400 tokens/second on a single Nvidia H100 GPU from Friendli.

Trained similarly to DeepSeek R1: we used SmolLM2 as the base model and SFT fine-tuned it on reasoning using our own private Superthoughts instruct dataset, which includes a mix of code, website generation, day-to-day chat, math, and counting problems. We then modified the tokenizer slightly and, after the SFT fine-tuning, used GRPO to further amplify its mathematics and problem-solving abilities.
# Format
```
<|im_start|>user
How many R's in strawberry<|im_end|>
<|im_start|>assistant
<think>
Alright, the user has asked how many R's in the word strawberry, that's easy! I just need to count each instance of the letter 'R' in the word 's-t-r-a-w-b-e-r-r-y' and then find out how many R's there are, lets count!
S - Not an R,
T - Not an R,
R - First instance of the letter R! (1),
A - Not an R,
W - Not an R,
B - Not an R,
E - Not an R,
R - Great! Second instance of the letter R. (2),
R - Third instance of the letter R. (3),
Y - Not an R.

So, i've counted all the letters correctly, meaning that I am sure that there are 3 R's in the word Strawberry. I should probably let the user know.
</think>
<output>3
</output><|im_end|>
```
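Responses in this format can be split programmatically into the reasoning and the final answer. A minimal sketch (the `<think>`/`<output>` tag names come from the format above; the helper name `split_reasoning` is our own):

```python
import re


def split_reasoning(text: str):
    """Split a model response into (think, output).

    Either element is None if the corresponding tag is missing,
    e.g. when a high temperature made the model skip thinking.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    output = re.search(r"<output>(.*?)</output>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        output.group(1).strip() if output else None,
    )


response = "<think>\nCount the R's in strawberry.\n</think>\n<output>3\n</output>"
reasoning, answer = split_reasoning(response)  # answer == "3"
```

In a chat UI you would typically hide or collapse the `think` part and show only the `output` part to the user.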
It is strongly recommended to use a low temperature; higher temperatures may cause the model to skip thinking.
# System prompt
(Important: this ensures the model always thinks before answering.)
```
respond in the following format:
<think>
...
</think>
<output>
...
</output>
```
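Combining the system prompt above with the model's ChatML-style template gives a prompt like the one in the Format section. A minimal sketch of building it by hand (the template string is inferred from the example above; with `transformers` installed, `tokenizer.apply_chat_template` produces it for you):

```python
SYSTEM_PROMPT = """respond in the following format:
<think>
...
</think>
<output>
...
</output>"""


def build_prompt(user_message: str) -> str:
    """Format a single-turn conversation in the SmolLM2 ChatML-style template."""
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_prompt("2x-2=6, how much is X?")
```

The prompt ends at the opening of the assistant turn, so generation begins directly with the `<think>` block.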
# Examples:
All responses below were generated with our system prompt at a temperature of 0.7, inside the Android application ChatterUI via the GGUF Q8 quantization, using the model's prompt format.
1)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/5veZJmkjuv_7W7pKhvsu0.png)
2)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/pAwPdVkEZ7rnFf-TZ5tMU.png)
3)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/FDaWAAqgv2kvoZvjl8gjl.png)

# Uploaded model

- **Developed by:** Pinkstack
- **License:** apache-2.0
- **Finetuned from model:** HuggingFaceTB/SmolLM2-1.7B-Instruct