|
--- |
|
base_model: ibm-granite/granite-3.1-2b-instruct |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- granite |
|
- trl |
|
- grpo |
|
- ruslanmv |
|
license: apache-2.0 |
|
language: |
|
- en |
|
--- |
|
|
|
# Granite-3.1-2B-Reasoning (Fine-tuned for Logical Reasoning) |
|
|
|
## Model Overview |
|
|
|
This model is a fine-tuned version of **ibm-granite/granite-3.1-2b-instruct**, optimized for **enhanced reasoning capabilities**. It was fine-tuned to improve performance on logical reasoning, structured problem-solving, and complex analytical tasks.
|
|
|
- **Developed by:** [ruslanmv](https://huggingface.co/ruslanmv) |
|
- **License:** Apache 2.0 |
|
- **Base Model:** [ibm-granite/granite-3.1-2b-instruct](https://huggingface.co/ibm-granite/granite-3.1-2b-instruct) |
|
- **Fine-tuned for:** Logical reasoning, structured problem-solving, long-context tasks |
|
- **Supported Languages:** English |
|
|
|
--- |
|
|
|
## Model Summary |
|
|
|
**Granite-3.1-2B-Reasoning** builds on IBM’s **Granite 3.1** language model series, which supports extended context lengths and strong multi-domain performance. This fine-tuned variant sharpens the base model's handling of complex reasoning tasks.
|
|
|
### Improvements Over the Base Model
|
✅ Improved **reasoning** and **problem-solving** skills |
|
✅ Optimized for **instruction-following** and **logical deduction** |
|
✅ Maintains the **efficiency and robustness** of Granite-3.1 |
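The `trl` and `grpo` tags in the metadata indicate the fine-tune used TRL's GRPO (Group Relative Policy Optimization) trainer. The actual training data and reward functions are not published, so the following is only a minimal, hypothetical sketch of what such a setup can look like; the toy dataset and the format-based reward are illustrative assumptions:

```python
# Hypothetical sketch of a TRL GRPO fine-tune; the real dataset and
# reward functions used for this model are not published.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt dataset (illustrative only; GRPOTrainer expects a "prompt" column).
train_dataset = Dataset.from_dict({"prompt": ["Calculate pi.", "Is 91 prime?"]})

def format_reward(completions, **kwargs):
    # Reward completions that follow the <reasoning>/<answer> template.
    return [1.0 if "<reasoning>" in c and "<answer>" in c else 0.0
            for c in completions]

trainer = GRPOTrainer(
    model="ibm-granite/granite-3.1-2b-instruct",
    reward_funcs=format_reward,
    args=GRPOConfig(output_dir="granite-3.1-2b-reasoning"),
    train_dataset=train_dataset,
)
trainer.train()
```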
|
|
|
--- |
|
|
|
## Installation & Usage |
|
|
|
Install the required dependencies: |
|
|
|
```bash
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
pip install bitsandbytes  # needed for the 4-bit quantization used below
```
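The usage example that follows assumes a CUDA-capable GPU, since it loads the model in float16 with 4-bit quantization and moves inputs to the GPU. You can confirm the environment first:

```python
import torch

# The usage example below assumes a CUDA-capable GPU.
assert torch.cuda.is_available(), "A CUDA GPU is required for this example"
print(torch.cuda.get_device_name(0))
```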
|
|
|
### Running the Model |
|
|
|
Use the following Python snippet to load and generate text with the fine-tuned model: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
import torch

# Model and tokenizer
model_name = "ruslanmv/granite-3.1-2b-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # or "cuda" if you have a single GPU
    torch_dtype=torch.float16,  # float16 halves memory use and speeds up inference
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit quantization; requires bitsandbytes
)

# Build the prompt
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Calculate pi."},
], tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)  # move inputs to the model's device

# Sampling parameters
generation_config = GenerationConfig(
    do_sample=True,        # required for temperature/top_p to take effect
    temperature=0.8,
    top_p=0.95,
    max_new_tokens=1024,
)

# Inference
with torch.inference_mode():  # disable autograd for faster generation
    outputs = model.generate(**inputs, generation_config=generation_config)

output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Strip everything up to and including the "assistant" turn marker
start_index = output.find("assistant")
if start_index != -1:
    output = output[start_index + len("assistant"):].strip()

print(output)
```
|
|
|
Example output:
|
``` |
|
<reasoning> |
|
Pi is an irrational number, which means it cannot be precisely calculated using finite decimal or fractional notation. It is typically represented by the Greek letter π and its approximate value is 3.14159. However, for a more precise calculation, we can use mathematical algorithms like the Leibniz formula for π or the Gregory-Leibniz series. |
|
|
|
The Leibniz formula for π is: |
|
|
|
π = 4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - 1/15 +...) |
|
|
|
This series converges slowly, so many terms are needed for a good approximation. For instance, using 10 terms, the approximation would be: |
|
|
|
π ≈ 4 * (1 - 0.3333333333333333 + 0.1111111111111111 - 0.0344827586206897 + 0.0090040875518672 - 0.0025958422650073 + 0.0006929403729561 - 0.0001866279043531 + 0.0000499753694946 - 0.0000133386323746 + 0.0000035303398593 - 0.0000009009433996) |
|
|
|
π ≈ 3.141592653589793 |
|
|
|
This is a rough approximation of π using 10 terms. For a more precise value, you can use more terms or employ other algorithms. |
|
|
|
</reasoning> |
|
|
|
<answer> |
|
π ≈ 3.141592653589793 |
|
</answer> |
|
``` |
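Because the system prompt constrains responses to the `<reasoning>`/`<answer>` format, the two blocks can be pulled apart with a small regex. A minimal sketch (the helper name `parse_response` is illustrative, not part of any library):

```python
import re

def parse_response(output: str):
    """Split a model response into its <reasoning> and <answer> blocks."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", output, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
    return (
        reasoning.group(1).strip() if reasoning else None,
        answer.group(1).strip() if answer else None,
    )

reasoning, answer = parse_response(output)
print("Answer:", answer)  # e.g. "π ≈ 3.141592653589793"
```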
|
|
|
--- |
|
|
|
## Intended Use |
|
|
|
Granite-3.1-2B-Reasoning is designed for tasks requiring structured **reasoning**, including the tasks below (a prompting sketch follows the list):
|
|
|
- **Logical and analytical problem-solving** |
|
- **Text-based reasoning tasks** |
|
- **Mathematical and symbolic reasoning** |
|
- **Advanced instruction-following** |
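For example, a symbolic-deduction prompt can reuse the objects from the usage example above (`tokenizer`, `model`, `SYSTEM_PROMPT`, `generation_config`); the puzzle itself is an illustrative placeholder:

```python
# Reuses tokenizer, model, SYSTEM_PROMPT, and generation_config from the
# usage example above; the puzzle is only an illustration.
text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies?"},
], tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```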
|
|
|
--- |
|
|
|
## License & Acknowledgments |
|
|
|
This model is released under the **Apache 2.0** license. It is fine-tuned from IBM’s **Granite-3.1-2B-Instruct** model. Special thanks to the **IBM Granite Team** for developing the base model.
|
|
|
For more details, visit the [IBM Granite Documentation](https://huggingface.co/ibm-granite). |
|
|
|
--- |
|
|
|
### Citation |
|
|
|
If you use this model in your research or applications, please cite: |
|
|
|
```bibtex
|
@misc{ruslanmv2025granite, |
|
title={Fine-Tuning Granite-3.1 for Advanced Reasoning}, |
|
author={Ruslan M.V.}, |
|
year={2025}, |
|
url={https://huggingface.co/ruslanmv/granite-3.1-2b-Reasoning} |
|
} |
|
``` |
|
|