|
--- |
|
base_model: ibm-granite/granite-3.1-2b-instruct |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- granite |
|
- trl |
|
- grpo |
|
- ruslanmv |
|
license: apache-2.0 |
|
language: |
|
- en |
|
--- |
|
|
|
# Granite-3.1-2B-Reasoning (Fine-tuned for Logical Reasoning) |
|
|
|
## Model Overview |
|
|
|
This model is a fine-tuned version of **ibm-granite/granite-3.1-2b-instruct**, optimized for **enhanced reasoning capabilities**. It was fine-tuned to improve performance on logical reasoning, structured problem-solving, and complex analytical tasks.
|
|
|
- **Developed by:** [ruslanmv](https://huggingface.co/ruslanmv) |
|
- **License:** Apache 2.0 |
|
- **Base Model:** [ibm-granite/granite-3.1-2b-instruct](https://huggingface.co/ibm-granite/granite-3.1-2b-instruct) |
|
- **Fine-tuned for:** Logical reasoning, structured problem-solving, long-context tasks |
|
- **Supported Languages:** English |
|
|
|
--- |
|
|
|
## Model Summary |
|
|
|
**Granite-3.1-2B-Reasoning** builds on IBM’s **Granite 3.1** language model series, which supports extended context lengths and strong multi-domain performance. This fine-tuned variant sharpens the base model's handling of complex reasoning tasks.
|
|
|
### Improvements Over the Base Model
|
✅ Improved **reasoning** and **problem-solving** skills |
|
✅ Optimized for **instruction-following** and **logical deduction** |
|
✅ Maintains the **efficiency and robustness** of Granite-3.1 |
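The `trl` and `grpo` tags in the metadata indicate the fine-tune used TRL's GRPO (Group Relative Policy Optimization) trainer. The actual training data and reward functions are not published, so the following is only a minimal, hypothetical sketch of what such a setup can look like; the toy dataset and the format-based reward are illustrative assumptions:

```python
# Hypothetical sketch of a TRL GRPO fine-tune; the real dataset and
# reward functions used for this model are not published.
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt dataset (illustrative only; GRPOTrainer expects a "prompt" column).
train_dataset = Dataset.from_dict({"prompt": ["Calculate pi.", "Is 91 prime?"]})

def format_reward(completions, **kwargs):
    # Reward completions that follow the <reasoning>/<answer> template.
    return [1.0 if "<reasoning>" in c and "<answer>" in c else 0.0
            for c in completions]

trainer = GRPOTrainer(
    model="ibm-granite/granite-3.1-2b-instruct",
    reward_funcs=format_reward,
    args=GRPOConfig(output_dir="granite-3.1-2b-reasoning"),
    train_dataset=train_dataset,
)
trainer.train()
```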
|
|
|
--- |
|
|
|
## Installation & Usage |
|
|
|
Install the required dependencies: |
|
|
|
```bash
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
pip install bitsandbytes  # needed for the 4-bit quantization used below
```
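The usage example that follows assumes a CUDA-capable GPU, since it loads the model in float16 with 4-bit quantization and moves inputs to the GPU. You can confirm the environment first:

```python
import torch

# The usage example below assumes a CUDA-capable GPU.
assert torch.cuda.is_available(), "A CUDA GPU is required for this example"
print(torch.cuda.get_device_name(0))
```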
|
|
|
### Running the Model |
|
|
|
Use the following Python snippet to load and generate text with the fine-tuned model: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
import torch

# Model and tokenizer
model_name = "ruslanmv/granite-3.1-2b-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # or "cuda" if you have a single GPU
    torch_dtype=torch.float16,  # float16 halves memory use and speeds up inference
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit quantization; requires bitsandbytes
)

# Build the prompt
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Calculate pi."},
], tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)  # move inputs to the model's device

# Sampling parameters
generation_config = GenerationConfig(
    do_sample=True,        # required for temperature/top_p to take effect
    temperature=0.8,
    top_p=0.95,
    max_new_tokens=1024,
)

# Inference
with torch.inference_mode():  # disable autograd for faster generation
    outputs = model.generate(**inputs, generation_config=generation_config)

output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Strip everything up to and including the "assistant" turn marker
start_index = output.find("assistant")
if start_index != -1:
    output = output[start_index + len("assistant"):].strip()

print(output)
```
|
|
|
Example output:
|
``` |
|
<reasoning> |
|
Pi is an irrational number, which means it cannot be precisely calculated using finite decimal or fractional notation. It is typically represented by the Greek letter π and its approximate value is 3.14159. However, for a more precise calculation, we can use mathematical algorithms like the Leibniz formula for π or the Gregory-Leibniz series. |
|
|
|
The Leibniz formula for π is: |
|
|
|
π = 4 * (1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - 1/15 +...) |
|
|
|
This series converges slowly, so many terms are needed for a good approximation. For instance, using 10 terms, the approximation would be: |
|
|
|
π ≈ 4 * (1 - 0.3333333333333333 + 0.1111111111111111 - 0.0344827586206897 + 0.0090040875518672 - 0.0025958422650073 + 0.0006929403729561 - 0.0001866279043531 + 0.0000499753694946 - 0.0000133386323746 + 0.0000035303398593 - 0.0000009009433996) |
|
|
|
π ≈ 3.141592653589793 |
|
|
|
This is a rough approximation of π using 10 terms. For a more precise value, you can use more terms or employ other algorithms. |
|
|
|
</reasoning> |
|
|
|
<answer> |
|
π ≈ 3.141592653589793 |
|
</answer> |
|
``` |
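Because the system prompt constrains responses to the `<reasoning>`/`<answer>` format, the two blocks can be pulled apart with a small regex. A minimal sketch (the helper name `parse_response` is illustrative, not part of any library):

```python
import re

def parse_response(output: str):
    """Split a model response into its <reasoning> and <answer> blocks."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", output, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
    return (
        reasoning.group(1).strip() if reasoning else None,
        answer.group(1).strip() if answer else None,
    )

reasoning, answer = parse_response(output)
print("Answer:", answer)  # e.g. "π ≈ 3.141592653589793"
```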
|
|
|
--- |
|
|
|
## Intended Use |
|
|
|
Granite-3.1-2B-Reasoning is designed for tasks requiring structured **reasoning**, including the tasks below (a prompting sketch follows the list):
|
|
|
- **Logical and analytical problem-solving** |
|
- **Text-based reasoning tasks** |
|
- **Mathematical and symbolic reasoning** |
|
- **Advanced instruction-following** |
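For example, a symbolic-deduction prompt can reuse the objects from the usage example above (`tokenizer`, `model`, `SYSTEM_PROMPT`, `generation_config`); the puzzle itself is an illustrative placeholder:

```python
# Reuses tokenizer, model, SYSTEM_PROMPT, and generation_config from the
# usage example above; the puzzle is only an illustration.
text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies?"},
], tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```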
|
|
|
--- |
|
|
|
## License & Acknowledgments |
|
|
|
This model is released under the **Apache 2.0** license. It is fine-tuned from IBM’s **Granite-3.1-2B-Instruct** model. Special thanks to the **IBM Granite Team** for developing the base model.
|
|
|
For more details, visit the [IBM Granite Documentation](https://huggingface.co/ibm-granite). |
|
|
|
--- |
|
|
|
### Citation |
|
|
|
If you use this model in your research or applications, please cite: |
|
|
|
```bibtex
|
@misc{ruslanmv2025granite, |
|
title={Fine-Tuning Granite-3.1 for Advanced Reasoning}, |
|
author={Ruslan M.V.}, |
|
year={2025}, |
|
url={https://huggingface.co/ruslanmv/granite-3.1-2b-Reasoning} |
|
} |
|
``` |
|
|