---
base_model: ibm-granite/granite-3.1-8b-instruct
tags:
- text-generation
- transformers
- safetensors
- english
- text-generation-inference
- ruslanmv
- granite
- trl
- inference-endpoints
license: apache-2.0
language:
- en
---

# Granite-3.1-8B-Reasoning-LORA (Efficient Fine-Tuned Model)

## Model Overview

This model is a **LoRA fine-tuned version** of **ibm-granite/granite-3.1-8b-instruct**, optimized for **advanced reasoning tasks** at **low computational cost**. Because **LoRA (Low-Rank Adaptation)** trains only a small set of low-rank update matrices on top of the frozen base weights, the model retains the full capabilities of the base model while adding targeted improvements for **logical and analytical reasoning**.

- **Developed by:** [ruslanmv](https://huggingface.co/ruslanmv)
- **License:** Apache 2.0
- **Base Model:** [ibm-granite/granite-3.1-8b-instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct)
- **Fine-tuned for:** Logical reasoning, structured problem-solving, long-context tasks
- **Training Method:** **LoRA (Low-Rank Adaptation)**
- **Supported Languages:** English

---

## Why Use the LoRA Version?

This **LoRA fine-tuned model** provides **several benefits**:

✅ **Memory-efficient fine-tuning** — only the small adapter weights are trained, not all 8B base parameters
✅ **2x faster training** using **Unsloth and Hugging Face TRL**
✅ **Retains the base model's capabilities** while enhancing reasoning skills
✅ **Easy to merge or combine with other adapters** for specific tasks (see the merging sketch at the end of this card)

---

## Installation & Usage

To use this **LoRA fine-tuned** model, install the necessary dependencies:

```bash
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
pip install peft
pip install bitsandbytes
```

### Running the Model

Load the base model and attach the LoRA adapter on top of it:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

base_model_path = "ibm-granite/granite-3.1-8b-instruct"
lora_model_path = "ruslanmv/granite-3.1-8b-Reasoning-LORA"

tokenizer = AutoTokenizer.from_pretrained(base_model_path)

# Load the base model, then apply the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,  # halves memory vs. the float32 default
    device_map="auto",
)
model = PeftModel.from_pretrained(model, lora_model_path)
model.eval()

input_text = "Can you explain the difference between inductive and deductive reasoning?"
input_tokens = tokenizer(input_text, return_tensors="pt").to(device)

# max_length counts prompt + generated tokens; raise it for longer answers.
output = model.generate(**input_tokens, max_length=4000)
output_text = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
print(output_text)
```

---

## Intended Use

Granite-3.1-8B-Reasoning-LORA is optimized for **efficient reasoning** while keeping **computational costs low**, making it ideal for:

- **Logical and analytical problem-solving**
- **Text-based reasoning tasks**
- **Mathematical and symbolic reasoning**
- **Advanced instruction-following**

This LoRA-based fine-tuning approach is particularly useful for **lightweight deployment** and **quick adaptation** to specific tasks.

---

## License & Acknowledgments

This model is released under the **Apache 2.0** license. It is fine-tuned from IBM's **Granite 3.1-8B-Instruct** model using **LoRA fine-tuning**. Special thanks to the **IBM Granite Team** for developing the base model.

For more details, visit the [IBM Granite Documentation](https://huggingface.co/ibm-granite).
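---

## Merging the Adapter for Deployment

Because LoRA stores its updates as small low-rank matrices, the adapter can be folded back into the base weights once fine-tuning is done, removing the per-token adapter overhead at inference time. The sketch below is a minimal example using PEFT's `merge_and_unload()`; `merged_output_dir` is a hypothetical local path chosen for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_path = "ibm-granite/granite-3.1-8b-instruct"
lora_model_path = "ruslanmv/granite-3.1-8b-Reasoning-LORA"
merged_output_dir = "./granite-3.1-8b-reasoning-merged"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter, then fold its low-rank updates into the base weights.
model = PeftModel.from_pretrained(model, lora_model_path)
merged_model = model.merge_and_unload()

# Save a plain transformers checkpoint that no longer requires peft at load time.
merged_model.save_pretrained(merged_output_dir)
tokenizer.save_pretrained(merged_output_dir)
```

The merged checkpoint can then be loaded with `AutoModelForCausalLM.from_pretrained(merged_output_dir)` alone, which simplifies serving stacks that do not bundle `peft`.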
---

### Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{ruslanmv2025granite,
  title={LoRA Fine-Tuning of Granite-3.1-8B for Advanced Reasoning},
  author={Ruslan M.V.},
  year={2025},
  url={https://huggingface.co/ruslanmv/granite-3.1-8b-Reasoning-LORA}
}
```