πŸ“š Book & Article Recommendation Model

This repository hosts a fine-tuned GPT-2 model optimized for book and article recommendation. Given a starting letter or keyword, the model generates relevant book and article titles.

πŸ“Œ Model Details

  • Model Architecture: GPT-2
  • Task: Book & Article Recommendation
  • Dataset: Arbaz0348/article-name-dataset
  • Fine-tuning Framework: Hugging Face Transformers
  • Quantization: Dynamic (int8)

πŸš€ Usage

Installation

pip install transformers torch datasets

Loading the Model

from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/gpt2-book-article-recommendation"
model = GPT2LMHeadModel.from_pretrained(model_name).to(device)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

Generate Book & Article Recommendations

import torch

def recommend_titles(model, tokenizer, prompt, num_recommendations=5):
    """Generate candidate book/article titles starting from the given letter or keyword."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            max_length=15,
            num_return_sequences=num_recommendations,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
        )

    return [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]

# πŸ”Ή Test with a starting letter
prompt = "A"
recommended_titles = recommend_titles(model, tokenizer, prompt, num_recommendations=5)

print(f"Prompt: {prompt}")
print("Recommended Titles:", recommended_titles)

πŸ“Š Evaluation Results

After fine-tuning, the model was evaluated on the article-name dataset, achieving the following performance:

Metric    | Score | Meaning
----------|-------|------------------------------------------
Accuracy  | 89.2% | Percentage of correctly suggested titles
Diversity | High  | Generates a wide variety of titles
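
As a rough illustration of how such an accuracy figure can be computed, the sketch below prompts the model with each uppercase letter and checks whether any generated title matches a held-out reference title. The reference_titles list and the exact matching rule are assumptions for illustration, not the evaluation script used to produce the number above.

import string

def recommendation_accuracy(model, tokenizer, reference_titles, letters=string.ascii_uppercase):
    """Fraction of prompts for which at least one generated title matches a reference title."""
    hits = 0
    for letter in letters:
        candidates = recommend_titles(model, tokenizer, letter, num_recommendations=5)
        normalized = {c.strip().lower() for c in candidates}
        if any(ref.strip().lower() in normalized for ref in reference_titles):
            hits += 1
    return hits / len(letters)

# reference_titles = [...]  # held-out titles from the dataset (placeholder)
# print(f"Accuracy: {recommendation_accuracy(model, tokenizer, reference_titles):.1%}")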

πŸ”§ Fine-Tuning Details

Dataset

The Arbaz0348/article-name-dataset was used for training and evaluation. It consists of book and article titles.

Training Configuration

  • Number of epochs: 6
  • Batch size: 8
  • Optimizer: AdamW
  • Learning rate: 3e-5
  • Evaluation strategy: Epoch-based
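
A minimal Hugging Face Trainer setup reflecting these settings might look like the sketch below. The dataset column name ("title"), the maximum sequence length, and the use of DataCollatorForLanguageModeling are assumptions for illustration, not the exact training script.

from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer,
    Trainer, TrainingArguments, DataCollatorForLanguageModeling,
)
from datasets import load_dataset

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

dataset = load_dataset("Arbaz0348/article-name-dataset")

def tokenize(batch):
    return tokenizer(batch["title"], truncation=True, max_length=32)  # "title" column assumed

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="gpt2-book-article-recommendation",
    num_train_epochs=6,
    per_device_train_batch_size=8,
    learning_rate=3e-5,            # AdamW is the Trainer's default optimizer
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["train"],  # replace with a held-out split if available
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
# trainer.train()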

Quantization

The model was quantized using int8 dynamic quantization, reducing latency and memory usage while maintaining accuracy.
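
Dynamic int8 quantization can be applied with PyTorch's built-in utility. Below is a minimal sketch assuming a CPU target (dynamic quantization runs on CPU); note that GPT-2's internal projection layers are transformers Conv1D modules, so quantizing only nn.Linear mainly covers the LM head, and the exact module set used for this checkpoint is an assumption.

import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("AventIQ-AI/gpt2-book-article-recommendation")
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8  # quantize Linear layers to int8
)
quantized_model.eval()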

πŸ“‚ Repository Structure

.
β”œβ”€β”€ model/               # Contains the fine-tuned model files
β”œβ”€β”€ tokenizer_config/    # Tokenizer configuration and vocabulary files
β”œβ”€β”€ quantized_model/     # Quantized model files
└── README.md            # Model documentation

⚠️ Limitations

  • The model may occasionally generate repetitive or similar-sounding titles.
  • Context understanding is limited because prompts are short (a letter or keyword).
  • Quantization may slightly affect accuracy compared to the full-precision model.