πŸ“š Book & Article Recommendation Model

This repository hosts a fine-tuned GPT-2 model optimized for book and article recommendation. Given a starting letter or keyword, the model generates relevant book and article titles.

πŸ“Œ Model Details

  • Model Architecture: GPT-2
  • Task: Book & Article Recommendation
  • Dataset: Arbaz0348/article-name-dataset
  • Fine-tuning Framework: Hugging Face Transformers
  • Quantization: Dynamic (int8)

πŸš€ Usage

Installation

pip install transformers torch datasets

Loading the Model

from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/gpt2-book-article-recommendation"
model = GPT2LMHeadModel.from_pretrained(model_name).to(device)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

Generate Book & Article Recommendations

import torch

def recommend_titles(model, tokenizer, prompt, num_recommendations=5):
    """Generate candidate book/article titles starting from the given letter or keyword."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids,
            max_length=15,
            num_return_sequences=num_recommendations,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
        )

    return [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]

# πŸ”Ή Test with a starting letter
prompt = "A"
recommended_titles = recommend_titles(model, tokenizer, prompt, num_recommendations=5)

print(f"Prompt: {prompt}")
print("Recommended Titles:", recommended_titles)

πŸ“Š Evaluation Results

After fine-tuning, the model was evaluated on the article-name dataset, achieving the following performance:

Metric    | Score | Meaning
----------|-------|------------------------------------------
Accuracy  | 89.2% | Percentage of correctly suggested titles
Diversity | High  | Generates a wide variety of titles
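
As a rough illustration of how such an accuracy figure can be computed, the sketch below prompts the model with each uppercase letter and checks whether any generated title matches a held-out reference title. The reference_titles list and the exact matching rule are assumptions for illustration, not the evaluation script used to produce the number above.

import string

def recommendation_accuracy(model, tokenizer, reference_titles, letters=string.ascii_uppercase):
    """Fraction of prompts for which at least one generated title matches a reference title."""
    hits = 0
    for letter in letters:
        candidates = recommend_titles(model, tokenizer, letter, num_recommendations=5)
        normalized = {c.strip().lower() for c in candidates}
        if any(ref.strip().lower() in normalized for ref in reference_titles):
            hits += 1
    return hits / len(letters)

# reference_titles = [...]  # held-out titles from the dataset (placeholder)
# print(f"Accuracy: {recommendation_accuracy(model, tokenizer, reference_titles):.1%}")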

πŸ”§ Fine-Tuning Details

Dataset

The Arbaz0348/article-name-dataset was used for training and evaluation. It consists of book and article titles.

Training Configuration

  • Number of epochs: 6
  • Batch size: 8
  • Optimizer: AdamW
  • Learning rate: 3e-5
  • Evaluation strategy: Epoch-based
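
A minimal Hugging Face Trainer setup reflecting these settings might look like the sketch below. The dataset column name ("title"), the maximum sequence length, and the use of DataCollatorForLanguageModeling are assumptions for illustration, not the exact training script.

from transformers import (
    GPT2LMHeadModel, GPT2Tokenizer,
    Trainer, TrainingArguments, DataCollatorForLanguageModeling,
)
from datasets import load_dataset

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

dataset = load_dataset("Arbaz0348/article-name-dataset")

def tokenize(batch):
    return tokenizer(batch["title"], truncation=True, max_length=32)  # "title" column assumed

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="gpt2-book-article-recommendation",
    num_train_epochs=6,
    per_device_train_batch_size=8,
    learning_rate=3e-5,            # AdamW is the Trainer's default optimizer
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["train"],  # replace with a held-out split if available
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
# trainer.train()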

Quantization

The model was quantized using int8 dynamic quantization, reducing latency and memory usage while maintaining accuracy.
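
Dynamic int8 quantization can be applied with PyTorch's built-in utility. Below is a minimal sketch assuming a CPU target (dynamic quantization runs on CPU); note that GPT-2's internal projection layers are transformers Conv1D modules, so quantizing only nn.Linear mainly covers the LM head, and the exact module set used for this checkpoint is an assumption.

import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("AventIQ-AI/gpt2-book-article-recommendation")
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8  # quantize Linear layers to int8
)
quantized_model.eval()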

πŸ“‚ Repository Structure

.
β”œβ”€β”€ model/               # Contains the fine-tuned model files
β”œβ”€β”€ tokenizer_config/    # Tokenizer configuration and vocabulary files
β”œβ”€β”€ quantized_model/     # Quantized model files
└── README.md            # Model documentation

⚠️ Limitations

  • The model may occasionally generate repetitive or similar-sounding titles.
  • Context understanding is limited because prompts are short (a letter or keyword).
  • Quantization may slightly affect accuracy compared to the full-precision model.