# Dante-Zero Fine-tuned Model
This model was fine-tuned using Reinforcement Learning with Group Relative Policy Optimization (GRPO) to generate Dante-style poetry in endecasillabi (11-syllable lines).
## Model Details
- Base Model: PleIAs/Pleias-350m-Preview
- Training Method: GRPO (Group Relative Policy Optimization)
- Training Data: 1,000 chunks from Dante's Divine Comedy
- Epochs: 10
- Trained By: ruggsea
- Date: 2025-03-05
## Model Description
This model is specialized in generating Italian poetry in the style of Dante Alighieri's Divine Comedy. It has been trained to:
- Generate proper endecasillabi (11-syllable lines)
- Follow the structure of Dante's poetry
- Avoid repetition
- Create original content (not plagiarize the Divine Comedy)
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer
model = AutoModelForCausalLM.from_pretrained("ruggsea/dante-zero-2025-03-05")
tokenizer = AutoTokenizer.from_pretrained("ruggsea/dante-zero-2025-03-05")

# Generate poetry from a prompt (here, the opening line of the Divine Comedy)
prompt = "Nel mezzo del cammin di nostra vita"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,  # passes input_ids and attention_mask
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
## Reward Functions
The model was trained using several reward functions:
- Endecasillabo Checker: Rewards proper 11-syllable lines
- Plagiarism Checker: Penalizes copying from the Divine Comedy
- Verse Structure Checker: Encourages verse-like structure
- Repetition Penalty: Discourages repetitive patterns
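To illustrate the first of these signals, here is a minimal sketch of an endecasillabo-style reward. It is not the training code: the function name is hypothetical, and it counts syllables naively as contiguous vowel groups, ignoring Italian synalepha, dialefe, and diaeresis, which a real checker would have to handle.

```python
import re

def endecasillabo_reward(text: str) -> float:
    """Toy reward: +1 for each non-empty line whose naive syllable
    count is exactly 11. Syllables are approximated as contiguous
    vowel groups, a rough proxy for Italian scansion."""
    reward = 0.0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Count runs of vowels (including accented Italian vowels)
        syllables = len(re.findall(r"[aeiouàèéìíòóùú]+", line.lower()))
        if syllables == 11:
            reward += 1.0
    return reward
```

Under this naive count, Dante's opening line "Nel mezzo del cammin di nostra vita" happens to score as 11 syllables and earns a reward of 1.0; in GRPO, such per-completion scores are compared within a sampled group to compute advantages.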
## License
This model is available under the same license as the base model (PleIAs/Pleias-350m-Preview).