---
license: llama2
language:
- hu
---

Base Model:

https://huggingface.co/meta-llama/Llama-2-7b-chat-hf

---

This model is fine-tuned on a real news dataset and optimized for neural news generation.

Note: Hungarian was not part of the base model's pretraining data.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the tokenizer from the base model and the fine-tuned model weights
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = AutoModelForCausalLM.from_pretrained("tum-nlp/neural-news-llama-2-7b-chat-hu")

# Create the pipeline for neural news generation and set the repetition penalty >1.1 to discourage repetition
generator = pipeline('text-generation',
                     model=model,
                     tokenizer=tokenizer,
                     repetition_penalty=1.2)

# Define the prompt: a Hungarian title (Cím:), an article lead (Cikk:), and the [EOP] marker
prompt = "Cím: Ellenzéki politikai akció az ügyészséggel szemben Cikk: Az ügyészség visszautasítja az igazságszolgáltatást ért politikai nyomásgyakorlást – tájékoztatott [EOP]"

# Generate
generator(prompt, max_length=1000, num_return_sequences=1)
```

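The example prompt above follows a fixed template: the article title after "Cím:" (Title), the article lead after "Cikk:" (Article), and an "[EOP]" marker where generation should continue. A minimal sketch of helpers for building such prompts and stripping the prompt from the pipeline output is shown below; the function names are illustrative, not part of the model's API, and the template reading is an assumption based on the example prompt:

```python
def build_prompt(title: str, lead: str) -> str:
    # Illustrative helper: assemble a prompt in the
    # "Cím: ... Cikk: ... [EOP]" format used by the example above.
    return f"Cím: {title} Cikk: {lead} [EOP]"

def extract_continuation(generated_text: str, prompt: str) -> str:
    # text-generation pipelines return the prompt followed by the
    # continuation, so strip the prompt prefix to keep only new text.
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text

# Usage with the pipeline from the snippet above (not run here):
# out = generator(build_prompt(title, lead), max_length=1000)[0]["generated_text"]
# article = extract_continuation(out, build_prompt(title, lead))
```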
Trained on 6k datapoints (including all splits) from:

https://github.com/batubayk/news_datasets