lilferrit/ft-wmt14-5

ft-wmt14-5

This model is a fine-tuned version of google/mt5-small on the lilferrit/wmt14-short dataset. It achieves the following results on the evaluation set (these match the step 90000 row of the training results):

Loss: 2.0604
Bleu: 20.7584
Gen Len: 30.499
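
As a rough illustration of how the checkpoint could be loaded for generation, here is a minimal sketch using the standard Transformers seq2seq classes. The card does not state the language pair or whether a task prefix was used in training, so the input is passed through unchanged; the `translate` helper and the `max_new_tokens=64` default are hypothetical choices for this example, not settings from the card.

```python
MODEL_ID = "lilferrit/ft-wmt14-5"  # repository id of this model


def translate(text: str, max_new_tokens: int = 64) -> str:
    """Generate output text for a single input string.

    Assumption: the input is fed to the model as-is, because the card does
    not document the language pair or any task prefix.
    """
    # Imported lazily so this module loads without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and decode the first generated sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("...")` with a sentence in the source language downloads the weights on first use. The reported average generation length of about 30 tokens suggests that a 64-token limit leaves some headroom, but that limit is a guess rather than the evaluation setting.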

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adafactor
lr_scheduler_type: constant
training_steps: 100000
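
The hyperparameters above could be expressed as Trainer arguments roughly as follows. This is a sketch, not the author's actual training script: the `output_dir` value, the evaluation interval (inferred from the 10,000-step spacing of the training results), `predict_with_generate`, and the single-device assumption are not stated in the list.

```python
# Keyword arguments mirroring the hyperparameter list on this card.
training_kwargs = {
    "output_dir": "ft-wmt14-5",          # hypothetical output path
    "learning_rate": 5e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "seed": 42,
    "optim": "adafactor",
    "lr_scheduler_type": "constant",
    "max_steps": 100_000,
    # Assumptions below: not listed among the hyperparameters.
    "evaluation_strategy": "steps",      # newer releases rename this to eval_strategy
    "eval_steps": 10_000,                # inferred from the results table spacing
    "predict_with_generate": True,       # needed to score generated text (Bleu, Gen Len)
}

NUM_DEVICES = 1  # assumption: a single device, consistent with 8 * 2 = 16

# Effective batch size per optimizer step and total examples processed.
total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)
examples_seen = total_train_batch_size * training_kwargs["max_steps"]


def build_training_args():
    """Build the arguments object; imports transformers only when called."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```

Under these assumptions the effective batch size works out to the listed `total_train_batch_size` of 16, and 100,000 steps correspond to 1.6 million training examples processed.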

Training results

Training Loss	Epoch	Step	Bleu	Gen Len	Validation Loss
1.9166	0.2778	10000	15.8119	32.097	2.3105
1.7184	0.5556	20000	17.5903	31.1153	2.1993
1.6061	0.8333	30000	18.9604	30.327	2.1380
1.516	1.1111	40000	19.1444	30.2727	2.1366
1.4675	1.3889	50000	19.7588	30.1127	2.1208
1.4416	1.6667	60000	19.9263	30.4463	2.0889
1.4111	1.9444	70000	20.3323	30.1207	2.0795
1.3603	2.2222	80000	20.5373	30.5943	2.0850
1.3378	2.5	90000	20.7584	30.499	2.0604
1.3381	2.7778	100000	20.6113	30.701	2.0597
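
The Epoch and Step columns also allow a rough estimate of the training set size, which the card does not state directly. The sketch below assumes that the epoch value is simply the step count divided by the number of optimizer steps per epoch, and that every step consumes the listed total batch of 16 examples; the result is an approximation, not a documented figure.

```python
from statistics import median

# (epoch, step) pairs copied from the training results table.
EPOCH_STEP = [
    (0.2778, 10_000),
    (0.5556, 20_000),
    (0.8333, 30_000),
    (1.1111, 40_000),
    (1.3889, 50_000),
    (1.6667, 60_000),
    (1.9444, 70_000),
    (2.2222, 80_000),
    (2.5, 90_000),
    (2.7778, 100_000),
]

TOTAL_TRAIN_BATCH_SIZE = 16  # total_train_batch_size from the hyperparameter list

# Each row gives a slightly different ratio because the epoch values are
# rounded to four decimals; the median smooths out that rounding noise.
steps_per_epoch = median(step / epoch for epoch, step in EPOCH_STEP)
examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print(f"~{steps_per_epoch:,.0f} optimizer steps per epoch")
print(f"~{examples_per_epoch:,.0f} training examples per epoch")
```

Read this way, the run covers roughly 36,000 steps per epoch, or on the order of 576,000 training examples, and the 100,000 training steps amount to a little under three passes over the data.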

Framework versions

Transformers 4.40.0
PyTorch 2.2.2+cu121
Datasets 2.19.0
Tokenizers 0.19.1


Model size: 300M params (Safetensors)
Tensor type: F32


Evaluation results

Bleu on lilferrit/wmt14-short (self-reported): 20.758
