wikipedia_30

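This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. Its results on the evaluation set are the validation losses in the table below; the last logged value is 3.1731 at step 62000.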

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation loss were logged every 2000 steps:

| Training Loss | Epoch   | Step  | Validation Loss |
|---------------|---------|-------|-----------------|
| No log        | 2.3378  | 2000  | 6.9004          |
| 7.3814        | 4.6756  | 4000  | 6.9259          |
| 7.3814        | 7.0134  | 6000  | 6.9098          |
| 6.845         | 9.3513  | 8000  | 6.6689          |
| 6.845         | 11.6891 | 10000 | 5.7875          |
| 5.8652        | 14.0269 | 12000 | 5.1923          |
| 5.8652        | 16.3647 | 14000 | 4.8226          |
| 4.8065        | 18.7025 | 16000 | 4.5346          |
| 4.8065        | 21.0403 | 18000 | 4.3246          |
| 4.2309        | 23.3781 | 20000 | 4.1348          |
| 4.2309        | 25.7160 | 22000 | 3.9652          |
| 3.8185        | 28.0538 | 24000 | 3.8108          |
| 3.8185        | 30.3916 | 26000 | 3.7102          |
| 3.5163        | 32.7294 | 28000 | 3.6271          |
| 3.5163        | 35.0672 | 30000 | 3.5350          |
| 3.2957        | 37.4050 | 32000 | 3.5053          |
| 3.2957        | 39.7428 | 34000 | 3.4144          |
| 3.1388        | 42.0807 | 36000 | 3.3632          |
| 3.1388        | 44.4185 | 38000 | 3.3095          |
| 3.0197        | 46.7563 | 40000 | 3.3381          |
| 3.0197        | 49.0941 | 42000 | 3.3036          |
| 2.9398        | 51.4319 | 44000 | 3.2828          |
| 2.9398        | 53.7697 | 46000 | 3.2407          |
| 2.8775        | 56.1075 | 48000 | 3.2374          |
| 2.8775        | 58.4454 | 50000 | 3.2790          |
| 2.8378        | 60.7832 | 52000 | 3.1918          |
| 2.8378        | 63.1210 | 54000 | 3.1904          |
| 2.8089        | 65.4588 | 56000 | 3.1705          |
| 2.8089        | 67.7966 | 58000 | 3.1829          |
| 2.7826        | 70.1344 | 60000 | 3.2242          |
| 2.7826        | 72.4722 | 62000 | 3.1731          |
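
The logged numbers above can be read a little further with a short script. The sketch below copies the last six rows of the table, picks the step with the lowest validation loss, and converts that loss to a perplexity. It assumes the loss is a mean per-token cross-entropy in nats, which this card does not state, so the perplexity is only meaningful if that assumption holds. The steps-per-epoch figure is an estimate derived from the final row, not a recorded setting.

```python
import math

# (step, epoch, validation_loss) copied from the last six rows of the table.
rows = [
    (52000, 60.7832, 3.1918),
    (54000, 63.1210, 3.1904),
    (56000, 65.4588, 3.1705),
    (58000, 67.7966, 3.1829),
    (60000, 70.1344, 3.2242),
    (62000, 72.4722, 3.1731),
]

# Row with the lowest validation loss among the rows listed here.
best_step, best_epoch, best_loss = min(rows, key=lambda r: r[2])

# ASSUMPTION: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss). The card does not confirm this.
best_ppl = math.exp(best_loss)

# Rough estimate of optimizer steps per epoch from the final logged row.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

print(f"lowest validation loss: {best_loss:.4f} at step {best_step}")
print(f"perplexity under the stated assumption: {best_ppl:.1f}")
print(f"approximate steps per epoch: {steps_per_epoch:.0f}")
```

The lowest validation loss falls at step 56000 rather than at the final step, so if intermediate checkpoints were saved, the one from step 56000 may be marginally better than the last one.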