testner

This model is a fine-tuned version of deepset/gbert-large on an unspecified dataset. It achieves the following results on the evaluation set (a sketch of how such metrics are typically computed follows the list):

  • Loss: 1.3944
  • Precision: 0.2579
  • Recall: 0.2364
  • F1: 0.2467
  • Accuracy: 0.8626
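
Precision, recall, and F1 here are presumably the entity-level scores produced by seqeval in the standard Hugging Face token-classification recipe; this is an assumption, since the card does not document the evaluation code. A minimal sketch of such a compute_metrics function, with `label_list` as a placeholder for the undocumented label set:

```python
# Hypothetical sketch of the usual seqeval-based metrics for token
# classification; the actual evaluation code is not documented, and
# label_list below stands in for the unknown label set.
import numpy as np
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

label_list = ["O", "B-ENT", "I-ENT"]  # placeholder: real labels are unknown

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # -100 marks sub-word pieces and special tokens; they are excluded.
    true_labels = [[label_list[l] for l in row if l != -100] for row in labels]
    true_preds = [
        [label_list[p] for p, l in zip(pred_row, label_row) if l != -100]
        for pred_row, label_row in zip(predictions, labels)
    ]
    return {
        "precision": precision_score(true_labels, true_preds),
        "recall": recall_score(true_labels, true_preds),
        "f1": f1_score(true_labels, true_preds),
        "accuracy": accuracy_score(true_labels, true_preds),
    }
```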

Model description

More information needed

Intended uses & limitations

More information needed
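
A minimal usage sketch with the transformers pipeline, assuming this checkpoint is the one published as MSLars/nonsense-detection (the repository this card belongs to); the label set it predicts is undocumented:

```python
# Minimal inference sketch; assumes the checkpoint id MSLars/nonsense-detection.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="MSLars/nonsense-detection",
    aggregation_strategy="simple",  # merge sub-word pieces into entity spans
)
# gbert-large is a German model, so German input is the natural fit.
print(ner("Der Vertrag wurde am Montag in Berlin unterschrieben."))
```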

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50
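
A minimal sketch of these settings with the Hugging Face Trainer; the fine-tuning dataset and label count are not documented, so the dataset objects and num_labels below are placeholders that must be replaced:

```python
# Hypothetical reproduction of the listed hyperparameters with the Hugging Face
# Trainer (transformers 4.48.x). Dataset and label set are undocumented.
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("deepset/gbert-large")
model = AutoModelForTokenClassification.from_pretrained(
    "deepset/gbert-large",
    num_labels=3,  # placeholder: the real label count is undocumented
)

args = TrainingArguments(
    output_dir="testner",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",      # AdamW, betas=(0.9, 0.999), eps=1e-8 (defaults)
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50,
    eval_strategy="epoch",    # the results table reports metrics once per epoch
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # placeholder: undocumented dataset
    eval_dataset=eval_ds,    # placeholder
    data_collator=DataCollatorForTokenClassification(tokenizer),
    compute_metrics=compute_metrics,  # e.g. the seqeval sketch above
)
trainer.train()
```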

Training results

| Training Loss | Epoch | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|---------------|-------|-------|-----------------|-----------|--------|--------|----------|
| No log        | 1.0   | 229   | 0.5531          | 0.0305    | 0.0133 | 0.0185 | 0.8440   |
| No log        | 2.0   | 458   | 0.5008          | 0.1281    | 0.1055 | 0.1157 | 0.8582   |
| 0.5648        | 3.0   | 687   | 0.5616          | 0.1521    | 0.1648 | 0.1582 | 0.8532   |
| 0.5648        | 4.0   | 916   | 0.5269          | 0.1665    | 0.2315 | 0.1937 | 0.8536   |
| 0.2466        | 5.0   | 1145  | 0.6401          | 0.1885    | 0.2073 | 0.1975 | 0.8551   |
| 0.2466        | 6.0   | 1374  | 0.6759          | 0.1944    | 0.2036 | 0.1989 | 0.8592   |
| 0.1155        | 7.0   | 1603  | 0.7172          | 0.1859    | 0.2206 | 0.2018 | 0.8559   |
| 0.1155        | 8.0   | 1832  | 0.8176          | 0.2000    | 0.2194 | 0.2092 | 0.8555   |
| 0.0612        | 9.0   | 2061  | 0.8450          | 0.1904    | 0.2315 | 0.2090 | 0.8519   |
| 0.0612        | 10.0  | 2290  | 0.9029          | 0.1895    | 0.2048 | 0.1969 | 0.8535   |
| 0.0376        | 11.0  | 2519  | 0.9917          | 0.2097    | 0.2194 | 0.2145 | 0.8548   |
| 0.0376        | 12.0  | 2748  | 0.9464          | 0.2346    | 0.2485 | 0.2413 | 0.8609   |
| 0.0376        | 13.0  | 2977  | 1.0170          | 0.2295    | 0.2412 | 0.2352 | 0.8585   |
| 0.0220        | 14.0  | 3206  | 0.9993          | 0.2259    | 0.2242 | 0.2251 | 0.8590   |
| 0.0220        | 15.0  | 3435  | 1.0762          | 0.2194    | 0.2473 | 0.2325 | 0.8528   |
| 0.0152        | 16.0  | 3664  | 1.0343          | 0.2434    | 0.2364 | 0.2399 | 0.8616   |
| 0.0152        | 17.0  | 3893  | 1.0420          | 0.2241    | 0.2388 | 0.2312 | 0.8570   |
| 0.0137        | 18.0  | 4122  | 1.1025          | 0.2214    | 0.2206 | 0.2210 | 0.8610   |
| 0.0137        | 19.0  | 4351  | 1.0975          | 0.2186    | 0.2339 | 0.2260 | 0.8540   |
| 0.0099        | 20.0  | 4580  | 1.1521          | 0.2281    | 0.2436 | 0.2356 | 0.8592   |
| 0.0099        | 21.0  | 4809  | 1.1143          | 0.2080    | 0.2461 | 0.2254 | 0.8527   |
| 0.0084        | 22.0  | 5038  | 1.2333          | 0.2368    | 0.2400 | 0.2384 | 0.8567   |
| 0.0084        | 23.0  | 5267  | 1.1713          | 0.2367    | 0.2364 | 0.2365 | 0.8595   |
| 0.0084        | 24.0  | 5496  | 1.2162          | 0.2599    | 0.2315 | 0.2449 | 0.8643   |
| 0.0065        | 25.0  | 5725  | 1.1444          | 0.2467    | 0.2473 | 0.2470 | 0.8600   |
| 0.0065        | 26.0  | 5954  | 1.2645          | 0.2512    | 0.2545 | 0.2529 | 0.8617   |
| 0.0046        | 27.0  | 6183  | 1.2562          | 0.2252    | 0.2255 | 0.2253 | 0.8610   |
| 0.0046        | 28.0  | 6412  | 1.2663          | 0.2516    | 0.2327 | 0.2418 | 0.8615   |
| 0.0043        | 29.0  | 6641  | 1.2686          | 0.2565    | 0.2497 | 0.2531 | 0.8622   |
| 0.0043        | 30.0  | 6870  | 1.2411          | 0.2342    | 0.2521 | 0.2428 | 0.8586   |
| 0.0037        | 31.0  | 7099  | 1.2620          | 0.2553    | 0.2485 | 0.2518 | 0.8626   |
| 0.0037        | 32.0  | 7328  | 1.3049          | 0.2506    | 0.2400 | 0.2452 | 0.8593   |
| 0.0030        | 33.0  | 7557  | 1.2796          | 0.2516    | 0.2339 | 0.2425 | 0.8633   |
| 0.0030        | 34.0  | 7786  | 1.3039          | 0.2484    | 0.2339 | 0.2409 | 0.8625   |
| 0.0025        | 35.0  | 8015  | 1.3241          | 0.2597    | 0.2436 | 0.2514 | 0.8618   |
| 0.0025        | 36.0  | 8244  | 1.3132          | 0.2475    | 0.2436 | 0.2456 | 0.8613   |
| 0.0025        | 37.0  | 8473  | 1.3445          | 0.2500    | 0.2388 | 0.2443 | 0.8620   |
| 0.0020        | 38.0  | 8702  | 1.3669          | 0.2556    | 0.2339 | 0.2443 | 0.8635   |
| 0.0020        | 39.0  | 8931  | 1.3566          | 0.2623    | 0.2448 | 0.2533 | 0.8622   |
| 0.0018        | 40.0  | 9160  | 1.3300          | 0.2447    | 0.2388 | 0.2417 | 0.8620   |
| 0.0018        | 41.0  | 9389  | 1.3311          | 0.2397    | 0.2400 | 0.2399 | 0.8624   |
| 0.0019        | 42.0  | 9618  | 1.3368          | 0.2469    | 0.2412 | 0.2440 | 0.8625   |
| 0.0019        | 43.0  | 9847  | 1.3701          | 0.2430    | 0.2412 | 0.2421 | 0.8624   |
| 0.0014        | 44.0  | 10076 | 1.3941          | 0.2286    | 0.2327 | 0.2306 | 0.8619   |
| 0.0014        | 45.0  | 10305 | 1.3842          | 0.2506    | 0.2352 | 0.2427 | 0.8628   |
| 0.0013        | 46.0  | 10534 | 1.3827          | 0.2443    | 0.2327 | 0.2384 | 0.8619   |
| 0.0013        | 47.0  | 10763 | 1.3730          | 0.2506    | 0.2376 | 0.2439 | 0.8632   |
| 0.0013        | 48.0  | 10992 | 1.3936          | 0.2586    | 0.2364 | 0.2470 | 0.8629   |
| 0.0011        | 49.0  | 11221 | 1.3941          | 0.2634    | 0.2388 | 0.2505 | 0.8627   |
| 0.0011        | 50.0  | 11450 | 1.3944          | 0.2579    | 0.2364 | 0.2467 | 0.8626   |

Framework versions

  • Transformers 4.48.1
  • PyTorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0
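
To reproduce this environment, the versions above can be pinned, e.g. in a requirements.txt:

```
transformers==4.48.1
torch==2.5.1
datasets==3.2.0
tokenizers==0.21.0
```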