aradia-ctc-distilhubert-ft

This model is a fine-tuned version of ntu-spml/distilhubert on the ABDUSAHMBZUAI/ARABIC_SPEECH_MASSIVE_SM - NA dataset. It achieves the following results on the evaluation set at the end of training (step 6900, epoch 30.0):

Loss: 2.7114
Wer: 0.8908
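
Wer above is the word error rate: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. A value of 0.8908 therefore means roughly 89 word edits per 100 reference words. The card does not say which implementation produced the number, so the function below is only a minimal sketch of the metric itself; the name `word_error_rate` is illustrative and is not taken from the training code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.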

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 30.0
mixed_precision_training: Native AMP
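
The hyperparameters above imply two pieces of arithmetic that can be sketched in plain Python: the effective batch size (32 x 2 = 64, which matches total_train_batch_size if a single device is assumed) and the shape of a linear schedule with 500 warmup steps. The total of 6900 optimizer steps is taken from the last row of the training results table; the decay-to-zero form is an assumption about how the linear scheduler behaves, and the function names are illustrative only.

```python
LEARNING_RATE = 3e-4      # learning_rate
TRAIN_BATCH_SIZE = 32     # train_batch_size
GRAD_ACCUM_STEPS = 2      # gradient_accumulation_steps
WARMUP_STEPS = 500        # lr_scheduler_warmup_steps
TOTAL_STEPS = 6900        # assumption: final step in the training results table


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (single device assumed by default)."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 250, 500, 3700, 6900):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 500 and is back to half its peak by step 3700.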

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
No log	0.43	100	4.4129	1.0
No log	0.87	200	3.5927	1.0
No log	1.3	300	3.3780	1.0
No log	1.74	400	3.0830	1.0
5.3551	2.17	500	2.6278	0.9999
5.3551	2.61	600	1.8359	1.0000
5.3551	3.04	700	1.7878	0.9914
5.3551	3.48	800	1.5219	0.9875
5.3551	3.91	900	1.4348	0.9879
1.7199	4.35	1000	1.4354	0.9644
1.7199	4.78	1100	1.5210	0.9519
1.7199	5.22	1200	1.3607	0.9475
1.7199	5.65	1300	1.3839	0.9343
1.7199	6.09	1400	1.2806	0.8944
1.2342	6.52	1500	1.3036	0.9011
1.2342	6.95	1600	1.3704	0.9072
1.2342	7.39	1700	1.2981	0.8891
1.2342	7.82	1800	1.2786	0.8733
1.2342	8.26	1900	1.2897	0.8867
0.9831	8.69	2000	1.4436	0.8780
0.9831	9.13	2100	1.3680	0.8873
0.9831	9.56	2200	1.3471	0.8692
0.9831	10.0	2300	1.3725	0.8729
0.9831	10.43	2400	1.4439	0.8771
0.8071	10.87	2500	1.5114	0.8928
0.8071	11.3	2600	1.6156	0.8958
0.8071	11.74	2700	1.4381	0.8749
0.8071	12.17	2800	1.5088	0.8717
0.8071	12.61	2900	1.5486	0.8813
0.6321	13.04	3000	1.4536	0.8884
0.6321	13.48	3100	1.4679	0.8947
0.6321	13.91	3200	1.5628	0.9117
0.6321	14.35	3300	1.5831	0.8716
0.6321	14.78	3400	1.6733	0.8702
0.4998	15.22	3500	1.8225	0.8665
0.4998	15.65	3600	1.8558	0.8732
0.4998	16.09	3700	1.7513	0.8766
0.4998	16.52	3800	1.8562	0.8753
0.4998	16.95	3900	1.9018	0.8704
0.4421	17.39	4000	1.9341	0.8789
0.4421	17.82	4100	1.9582	0.8781
0.4421	18.26	4200	1.8863	0.8821
0.4421	18.69	4300	1.9366	0.8847
0.4421	19.13	4400	2.1902	0.8721
0.3712	19.56	4500	2.1641	0.8670
0.3712	20.0	4600	2.1639	0.8776
0.3712	20.43	4700	2.2695	0.9030
0.3712	20.87	4800	2.1909	0.8937
0.3712	21.3	4900	2.1606	0.8959
0.3067	21.74	5000	2.1756	0.8943
0.3067	22.17	5100	2.4092	0.8773
0.3067	22.61	5200	2.4991	0.8721
0.3067	23.04	5300	2.3340	0.8910
0.3067	23.48	5400	2.3567	0.8946
0.2764	23.91	5500	2.3215	0.8897
0.2764	24.35	5600	2.4824	0.9002
0.2764	24.78	5700	2.4585	0.8963
0.2764	25.22	5800	2.5804	0.8879
0.2764	25.65	5900	2.5814	0.8903
0.2593	26.09	6000	2.5374	0.8868
0.2593	26.52	6100	2.5346	0.8922
0.2593	26.95	6200	2.5465	0.8873
0.2593	27.39	6300	2.6002	0.8919
0.2593	27.82	6400	2.6102	0.8928
0.227	28.26	6500	2.6925	0.8914
0.227	28.69	6600	2.6981	0.8913
0.227	29.13	6700	2.6872	0.8891
0.227	29.56	6800	2.7015	0.8897
0.227	30.0	6900	2.7114	0.8908
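
The log above shows validation loss bottoming out early and then climbing while training loss keeps falling, with Wer staying in a narrow band from roughly step 1800 onward. A small sketch of picking a checkpoint from such a log follows. It uses only a hand-copied subset of the rows above, and the `EvalRow` type and variable names are hypothetical, not part of the training script.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    epoch: float
    val_loss: float
    wer: float


# Subset of rows copied from the training results table (not the full log).
ROWS = [
    EvalRow(500, 2.17, 2.6278, 0.9999),
    EvalRow(1000, 4.35, 1.4354, 0.9644),
    EvalRow(1800, 7.82, 1.2786, 0.8733),
    EvalRow(2200, 9.56, 1.3471, 0.8692),
    EvalRow(3500, 15.22, 1.8225, 0.8665),
    EvalRow(4500, 19.56, 2.1641, 0.8670),
    EvalRow(5500, 23.91, 2.3215, 0.8897),
    EvalRow(6900, 30.0, 2.7114, 0.8908),
]

# Pick the evaluation step that minimises each metric.
best_by_wer = min(ROWS, key=lambda row: row.wer)
best_by_loss = min(ROWS, key=lambda row: row.val_loss)
final = ROWS[-1]

if __name__ == "__main__":
    print("lowest Wer:", best_by_wer)
    print("lowest validation loss:", best_by_loss)
    print("final checkpoint:", final)
```

On these rows the lowest Wer (0.8665) is at step 3500 and the lowest validation loss (1.2786) is at step 1800, both well before the final step 6900 that the headline results report. The card does not state whether intermediate checkpoints were kept; if they were, the Trainer options `load_best_model_at_end` and `metric_for_best_model` are one way to select on Wer rather than on the last step.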

Framework versions

Transformers 4.18.0.dev0
Pytorch 1.10.2+cu113
Datasets 1.18.4
Tokenizers 0.11.6
