bert-tiny-Massive-intent-KD-BERT

This model is a fine-tuned version of google/bert_uncased_L-2_H-128_A-2 on the MASSIVE intent-classification dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows these numbers):

  • Loss: 0.8380
  • Accuracy: 0.8534
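
As a quick sanity check, the checkpoint can be loaded for intent classification with the transformers pipeline API. This is a minimal sketch rather than an official snippet from the card; the model id below may need the full hub namespace, and the output label shown is illustrative.

```python
# Minimal usage sketch (not from the original card). Replace the model id
# with the full hub repo id if the checkpoint lives under a namespace.
from transformers import pipeline

# MASSIVE is an intent-classification task, so the text-classification
# pipeline applies; the head was fine-tuned on the dataset's intent labels.
classifier = pipeline(
    "text-classification",
    model="bert-tiny-Massive-intent-KD-BERT",  # assumed repo id
)

print(classifier("wake me up at seven tomorrow"))
# Expected output shape (the label name here is illustrative):
# [{'label': 'alarm_set', 'score': 0.97}]
```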

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
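
The card does not document which locale or splits were used, but given the MASSIVE reference above, the data would typically be loaded from the hub along these lines (the AmazonScience/massive repo id and the en-US locale are assumptions):

```python
# Hedged sketch of loading MASSIVE for intent classification. The hub id
# "AmazonScience/massive" and the "en-US" locale are assumptions; the card
# does not say which locale or splits were used for training.
from datasets import load_dataset

massive = load_dataset("AmazonScience/massive", "en-US")
example = massive["train"][0]
print(example["utt"], example["intent"])  # utterance text and intent label id
```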

Training procedure
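
The KD-BERT suffix in the model name suggests the student was trained by knowledge distillation from a larger BERT teacher. The card does not document the distillation objective, but a generic soft-label distillation loss has the following shape (the temperature and mixing weight below are illustrative, not the author's values):

```python
# Generic knowledge-distillation loss sketch. Nothing here is confirmed by
# the card: temperature, alpha, and the overall formulation are assumptions
# about what the "KD" in the model name typically means.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft part: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across T.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard part: ordinary cross-entropy against the gold intent labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```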

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto TrainingArguments follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 33
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
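
For reference, here is how these settings map onto transformers.TrainingArguments. This is a hedged sketch, not the author's training script; the output_dir is a placeholder, and the listed Adam betas and epsilon are the library defaults.

```python
# Hedged mapping of the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-tiny-Massive-intent-KD-BERT",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=33,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,  # "Native AMP" mixed-precision training
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults,
    # so no explicit adam_beta1/adam_beta2/adam_epsilon overrides are needed.
)
```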

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| 5.83          | 1.0   | 720   | 4.8826          | 0.3050   |
| 4.7602        | 2.0   | 1440  | 3.9904          | 0.4191   |
| 4.0301        | 3.0   | 2160  | 3.3806          | 0.5032   |
| 3.4797        | 4.0   | 2880  | 2.9065          | 0.5967   |
| 3.0352        | 5.0   | 3600  | 2.5389          | 0.6596   |
| 2.6787        | 6.0   | 4320  | 2.2342          | 0.7044   |
| 2.3644        | 7.0   | 5040  | 1.9873          | 0.7354   |
| 2.1145        | 8.0   | 5760  | 1.7928          | 0.7462   |
| 1.896         | 9.0   | 6480  | 1.6293          | 0.7644   |
| 1.7138        | 10.0  | 7200  | 1.5062          | 0.7752   |
| 1.5625        | 11.0  | 7920  | 1.3923          | 0.7885   |
| 1.4229        | 12.0  | 8640  | 1.3092          | 0.7978   |
| 1.308         | 13.0  | 9360  | 1.2364          | 0.8018   |
| 1.201         | 14.0  | 10080 | 1.1759          | 0.8155   |
| 1.1187        | 15.0  | 10800 | 1.1322          | 0.8214   |
| 1.0384        | 16.0  | 11520 | 1.0990          | 0.8234   |
| 0.976         | 17.0  | 12240 | 1.0615          | 0.8308   |
| 0.9163        | 18.0  | 12960 | 1.0377          | 0.8328   |
| 0.8611        | 19.0  | 13680 | 1.0054          | 0.8337   |
| 0.812         | 20.0  | 14400 | 0.9926          | 0.8367   |
| 0.7721        | 21.0  | 15120 | 0.9712          | 0.8382   |
| 0.7393        | 22.0  | 15840 | 0.9586          | 0.8357   |
| 0.7059        | 23.0  | 16560 | 0.9428          | 0.8372   |
| 0.6741        | 24.0  | 17280 | 0.9377          | 0.8396   |
| 0.6552        | 25.0  | 18000 | 0.9229          | 0.8377   |
| 0.627         | 26.0  | 18720 | 0.9100          | 0.8416   |
| 0.5972        | 27.0  | 19440 | 0.9028          | 0.8416   |
| 0.5784        | 28.0  | 20160 | 0.8996          | 0.8406   |
| 0.5595        | 29.0  | 20880 | 0.8833          | 0.8451   |
| 0.5438        | 30.0  | 21600 | 0.8772          | 0.8475   |
| 0.5218        | 31.0  | 22320 | 0.8758          | 0.8451   |
| 0.509         | 32.0  | 23040 | 0.8728          | 0.8480   |
| 0.4893        | 33.0  | 23760 | 0.8640          | 0.8480   |
| 0.4948        | 34.0  | 24480 | 0.8541          | 0.8475   |
| 0.4722        | 35.0  | 25200 | 0.8595          | 0.8495   |
| 0.468         | 36.0  | 25920 | 0.8488          | 0.8495   |
| 0.4517        | 37.0  | 26640 | 0.8460          | 0.8505   |
| 0.4462        | 38.0  | 27360 | 0.8450          | 0.8485   |
| 0.4396        | 39.0  | 28080 | 0.8422          | 0.8490   |
| 0.427         | 40.0  | 28800 | 0.8380          | 0.8534   |
| 0.4287        | 41.0  | 29520 | 0.8385          | 0.8480   |
| 0.4222        | 42.0  | 30240 | 0.8319          | 0.8510   |
| 0.421         | 43.0  | 30960 | 0.8296          | 0.8510   |

Framework versions

  • Transformers 4.22.1
  • Pytorch 1.12.1+cu113
  • Datasets 2.5.1
  • Tokenizers 0.12.1