---
license: apache-2.0
base_model: google/mt5-small
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: spell_corrector_small_v7
  results: []
---


# spell_corrector_small_v7

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small); the training dataset was not recorded by the Trainer.
After 20 epochs of training, it achieves the following results on the evaluation set:
- Loss: 0.5549
- Bleu: 34.7876
- Gen Len: 15.7815
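
A minimal inference sketch, assuming the checkpoint loads through the standard `transformers` seq2seq classes. The function name `correct_spelling`, the `model_path` argument, and the `max_new_tokens=32` default are illustrative assumptions and are not documented by this card; the card also does not say whether inputs need a task prefix.

```python
def correct_spelling(texts, model_path, max_new_tokens=32):
    """Run a list of strings through a fine-tuned seq2seq checkpoint.

    model_path is a placeholder: pass a local directory or Hub repo id
    that actually contains this checkpoint (not specified in the card).
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

    # Pad to the longest input so a batch of strings becomes one tensor.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

    # Assumption: 32 new tokens is a generous cap, since the reported
    # average generation length is about 16 tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Loading the model on every call is wasteful for repeated use; in practice the tokenizer and model would be created once and reused.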

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
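
The list above maps onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction under stated assumptions, not the original training script: `output_dir`, `evaluation_strategy`, and `predict_with_generate` are not listed in this card, and the per-device batch sizes assume training on a single device.

```python
# Keyword arguments mirroring the hyperparameter list in this card.
training_kwargs = {
    "output_dir": "spell_corrector_small_v7",  # assumption: hypothetical output path
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 8,  # assumption: single device, so per-device == total
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "evaluation_strategy": "epoch",  # assumption: the results table has one row per epoch
    "predict_with_generate": True,   # assumption: BLEU and Gen Len need generated text
}


def build_training_args(kwargs=training_kwargs):
    """Build the arguments object; transformers is imported only when called."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**kwargs)
```

The argument names follow Transformers 4.31.0, the version listed under Framework versions; later releases may rename some of them.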

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 14.2184       | 1.0   | 976   | 1.6501          | 16.2132 | 13.5571 |
| 2.8018        | 2.0   | 1952  | 1.2055          | 23.1195 | 15.9748 |
| 2.0238        | 3.0   | 2928  | 0.9646          | 26.7454 | 15.9865 |
| 1.6928        | 4.0   | 3904  | 0.8372          | 28.6482 | 15.9601 |
| 1.4888        | 5.0   | 4880  | 0.7906          | 29.6306 | 15.9221 |
| 1.3855        | 6.0   | 5856  | 0.7393          | 30.3841 | 15.9006 |
| 1.2999        | 7.0   | 6832  | 0.7029          | 31.2225 | 15.8612 |
| 1.2379        | 8.0   | 7808  | 0.6794          | 31.6015 | 15.8666 |
| 1.1709        | 9.0   | 8784  | 0.6572          | 32.2153 | 15.8512 |
| 1.1433        | 10.0  | 9760  | 0.6303          | 32.7529 | 15.8288 |
| 1.1248        | 11.0  | 10736 | 0.6184          | 33.1440 | 15.8244 |
| 1.0703        | 12.0  | 11712 | 0.6072          | 33.4743 | 15.8121 |
| 1.0547        | 13.0  | 12688 | 0.5937          | 33.7492 | 15.8139 |
| 1.0275        | 14.0  | 13664 | 0.5779          | 34.1454 | 15.7952 |
| 1.0122        | 15.0  | 14640 | 0.5727          | 34.2908 | 15.7907 |
| 1.0071        | 16.0  | 15616 | 0.5662          | 34.4457 | 15.7874 |
| 1.0017        | 17.0  | 16592 | 0.5609          | 34.6225 | 15.7847 |
| 0.9879        | 18.0  | 17568 | 0.5575          | 34.6937 | 15.7832 |
| 0.9814        | 19.0  | 18544 | 0.5554          | 34.7827 | 15.7816 |
| 0.9793        | 20.0  | 19520 | 0.5549          | 34.7876 | 15.7815 |
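
The step counts in the table allow two quick sanity checks, sketched below: the approximate training-set size implied by 976 steps per epoch, and the learning rate at a given step under a linear schedule. The sketch assumes a single device, no gradient accumulation, a final partial batch that is kept, and zero warmup steps; none of these is stated in this card, and the helper names are illustrative.

```python
STEPS_PER_EPOCH = 976   # from the table: step count at epoch 1.0
TRAIN_BATCH_SIZE = 8    # from the hyperparameter list
NUM_EPOCHS = 20
BASE_LR = 2e-05


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that give this many steps per epoch.

    Assumes the last batch of an epoch may be partial and is not dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
low, high = train_set_size_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)

print(f"total optimizer steps: {total_steps}")
print(f"implied training examples: between {low} and {high}")
print(f"learning rate at epoch 10: {linear_lr(9760, total_steps, BASE_LR):.2e}")
```

Under these assumptions the total of 19520 steps matches the last row of the table, and the learning rate at epoch 10 is half the initial value.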


### Framework versions

- Transformers 4.31.0
- PyTorch 2.0.1+cu118
- Datasets 2.14.2
- Tokenizers 0.13.3