---
license: mit
base_model: VietAI/vit5-base
tags:
  - generated_from_trainer
model-index:
  - name: T5_fine_tune
    results: []
---

# T5_fine_tune

This model is a fine-tuned version of [VietAI/vit5-base](https://huggingface.co/VietAI/vit5-base) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0000
- Score (BLEU): 42.9309
- Counts (1–4-gram matches): [2102, 1955, 1808, 1661]
- Totals (1–4-gram candidates): [2107, 1960, 1813, 1666]
- Precisions: [99.7627, 99.7449, 99.7242, 99.6999]
- Brevity penalty (Bp): 0.4305
- System length (Sys Len): 2107
- Reference length (Ref Len): 3883
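The fields above look like raw sacreBLEU corpus output: Score is corpus BLEU, which is the brevity penalty times the geometric mean of the four modified n-gram precisions. Assuming that interpretation (it is not stated in the card), a small sketch reproduces the reported Bp and Score from the Counts, Totals, Sys Len, and Ref Len values:

```python
import math

# Evaluation-set values reported above (assumed to be sacreBLEU-style fields)
counts = [2102, 1955, 1808, 1661]   # matched 1- to 4-gram counts
totals = [2107, 1960, 1813, 1666]   # candidate 1- to 4-gram counts
sys_len, ref_len = 2107, 3883

# Modified n-gram precisions, in percent
precisions = [100 * c / t for c, t in zip(counts, totals)]

# Brevity penalty: exp(1 - ref/sys) when the output is shorter than the reference
bp = math.exp(1 - ref_len / sys_len) if sys_len < ref_len else 1.0

# BLEU = brevity penalty * geometric mean of the four precisions
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(bp, bleu)  # ~0.4305 and ~42.93, matching the reported Bp and Score
```

Note that the brevity penalty of roughly 0.43 (2107 output tokens against 3883 reference tokens) is what holds the score near 43 even though the n-gram precisions are close to 100%.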

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
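The training script itself is not included in this card, but as a sketch, the hyperparameters above map onto the standard Hugging Face Transformers `Seq2SeqTrainingArguments` as follows (the `output_dir` name is a placeholder):

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed in this card; not the author's actual script.
args = Seq2SeqTrainingArguments(
    output_dir="T5_fine_tune",       # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```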

### Training results

| Training Loss | Epoch | Step | Validation Loss | Score | Counts | Totals | Precisions | Bp | Sys Len | Ref Len |
|:-------------:|:-----:|:----:|:---------------:|:-----:|:------:|:------:|:----------:|:--:|:-------:|:-------:|
| No log | 1.0 | 71 | 0.3075 | 34.0719 | [1898, 1601, 1364, 1147] | [2129, 1982, 1835, 1688] | [89.14983560356976, 80.77699293642785, 74.33242506811989, 67.95023696682465] | 0.4387 | 2129 | 3883 |
| No log | 2.0 | 142 | 0.1994 | 36.2148 | [1925, 1676, 1483, 1299] | [2111, 1964, 1817, 1670] | [91.18900994789199, 85.33604887983707, 81.61805173362686, 77.78443113772455] | 0.4320 | 2111 | 3883 |
| No log | 3.0 | 213 | 0.1269 | 39.1074 | [2013, 1802, 1615, 1433] | [2116, 1969, 1822, 1675] | [95.13232514177693, 91.5185373285932, 88.63885839736554, 85.55223880597015] | 0.4338 | 2116 | 3883 |
| No log | 4.0 | 284 | 0.0988 | 41.3658 | [2074, 1879, 1696, 1517] | [2152, 2005, 1858, 1711] | [96.37546468401487, 93.71571072319202, 91.28094725511302, 88.66160140268849] | 0.4474 | 2152 | 3883 |
| No log | 5.0 | 355 | 0.0777 | 41.4395 | [2069, 1893, 1724, 1557] | [2121, 1974, 1827, 1680] | [97.54832626119754, 95.8966565349544, 94.36234263820471, 92.67857142857143] | 0.4357 | 2121 | 3883 |
| No log | 6.0 | 426 | 0.0626 | 41.6053 | [2074, 1902, 1742, 1582] | [2108, 1961, 1814, 1667] | [98.38709677419355, 96.99133095359511, 96.03087100330761, 94.90101979604079] | 0.4308 | 2108 | 3883 |
| No log | 7.0 | 497 | 0.0534 | 42.1461 | [2087, 1925, 1763, 1601] | [2115, 1968, 1821, 1674] | [98.67612293144208, 97.8150406504065, 96.81493684788578, 95.63918757467144] | 0.4335 | 2115 | 3883 |
| 0.3652 | 8.0 | 568 | 0.0380 | 42.5312 | [2096, 1936, 1781, 1625] | [2116, 1969, 1822, 1675] | [99.05482041587902, 98.32402234636872, 97.74972557628979, 97.01492537313433] | 0.4338 | 2116 | 3883 |
| 0.3652 | 9.0 | 639 | 0.0361 | 42.3028 | [2088, 1929, 1773, 1616] | [2113, 1966, 1819, 1672] | [98.81684808329389, 98.11800610376399, 97.47113798790544, 96.65071770334929] | 0.4327 | 2113 | 3883 |
| 0.3652 | 10.0 | 710 | 0.0299 | 42.5382 | [2097, 1942, 1789, 1635] | [2106, 1959, 1812, 1665] | [99.57264957264957, 99.13221031138336, 98.73068432671081, 98.1981981981982] | 0.4301 | 2106 | 3883 |
| 0.3652 | 11.0 | 781 | 0.0319 | 42.5551 | [2097, 1944, 1790, 1635] | [2106, 1959, 1812, 1665] | [99.57264957264957, 99.23430321592649, 98.78587196467991, 98.1981981981982] | 0.4301 | 2106 | 3883 |
| 0.3652 | 12.0 | 852 | 0.0193 | 42.6258 | [2095, 1942, 1790, 1638] | [2111, 1964, 1817, 1670] | [99.24206537186167, 98.87983706720978, 98.51403412217941, 98.08383233532935] | 0.4320 | 2111 | 3883 |
| 0.3652 | 13.0 | 923 | 0.0178 | 42.7370 | [2099, 1949, 1799, 1649] | [2106, 1959, 1812, 1665] | [99.667616334283, 99.48953547728433, 99.28256070640177, 99.03903903903904] | 0.4301 | 2106 | 3883 |
| 0.3652 | 14.0 | 994 | 0.0156 | 42.5704 | [2094, 1940, 1788, 1636] | [2110, 1963, 1816, 1669] | [99.24170616113744, 98.82832399388691, 98.45814977973568, 98.02276812462553] | 0.4316 | 2110 | 3883 |
| 0.0716 | 15.0 | 1065 | 0.0120 | 42.7573 | [2099, 1949, 1797, 1645] | [2110, 1963, 1816, 1669] | [99.47867298578198, 99.28680590932247, 98.95374449339207, 98.56201318154584] | 0.4316 | 2110 | 3883 |
| 0.0716 | 16.0 | 1136 | 0.0094 | 42.8327 | [2100, 1950, 1800, 1650] | [2111, 1964, 1817, 1670] | [99.4789199431549, 99.28716904276986, 99.06439185470556, 98.80239520958084] | 0.4320 | 2111 | 3883 |
| 0.0716 | 17.0 | 1207 | 0.0064 | 42.7937 | [2097, 1949, 1801, 1653] | [2108, 1961, 1814, 1667] | [99.47817836812145, 99.38806731259561, 99.28335170893054, 99.16016796640672] | 0.4308 | 2108 | 3883 |
| 0.0716 | 18.0 | 1278 | 0.0077 | 42.9044 | [2102, 1954, 1805, 1656] | [2109, 1962, 1815, 1668] | [99.66808914177335, 99.59225280326197, 99.44903581267218, 99.28057553956835] | 0.4312 | 2109 | 3883 |
| 0.0716 | 19.0 | 1349 | 0.0073 | 42.9494 | [2104, 1957, 1807, 1657] | [2109, 1962, 1815, 1668] | [99.76292081555239, 99.74515800203874, 99.55922865013774, 99.34052757793765] | 0.4312 | 2109 | 3883 |
| 0.0716 | 20.0 | 1420 | 0.0071 | 42.9063 | [2100, 1952, 1804, 1656] | [2111, 1964, 1817, 1670] | [99.4789199431549, 99.38900203665987, 99.28453494771601, 99.16167664670658] | 0.4320 | 2111 | 3883 |
| 0.0716 | 21.0 | 1491 | 0.0031 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.032 | 22.0 | 1562 | 0.0076 | 42.8573 | [2102, 1953, 1804, 1655] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.64285714285714, 99.50358521787093, 99.33973589435774] | 0.4305 | 2107 | 3883 |
| 0.032 | 23.0 | 1633 | 0.0033 | 42.8941 | [2102, 1954, 1806, 1658] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.6938775510204, 99.61389961389962, 99.51980792316927] | 0.4305 | 2107 | 3883 |
| 0.032 | 24.0 | 1704 | 0.0016 | 42.8573 | [2102, 1953, 1804, 1655] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.64285714285714, 99.50358521787093, 99.33973589435774] | 0.4305 | 2107 | 3883 |
| 0.032 | 25.0 | 1775 | 0.0029 | 42.9087 | [2102, 1954, 1806, 1658] | [2108, 1961, 1814, 1667] | [99.71537001897534, 99.64303926568077, 99.55898566703418, 99.46010797840432] | 0.4308 | 2108 | 3883 |
| 0.032 | 26.0 | 1846 | 0.0030 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.032 | 27.0 | 1917 | 0.0021 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.032 | 28.0 | 1988 | 0.0014 | 42.9496 | [2103, 1956, 1808, 1660] | [2108, 1961, 1814, 1667] | [99.76280834914611, 99.74502804691484, 99.66923925027564, 99.58008398320337] | 0.4308 | 2108 | 3883 |
| 0.0174 | 29.0 | 2059 | 0.0009 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0174 | 30.0 | 2130 | 0.0009 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0174 | 31.0 | 2201 | 0.0007 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0174 | 32.0 | 2272 | 0.0011 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0174 | 33.0 | 2343 | 0.0024 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0174 | 34.0 | 2414 | 0.0022 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0174 | 35.0 | 2485 | 0.0006 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0109 | 36.0 | 2556 | 0.0003 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0109 | 37.0 | 2627 | 0.0001 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0109 | 38.0 | 2698 | 0.0005 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0109 | 39.0 | 2769 | 0.0003 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0109 | 40.0 | 2840 | 0.0012 | 42.9496 | [2103, 1956, 1808, 1660] | [2108, 1961, 1814, 1667] | [99.76280834914611, 99.74502804691484, 99.66923925027564, 99.58008398320337] | 0.4308 | 2108 | 3883 |
| 0.0109 | 41.0 | 2911 | 0.0002 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0109 | 42.0 | 2982 | 0.0003 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 43.0 | 3053 | 0.0001 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 44.0 | 3124 | 0.0000 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 45.0 | 3195 | 0.0001 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 46.0 | 3266 | 0.0000 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 47.0 | 3337 | 0.0000 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 48.0 | 3408 | 0.0000 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0073 | 49.0 | 3479 | 0.0000 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |
| 0.0047 | 50.0 | 3550 | 0.0000 | 42.9309 | [2102, 1955, 1808, 1661] | [2107, 1960, 1813, 1666] | [99.76269577598481, 99.74489795918367, 99.72421400992829, 99.69987995198079] | 0.4305 | 2107 | 3883 |

### Framework versions

- Transformers 4.36.2
- PyTorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0