---
library_name: transformers
license: llama3
base_model: meta-llama/Meta-Llama-3-8B
tags:
- generated_from_trainer
model-index:
- name: tfa_output_2025_m02_d07_t07h_43m_33s
  results: []
---

# tfa_output_2025_m02_d07_t07h_43m_33s

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.4660

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Use OptimizerNames.PAGED_ADAMW with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log | 0 | 0 | 1.4694 |
| 4.604 | 0.0030 | 1 | 1.4694 |
| 4.6485 | 0.0060 | 2 | 1.4693 |
| 4.6515 | 0.0091 | 3 | 1.4693 |
| 4.9659 | 0.0121 | 4 | 1.4692 |
| 4.5235 | 0.0151 | 5 | 1.4692 |
| 4.7107 | 0.0181 | 6 | 1.4690 |
| 4.3335 | 0.0211 | 7 | 1.4686 |
| 4.8938 | 0.0242 | 8 | 1.4682 |
| 4.8609 | 0.0272 | 9 | 1.4677 |
| 4.5648 | 0.0302 | 10 | 1.4667 |
| 4.5394 | 0.0332 | 11 | 1.4655 |
| 4.7629 | 0.0363 | 12 | 1.4644 |
| 4.5808 | 0.0393 | 13 | 1.4632 |
| 4.5545 | 0.0423 | 14 | 1.4619 |
| 4.4343 | 0.0453 | 15 | 1.4605 |
| 4.5662 | 0.0483 | 16 | 1.4591 |
| 4.398 | 0.0514 | 17 | 1.4575 |
| 4.3894 | 0.0544 | 18 | 1.4550 |
| 4.61 | 0.0574 | 19 | 1.4524 |
| 4.4373 | 0.0604 | 20 | 1.4501 |
| 4.2311 | 0.0634 | 21 | 1.4478 |
| 4.3044 | 0.0665 | 22 | 1.4454 |
| 4.2496 | 0.0695 | 23 | 1.4431 |
| 4.3269 | 0.0725 | 24 | 1.4409 |
| 4.2602 | 0.0755 | 25 | 1.4385 |
| 4.4063 | 0.0785 | 26 | 1.4362 |
| 4.2922 | 0.0816 | 27 | 1.4341 |
| 3.8772 | 0.0846 | 28 | 1.4320 |
| 4.2066 | 0.0876 | 29 | 1.4298 |
| 4.1774 | 0.0906 | 30 | 1.4280 |
| 4.1788 | 0.0937 | 31 | 1.4259 |
| 4.2413 | 0.0967 | 32 | 1.4240 |
| 4.2196 | 0.0997 | 33 | 1.4220 |
| 4.4204 | 0.1027 | 34 | 1.4203 |
| 4.1017 | 0.1057 | 35 | 1.4185 |
| 4.25 | 0.1088 | 36 | 1.4168 |
| 4.1705 | 0.1118 | 37 | 1.4152 |
| 4.0192 | 0.1148 | 38 | 1.4136 |
| 4.1317 | 0.1178 | 39 | 1.4123 |
| 4.243 | 0.1208 | 40 | 1.4107 |
| 3.9988 | 0.1239 | 41 | 1.4094 |
| 3.9451 | 0.1269 | 42 | 1.4081 |
| 4.0244 | 0.1299 | 43 | 1.4068 |
| 4.0956 | 0.1329 | 44 | 1.4055 |
| 3.9313 | 0.1360 | 45 | 1.4043 |
| 3.9223 | 0.1390 | 46 | 1.4033 |
| 3.7763 | 0.1420 | 47 | 1.4021 |
| 3.8153 | 0.1450 | 48 | 1.4011 |
| 4.1018 | 0.1480 | 49 | 1.4003 |
| 4.2674 | 0.1511 | 50 | 1.3994 |
| 4.0947 | 0.1541 | 51 | 1.3984 |
| 3.8134 | 0.1571 | 52 | 1.3976 |
| 3.7836 | 0.1601 | 53 | 1.3969 |
| 3.9645 | 0.1631 | 54 | 1.3961 |
| 3.5071 | 0.1662 | 55 | 1.3955 |
| 3.9643 | 0.1692 | 56 | 1.3950 |
| 3.7689 | 0.1722 | 57 | 1.3944 |
| 3.9377 | 0.1752 | 58 | 1.3937 |
| 3.9542 | 0.1782 | 59 | 1.3932 |
| 3.7511 | 0.1813 | 60 | 1.3926 |
| 3.7859 | 0.1843 | 61 | 1.3922 |
| 3.7963 | 0.1873 | 62 | 1.3916 |
| 3.7728 | 0.1903 | 63 | 1.3913 |
| 3.7714 | 0.1934 | 64 | 1.3910 |
| 3.9066 | 0.1964 | 65 | 1.3906 |
| 3.9718 | 0.1994 | 66 | 1.3903 |
| 3.7207 | 0.2024 | 67 | 1.3901 |
| 3.7618 | 0.2054 | 68 | 1.3896 |
| 3.5737 | 0.2085 | 69 | 1.3895 |
| 3.6742 | 0.2115 | 70 | 1.3892 |
| 3.6627 | 0.2145 | 71 | 1.3892 |
| 3.7246 | 0.2175 | 72 | 1.3889 |
| 3.5988 | 0.2205 | 73 | 1.3887 |
| 3.5042 | 0.2236 | 74 | 1.3885 |
| 3.6754 | 0.2266 | 75 | 1.3884 |
| 3.8237 | 0.2296 | 76 | 1.3883 |
| 3.6541 | 0.2326 | 77 | 1.3882 |
| 3.8442 | 0.2356 | 78 | 1.3881 |
| 3.8189 | 0.2387 | 79 | 1.3881 |
| 3.4796 | 0.2417 | 80 | 1.3879 |
| 3.7061 | 0.2447 | 81 | 1.3880 |
| 3.7453 | 0.2477 | 82 | 1.3880 |
| 3.4375 | 0.2508 | 83 | 1.3880 |
| 3.6748 | 0.2538 | 84 | 1.3881 |
| 3.6132 | 0.2568 | 85 | 1.3880 |
| 3.6022 | 0.2598 | 86 | 1.3879 |
| 3.9084 | 0.2628 | 87 | 1.3879 |
| 3.5629 | 0.2659 | 88 | 1.3882 |
| 3.6004 | 0.2689 | 89 | 1.3883 |
| 3.8498 | 0.2719 | 90 | 1.3883 |
| 3.5523 | 0.2749 | 91 | 1.3884 |
| 3.7526 | 0.2779 | 92 | 1.3886 |
| 3.7638 | 0.2810 | 93 | 1.3887 |
| 3.6319 | 0.2840 | 94 | 1.3888 |
| 3.551 | 0.2870 | 95 | 1.3888 |
| 3.8053 | 0.2900 | 96 | 1.3890 |
| 3.6299 | 0.2931 | 97 | 1.3891 |
| 3.8778 | 0.2961 | 98 | 1.3891 |
| 3.4661 | 0.2991 | 99 | 1.3893 |
| 3.6199 | 0.3021 | 100 | 1.3894 |
| 3.7169 | 0.3051 | 101 | 1.3897 |
| 3.6181 | 0.3082 | 102 | 1.3898 |
| 3.712 | 0.3112 | 103 | 1.3900 |
| 3.426 | 0.3142 | 104 | 1.3903 |
| 3.2462 | 0.3172 | 105 | 1.3905 |
| 3.4656 | 0.3202 | 106 | 1.3907 |
| 3.5511 | 0.3233 | 107 | 1.3910 |
| 3.5268 | 0.3263 | 108 | 1.3913 |
| 3.4383 | 0.3293 | 109 | 1.3915 |
| 3.5351 | 0.3323 | 110 | 1.3919 |
| 3.4221 | 0.3353 | 111 | 1.3922 |
| 3.064 | 0.3384 | 112 | 1.3926 |
| 3.4006 | 0.3414 | 113 | 1.3928 |
| 3.5908 | 0.3444 | 114 | 1.3931 |
| 3.5341 | 0.3474 | 115 | 1.3935 |
| 3.4771 | 0.3505 | 116 | 1.3938 |
| 3.4362 | 0.3535 | 117 | 1.3940 |
| 3.5801 | 0.3565 | 118 | 1.3941 |
| 3.5304 | 0.3595 | 119 | 1.3943 |
| 3.6278 | 0.3625 | 120 | 1.3945 |
| 3.5677 | 0.3656 | 121 | 1.3948 |
| 3.5208 | 0.3686 | 122 | 1.3950 |
| 3.5702 | 0.3716 | 123 | 1.3952 |
| 3.4074 | 0.3746 | 124 | 1.3954 |
| 3.1593 | 0.3776 | 125 | 1.3956 |
| 3.4503 | 0.3807 | 126 | 1.3958 |
| 3.6251 | 0.3837 | 127 | 1.3961 |
| 3.4879 | 0.3867 | 128 | 1.3965 |
| 3.3838 | 0.3897 | 129 | 1.3968 |
| 3.411 | 0.3927 | 130 | 1.3972 |
| 3.3504 | 0.3958 | 131 | 1.3975 |
| 3.2542 | 0.3988 | 132 | 1.3980 |
| 3.5616 | 0.4018 | 133 | 1.3984 |
| 3.3994 | 0.4048 | 134 | 1.3990 |
| 3.3805 | 0.4079 | 135 | 1.3994 |
| 3.4415 | 0.4109 | 136 | 1.3997 |
| 3.6288 | 0.4139 | 137 | 1.4001 |
| 3.2875 | 0.4169 | 138 | 1.4004 |
| 3.2699 | 0.4199 | 139 | 1.4007 |
| 3.6125 | 0.4230 | 140 | 1.4011 |
| 3.4689 | 0.4260 | 141 | 1.4013 |
| 3.4061 | 0.4290 | 142 | 1.4015 |
| 3.3196 | 0.4320 | 143 | 1.4019 |
| 3.4037 | 0.4350 | 144 | 1.4020 |
| 3.4214 | 0.4381 | 145 | 1.4024 |
| 3.3448 | 0.4411 | 146 | 1.4025 |
| 3.4775 | 0.4441 | 147 | 1.4028 |
| 3.4709 | 0.4471 | 148 | 1.4032 |
| 3.4283 | 0.4502 | 149 | 1.4033 |
| 3.4671 | 0.4532 | 150 | 1.4035 |
| 3.2426 | 0.4562 | 151 | 1.4038 |
| 3.4191 | 0.4592 | 152 | 1.4042 |
| 3.4264 | 0.4622 | 153 | 1.4045 |
| 3.2461 | 0.4653 | 154 | 1.4050 |
| 3.3855 | 0.4683 | 155 | 1.4053 |
| 3.313 | 0.4713 | 156 | 1.4056 |
| 3.3058 | 0.4743 | 157 | 1.4059 |
| 3.5508 | 0.4773 | 158 | 1.4062 |
| 3.2109 | 0.4804 | 159 | 1.4066 |
| 3.5045 | 0.4834 | 160 | 1.4068 |
| 3.4068 | 0.4864 | 161 | 1.4071 |
| 3.3438 | 0.4894 | 162 | 1.4075 |
| 3.3953 | 0.4924 | 163 | 1.4078 |
| 3.2312 | 0.4955 | 164 | 1.4079 |
| 3.2971 | 0.4985 | 165 | 1.4084 |
| 3.3118 | 0.5015 | 166 | 1.4086 |
| 3.4395 | 0.5045 | 167 | 1.4088 |
| 3.7162 | 0.5076 | 168 | 1.4089 |
| 3.3864 | 0.5106 | 169 | 1.4091 |
| 3.0887 | 0.5136 | 170 | 1.4092 |
| 2.9898 | 0.5166 | 171 | 1.4095 |
| 3.4697 | 0.5196 | 172 | 1.4099 |
| 3.2762 | 0.5227 | 173 | 1.4102 |
| 3.1383 | 0.5257 | 174 | 1.4106 |
| 3.2522 | 0.5287 | 175 | 1.4112 |
| 3.309 | 0.5317 | 176 | 1.4117 |
| 3.4431 | 0.5347 | 177 | 1.4121 |
| 3.1366 | 0.5378 | 178 | 1.4126 |
| 3.3094 | 0.5408 | 179 | 1.4131 |
| 3.4466 | 0.5438 | 180 | 1.4136 |
| 3.3411 | 0.5468 | 181 | 1.4140 |
| 3.013 | 0.5498 | 182 | 1.4143 |
| 3.4785 | 0.5529 | 183 | 1.4147 |
| 3.0358 | 0.5559 | 184 | 1.4152 |
| 3.2833 | 0.5589 | 185 | 1.4158 |
| 3.2953 | 0.5619 | 186 | 1.4162 |
| 3.3485 | 0.5650 | 187 | 1.4167 |
| 3.4911 | 0.5680 | 188 | 1.4172 |
| 3.3863 | 0.5710 | 189 | 1.4176 |
| 3.1944 | 0.5740 | 190 | 1.4180 |
| 3.2994 | 0.5770 | 191 | 1.4185 |
| 3.4385 | 0.5801 | 192 | 1.4187 |
| 3.3346 | 0.5831 | 193 | 1.4191 |
| 3.3318 | 0.5861 | 194 | 1.4194 |
| 3.4027 | 0.5891 | 195 | 1.4198 |
| 3.2532 | 0.5921 | 196 | 1.4201 |
| 3.2351 | 0.5952 | 197 | 1.4205 |
| 3.4589 | 0.5982 | 198 | 1.4203 |
| 3.4375 | 0.6012 | 199 | 1.4204 |
| 3.0901 | 0.6042 | 200 | 1.4207 |
| 3.3186 | 0.6073 | 201 | 1.4210 |
| 3.2891 | 0.6103 | 202 | 1.4212 |
| 3.2752 | 0.6133 | 203 | 1.4217 |
| 3.3808 | 0.6163 | 204 | 1.4221 |
| 3.376 | 0.6193 | 205 | 1.4223 |
| 3.4086 | 0.6224 | 206 | 1.4226 |
| 3.3506 | 0.6254 | 207 | 1.4228 |
| 3.4508 | 0.6284 | 208 | 1.4232 |
| 3.4237 | 0.6314 | 209 | 1.4235 |
| 3.2154 | 0.6344 | 210 | 1.4240 |
| 3.2379 | 0.6375 | 211 | 1.4244 |
| 2.8335 | 0.6405 | 212 | 1.4251 |
| 3.1927 | 0.6435 | 213 | 1.4251 |
| 3.1871 | 0.6465 | 214 | 1.4256 |
| 3.1004 | 0.6495 | 215 | 1.4259 |
| 3.2405 | 0.6526 | 216 | 1.4264 |
| 3.1544 | 0.6556 | 217 | 1.4269 |
| 3.1204 | 0.6586 | 218 | 1.4274 |
| 3.3257 | 0.6616 | 219 | 1.4280 |
| 3.2689 | 0.6647 | 220 | 1.4286 |
| 3.0117 | 0.6677 | 221 | 1.4289 |
| 3.4276 | 0.6707 | 222 | 1.4295 |
| 3.2358 | 0.6737 | 223 | 1.4301 |
| 3.1374 | 0.6767 | 224 | 1.4307 |
| 3.2972 | 0.6798 | 225 | 1.4312 |
| 3.2838 | 0.6828 | 226 | 1.4318 |
| 3.2839 | 0.6858 | 227 | 1.4322 |
| 3.2228 | 0.6888 | 228 | 1.4328 |
| 3.2605 | 0.6918 | 229 | 1.4334 |
| 3.2945 | 0.6949 | 230 | 1.4340 |
| 3.3155 | 0.6979 | 231 | 1.4344 |
| 3.1988 | 0.7009 | 232 | 1.4349 |
| 3.2921 | 0.7039 | 233 | 1.4355 |
| 2.8752 | 0.7069 | 234 | 1.4359 |
| 3.0065 | 0.7100 | 235 | 1.4361 |
| 3.1689 | 0.7130 | 236 | 1.4366 |
| 3.1959 | 0.7160 | 237 | 1.4370 |
| 3.3473 | 0.7190 | 238 | 1.4373 |
| 3.2927 | 0.7221 | 239 | 1.4377 |
| 2.9934 | 0.7251 | 240 | 1.4379 |
| 3.2058 | 0.7281 | 241 | 1.4384 |
| 3.1388 | 0.7311 | 242 | 1.4388 |
| 3.2384 | 0.7341 | 243 | 1.4388 |
| 3.2028 | 0.7372 | 244 | 1.4392 |
| 3.3737 | 0.7402 | 245 | 1.4392 |
| 3.166 | 0.7432 | 246 | 1.4397 |
| 3.0255 | 0.7462 | 247 | 1.4397 |
| 3.0979 | 0.7492 | 248 | 1.4401 |
| 3.2436 | 0.7523 | 249 | 1.4404 |
| 3.1785 | 0.7553 | 250 | 1.4408 |
| 3.2052 | 0.7583 | 251 | 1.4411 |
| 3.1967 | 0.7613 | 252 | 1.4413 |
| 2.9086 | 0.7644 | 253 | 1.4416 |
| 3.355 | 0.7674 | 254 | 1.4421 |
| 3.4027 | 0.7704 | 255 | 1.4422 |
| 2.9307 | 0.7734 | 256 | 1.4428 |
| 3.1738 | 0.7764 | 257 | 1.4429 |
| 3.3088 | 0.7795 | 258 | 1.4431 |
| 3.4942 | 0.7825 | 259 | 1.4432 |
| 2.831 | 0.7855 | 260 | 1.4437 |
| 3.1675 | 0.7885 | 261 | 1.4444 |
| 3.3274 | 0.7915 | 262 | 1.4447 |
| 3.0326 | 0.7946 | 263 | 1.4448 |
| 3.3138 | 0.7976 | 264 | 1.4454 |
| 3.2153 | 0.8006 | 265 | 1.4455 |
| 3.3983 | 0.8036 | 266 | 1.4458 |
| 3.179 | 0.8066 | 267 | 1.4461 |
| 3.2621 | 0.8097 | 268 | 1.4464 |
| 3.0191 | 0.8127 | 269 | 1.4468 |
| 3.058 | 0.8157 | 270 | 1.4472 |
| 3.3188 | 0.8187 | 271 | 1.4478 |
| 2.9837 | 0.8218 | 272 | 1.4481 |
| 3.2624 | 0.8248 | 273 | 1.4486 |
| 3.2701 | 0.8278 | 274 | 1.4492 |
| 3.1579 | 0.8308 | 275 | 1.4497 |
| 3.3164 | 0.8338 | 276 | 1.4501 |
| 2.9827 | 0.8369 | 277 | 1.4507 |
| 3.1842 | 0.8399 | 278 | 1.4512 |
| 3.2366 | 0.8429 | 279 | 1.4519 |
| 3.0562 | 0.8459 | 280 | 1.4520 |
| 2.9503 | 0.8489 | 281 | 1.4526 |
| 3.0441 | 0.8520 | 282 | 1.4530 |
| 3.4535 | 0.8550 | 283 | 1.4534 |
| 3.2656 | 0.8580 | 284 | 1.4537 |
| 3.3452 | 0.8610 | 285 | 1.4541 |
| 3.0958 | 0.8640 | 286 | 1.4548 |
| 3.1579 | 0.8671 | 287 | 1.4553 |
| 3.1473 | 0.8701 | 288 | 1.4556 |
| 3.2825 | 0.8731 | 289 | 1.4559 |
| 2.8554 | 0.8761 | 290 | 1.4563 |
| 3.2792 | 0.8792 | 291 | 1.4566 |
| 3.0977 | 0.8822 | 292 | 1.4567 |
| 3.0414 | 0.8852 | 293 | 1.4568 |
| 3.2151 | 0.8882 | 294 | 1.4569 |
| 3.1287 | 0.8912 | 295 | 1.4571 |
| 3.1167 | 0.8943 | 296 | 1.4572 |
| 3.1497 | 0.8973 | 297 | 1.4574 |
| 3.0451 | 0.9003 | 298 | 1.4576 |
| 3.147 | 0.9033 | 299 | 1.4578 |
| 3.2183 | 0.9063 | 300 | 1.4582 |
| 3.0974 | 0.9094 | 301 | 1.4585 |
| 3.1824 | 0.9124 | 302 | 1.4589 |
| 3.0607 | 0.9154 | 303 | 1.4593 |
| 3.1255 | 0.9184 | 304 | 1.4598 |
| 2.6534 | 0.9215 | 305 | 1.4604 |
| 2.9006 | 0.9245 | 306 | 1.4605 |
| 3.336 | 0.9275 | 307 | 1.4609 |
| 3.2408 | 0.9305 | 308 | 1.4609 |
| 3.0551 | 0.9335 | 309 | 1.4610 |
| 2.8721 | 0.9366 | 310 | 1.4610 |
| 3.1009 | 0.9396 | 311 | 1.4610 |
| 3.3979 | 0.9426 | 312 | 1.4608 |
| 3.133 | 0.9456 | 313 | 1.4609 |
| 3.1008 | 0.9486 | 314 | 1.4609 |
| 3.2113 | 0.9517 | 315 | 1.4610 |
| 3.161 | 0.9547 | 316 | 1.4612 |
| 2.968 | 0.9577 | 317 | 1.4614 |
| 2.936 | 0.9607 | 318 | 1.4619 |
| 3.4561 | 0.9637 | 319 | 1.4622 |
| 3.1529 | 0.9668 | 320 | 1.4625 |
| 3.1159 | 0.9698 | 321 | 1.4629 |
| 3.2588 | 0.9728 | 322 | 1.4630 |
| 2.9729 | 0.9758 | 323 | 1.4633 |
| 3.2778 | 0.9789 | 324 | 1.4636 |
| 2.9019 | 0.9819 | 325 | 1.4638 |
| 3.094 | 0.9849 | 326 | 1.4643 |
| 3.0259 | 0.9879 | 327 | 1.4647 |
| 3.3842 | 0.9909 | 328 | 1.4652 |
| 3.217 | 0.9940 | 329 | 1.4655 |
| 3.4145 | 0.9970 | 330 | 1.4658 |
| 3.3328 | 1.0 | 331 | 1.4660 |

### Framework versions

- Transformers 4.48.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
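
### Worked example: effective batch size and warmup length

The hyperparameters listed under "Training hyperparameters" can be cross-checked with a little arithmetic. The sketch below is not the training script. It assumes that the total train batch size is the per-device batch size times the gradient accumulation steps times the number of devices, that the warmup length is the warmup ratio times the total number of optimizer steps rounded up, and that `constant_with_warmup` ramps the learning rate linearly from zero and then holds it. The helper `lr_at` and the variable names are hypothetical and exist only for illustration.

```python
import math

# Values copied from the hyperparameter list and the results table in this card.
learning_rate = 1e-06
train_batch_size = 2
gradient_accumulation_steps = 4
total_train_batch_size = 8
warmup_ratio = 0.05
total_steps = 331  # last Step value in the training results table

# Assumption: total batch = per-device batch * accumulation steps * device count.
num_devices = total_train_batch_size // (
    train_batch_size * gradient_accumulation_steps
)

# Assumption: warmup length is ratio * total steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Assumed shape of constant_with_warmup: linear ramp, then constant."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


if __name__ == "__main__":
    print(f"implied device count: {num_devices}")
    print(f"warmup steps: {warmup_steps}")
    for step in (0, 8, warmup_steps, total_steps):
        print(f"step {step:>3}: lr = {lr_at(step):.3e}")
```

Under these assumptions the run used a single device and roughly the first 17 optimizer steps were warmup, after which the learning rate stayed at 1e-06 for the rest of the epoch.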