Configuration choice
Collection
Choice of configuration based on the results of different fine-tuning runs. All provide more or less the same results, but configurations 1 and 2 are much faster (learning rate).
This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct on the GaetanMichelet/chat-60_ft_task-1 and the GaetanMichelet/chat-120_ft_task-1 datasets. Its results on the evaluation set are reported in the training table below.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
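As a usage illustration, a minimal inference sketch follows. It assumes the checkpoint is published as a standard transformers causal LM with a chat template; the repository id in the snippet is a hypothetical placeholder, not the actual model id.

```python
# Minimal inference sketch (hypothetical repo id -- substitute this card's model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GaetanMichelet/your-model-id"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit the hardware
    device_map="auto",
)

# Build a single-turn chat prompt with the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize what task-1 fine-tuning does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```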
Training results

Training and validation loss per epoch:
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.3371        | 1.0   | 11   | 2.2386          |
| 1.7745        | 2.0   | 22   | 1.7577          |
| 1.2294        | 3.0   | 33   | 1.1298          |
| 0.9781        | 4.0   | 44   | 0.9825          |
| 0.8705        | 5.0   | 55   | 0.9252          |
| 0.8269        | 6.0   | 66   | 0.8844          |
| 0.6627        | 7.0   | 77   | 0.8656          |
| 0.6053        | 8.0   | 88   | 0.8689          |
| 0.5258        | 9.0   | 99   | 0.9307          |
| 0.3668        | 10.0  | 110  | 1.0742          |
| 0.2480        | 11.0  | 121  | 1.2238          |
| 0.2039        | 12.0  | 132  | 1.3416          |
| 0.1357        | 13.0  | 143  | 1.4583          |
| 0.1016        | 14.0  | 154  | 1.5564          |
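The validation loss reaches its minimum of 0.8656 at epoch 7 and rises steadily afterwards, which suggests the later epochs overfit the small training set. As a quick illustration, the best epoch can be read off programmatically; the values below are copied from the table above:

```python
# Validation loss per epoch, copied from the training results table.
val_loss = {
    1: 2.2386, 2: 1.7577, 3: 1.1298, 4: 0.9825, 5: 0.9252,
    6: 0.8844, 7: 0.8656, 8: 0.8689, 9: 0.9307, 10: 1.0742,
    11: 1.2238, 12: 1.3416, 13: 1.4583, 14: 1.5564,
}

# Pick the epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(f"Best epoch: {best_epoch} (val loss {val_loss[best_epoch]})")
# -> Best epoch: 7 (val loss 0.8656); loss increases in every later epoch.
```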
Base model: meta-llama/Llama-3.1-8B