shakhizat committed on
Commit 09d0ebf · verified · 1 Parent(s): 72f1ff5

Model save

Files changed (1): README.md (+15 −15)
README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1622
+- Loss: 1.1909
 
 ## Model description
 
@@ -38,11 +38,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 4
-- eval_batch_size: 4
+- train_batch_size: 1
+- eval_batch_size: 2
 - seed: 42
-- gradient_accumulation_steps: 6
-- total_train_batch_size: 24
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 2
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.01
@@ -52,16 +52,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| No log | 0.0244 | 10 | 1.2046 |
-| No log | 0.0487 | 20 | 1.1875 |
-| 1.1978 | 0.0731 | 30 | 1.1797 |
-| 1.1978 | 0.0975 | 40 | 1.1763 |
-| 1.1462 | 0.1219 | 50 | 1.1736 |
-| 1.1462 | 0.1462 | 60 | 1.1712 |
-| 1.1462 | 0.1706 | 70 | 1.1701 |
-| 1.137 | 0.1950 | 80 | 1.1681 |
-| 1.137 | 0.2193 | 90 | 1.1645 |
-| 1.1484 | 0.2437 | 100 | 1.1622 |
+| No log | 0.0020 | 10 | 1.2457 |
+| No log | 0.0041 | 20 | 1.2098 |
+| 1.3368 | 0.0061 | 30 | 1.2134 |
+| 1.3368 | 0.0081 | 40 | 1.2185 |
+| 1.2308 | 0.0102 | 50 | 1.2187 |
+| 1.2308 | 0.0122 | 60 | 1.2190 |
+| 1.2308 | 0.0142 | 70 | 1.2074 |
+| 1.2921 | 0.0163 | 80 | 1.2018 |
+| 1.2921 | 0.0183 | 90 | 1.1937 |
+| 1.2655 | 0.0203 | 100 | 1.1909 |
 
 
 ### Framework versions
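
The batch-size fields in this diff are related: `total_train_batch_size` is `train_batch_size × gradient_accumulation_steps` (times the number of devices, which this card implies is 1). A minimal sketch of that arithmetic, under that single-device assumption, showing both the old and new hyperparameter sets:

```python
def total_train_batch_size(train_batch_size: int,
                           gradient_accumulation_steps: int,
                           num_devices: int = 1) -> int:
    """Effective number of samples contributing to each optimizer step."""
    return train_batch_size * gradient_accumulation_steps * num_devices

# Old hyperparameters in the diff: 4 * 6 = 24
print(total_train_batch_size(4, 6))  # 24
# New hyperparameters in the diff: 1 * 2 = 2
print(total_train_batch_size(1, 2))  # 2
```

This is why the new run covers far less of an epoch per step: with an effective batch of 2 instead of 24, the same 100 optimizer steps reach epoch ≈0.02 rather than ≈0.24, consistent with the Epoch columns above.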