Mistral-7B-v0.3_pct_reverse

This model is a fine-tuned version of unsloth/mistral-7b-v0.3-bnb-4bit. The training dataset is not documented. It achieves the following results on the evaluation set:

Loss: 6.8605 (validation loss at step 384, the last logged step)
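
To put the reported loss on a more familiar scale, it can be converted to perplexity. This is a minimal sketch that assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card. The helper name `perplexity` is illustrative only.

```python
import math

def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)

# Values copied from this card: final evaluation loss and the
# validation loss at the first logged step (step 8).
final_ppl = perplexity(6.8605)
first_ppl = perplexity(2.6702)

print(f"perplexity at step 384: {final_ppl:.1f}")
print(f"perplexity at step 8:   {first_ppl:.1f}")
```

Under that assumption, the final perplexity is in the high hundreds, far above the value at step 8, so the final checkpoint should not be assumed to outperform the base model without further evaluation.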

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.02
num_epochs: 1
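
The relationship between these values can be sketched in plain Python. The effective batch size is the per-device batch size times the gradient accumulation steps (assuming a single device, which matches 8 × 8 = 64). The schedule function below is a sketch of linear warmup followed by half-cosine decay, not the trainer's own code. `TOTAL_STEPS = 388` is an estimate inferred from the results table (step 384 at epoch 0.9903), not a value reported in this card, and rounding the warmup length up with `math.ceil` is also an assumption.

```python
import math

def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * grad_accum * num_devices

def cosine_lr(step: int, total_steps: int, peak_lr: float, warmup_ratio: float) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumption: rounded up
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 388   # assumption: estimated from the results table
PEAK_LR = 0.0003    # learning_rate from the list above
WARMUP_RATIO = 0.02 # lr_scheduler_warmup_ratio from the list above

print("effective batch size:", effective_batch_size(8, 8))
for step in (0, 4, 8, 16, 194, 388):
    print(step, f"{cosine_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_RATIO):.2e}")
```

With these assumed values the warmup lasts 8 optimizer steps, so the learning rate reaches its peak of 0.0003 at about step 8.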

Training results

Training Loss	Epoch	Step	Validation Loss
2.1177	0.0206	8	2.6702
8.9887	0.0413	16	9.0083
7.777	0.0619	24	7.6913
7.6327	0.0825	32	7.6181
7.6585	0.1032	40	7.6409
7.6813	0.1238	48	7.5593
7.6016	0.1444	56	7.5868
7.5595	0.1651	64	7.5960
7.7069	0.1857	72	7.5984
7.6285	0.2063	80	7.4589
7.5374	0.2270	88	7.4251
7.4161	0.2476	96	7.3111
7.3713	0.2682	104	7.2864
7.2921	0.2888	112	7.2224
7.2529	0.3095	120	7.1938
7.3559	0.3301	128	7.1139
7.1657	0.3507	136	7.0930
7.066	0.3714	144	7.0315
7.1481	0.3920	152	7.0332
7.0394	0.4126	160	7.0583
7.0685	0.4333	168	7.0682
6.9791	0.4539	176	6.9472
7.1428	0.4745	184	7.0126
7.1661	0.4952	192	6.9513
6.9757	0.5158	200	7.0717
6.9685	0.5364	208	6.9399
7.0811	0.5571	216	6.8879
7.0126	0.5777	224	6.9264
6.9712	0.5983	232	6.8394
6.9533	0.6190	240	6.9073
6.9744	0.6396	248	6.9239
7.1531	0.6602	256	6.9109
6.9527	0.6809	264	6.8941
7.1027	0.7015	272	6.9498
7.1718	0.7221	280	6.9495
7.0877	0.7427	288	6.9761
6.9879	0.7634	296	6.9905
6.9813	0.7840	304	6.9238
7.0798	0.8046	312	6.8707
7.0531	0.8253	320	6.8658
7.0518	0.8459	328	6.8576
7.127	0.8665	336	6.9017
6.9259	0.8872	344	6.8581
6.9477	0.9078	352	6.8727
7.0367	0.9284	360	6.8629
6.9114	0.9491	368	6.8469
7.0537	0.9697	376	6.8627
6.9656	0.9903	384	6.8605
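
The table shows both losses jumping between step 8 and step 16 (validation loss 2.6702 to 9.0083) and then settling near 6.9 without returning to the step-8 level. A small sketch for locating the best checkpoint from rows like these follows. The function name `parse_results` and the `SAMPLE` excerpt are illustrative; only four rows are copied from the table to keep the example short.

```python
def parse_results(text: str) -> list[dict]:
    """Parse whitespace-separated rows: train loss, epoch, step, validation loss."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skips the header row, which has five tokens
        try:
            rows.append({
                "train_loss": float(parts[0]),
                "epoch": float(parts[1]),
                "step": int(parts[2]),
                "val_loss": float(parts[3]),
            })
        except ValueError:
            continue  # skips any non-numeric line
    return rows

# Excerpt of the table above (rows for steps 8, 16, 232 and 384).
SAMPLE = """
Training Loss	Epoch	Step	Validation Loss
2.1177	0.0206	8	2.6702
8.9887	0.0413	16	9.0083
6.9712	0.5983	232	6.8394
6.9656	0.9903	384	6.8605
"""

rows = parse_results(SAMPLE)
best_overall = min(rows, key=lambda r: r["val_loss"])
best_after_spike = min((r for r in rows if r["step"] > 8), key=lambda r: r["val_loss"])

print("lowest validation loss overall:", best_overall["step"], best_overall["val_loss"])
print("lowest after step 8:", best_after_spike["step"], best_after_spike["val_loss"])
```

Run on the full table, the same logic gives step 8 as the overall minimum and step 232 (6.8394) as the minimum among later steps, slightly below the final value of 6.8605.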

Framework versions

PEFT 0.12.0
Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

imdatta0/Mistral-7B-v0.3_pct_reverse
