Mistral-7B-Instruct-v0.2-GPTQ_retrained_IoV

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.8239

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 1
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 4
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2
num_epochs: 20
mixed_precision_training: Native AMP

Training results

Training Loss	Epoch	Step	Validation Loss
4.1739	0.9412	4	3.4659
4.0196	1.8824	8	3.3864
3.2283	2.8235	12	3.2071
2.0147	4.0	17	2.9738
2.2299	4.9412	21	2.7751
2.0711	5.8824	25	2.6203
1.946	6.8235	29	2.3797
1.4262	8.0	34	2.0694
1.6468	8.9412	38	1.8468
1.5549	9.8824	42	1.5932
1.4661	10.8235	46	1.3819
1.1079	12.0	51	1.2555
1.3114	12.9412	55	1.1306
1.2436	13.8824	59	1.0515
1.1965	14.8235	63	0.9581
0.9269	16.0	68	0.9277
1.1262	16.9412	72	0.8709
1.1054	17.8824	76	0.8343
0.8664	18.8235	80	0.8239

Framework versions

PEFT 0.10.0
Transformers 4.40.2
Pytorch 2.1.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1

rnaveensrinivas
/

Mistral-7B-Instruct-v0.2-GPTQ_retrained_IoV

Mistral-7B-Instruct-v0.2-GPTQ_retrained_IoV

Model description

Intended uses & limitations

Training and evaluation data

Training procedure

Training hyperparameters

Training results

Framework versions

Model tree for rnaveensrinivas/Mistral-7B-Instruct-v0.2-GPTQ_retrained_IoV

Evaluation results