---
library_name: transformers
base_model: /scratch/gpfs/jg9904/saved_models/Mistral-7B-Instruct-v0.3
tags:
  - alignment-handbook
  - generated_from_trainer
datasets:
  - >-
    /scratch/gpfs/jg9904/unintentional-unalignment/data_files/data-mistral-7b-instruct-sppo-iter1/50_new
model-index:
  - name: mistral-dpo-lr-5.0e-7-beta-0.01
    results: []
---

# mistral-dpo-lr-5.0e-7-beta-0.01

This model is a fine-tuned version of /scratch/gpfs/jg9904/saved_models/Mistral-7B-Instruct-v0.3, trained with DPO on the /scratch/gpfs/jg9904/unintentional-unalignment/data_files/data-mistral-7b-instruct-sppo-iter1/50_new dataset. It achieves the following results on the evaluation set (a sketch of how the reward metrics are defined follows the list):

- Loss: 0.4740
- Rewards/chosen: -0.4546
- Rewards/rejected: -1.1932
- Rewards/accuracies: 0.8036
- Rewards/margins: 0.7386
- Logps/rejected: -459.2464
- Logps/chosen: -346.3625
- Logits/rejected All: -2.7774
- Logits/chosen All: -2.7702
- Logits/rejected Sum: 8023.3535
- Logits/chosen Sum: 8554.5498
- Logits/rejected Avg: 21.6078
- Logits/chosen Avg: 21.0986
- Gradient/inner Product: 463470592.0
- Gradient/nabla Chosen Logps: 28288.0
- Gradient/nabla Rejected Logps: 37632.0
- Gradient/correlation: 0.4004
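
The reward and margin metrics above follow the standard DPO conventions used by TRL's `DPOTrainer`, which the alignment-handbook recipes build on: each reward is β times the log-probability ratio between the policy and the frozen reference model, and the margin is the chosen reward minus the rejected reward. A minimal sketch of these definitions, assuming per-sequence summed log-probabilities and β = 0.01 taken from this run's name:

```python
import torch
import torch.nn.functional as F

def dpo_loss_and_rewards(
    policy_chosen_logps: torch.Tensor,    # log p_theta(chosen | prompt), summed over tokens
    policy_rejected_logps: torch.Tensor,  # log p_theta(rejected | prompt)
    ref_chosen_logps: torch.Tensor,       # same quantities under the frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.01,                   # beta-0.01 from this run's name
):
    # Implicit DPO rewards: beta * (policy logp - reference logp).
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    margins = chosen_rewards - rejected_rewards  # rewards/margins

    # DPO loss: negative log-sigmoid of the reward margin, averaged over the batch.
    loss = -F.logsigmoid(margins).mean()

    # rewards/accuracies: fraction of pairs where the chosen response wins.
    accuracy = (margins > 0).float().mean()
    return loss, chosen_rewards.mean(), rejected_rewards.mean(), margins.mean(), accuracy
```

A sanity check against the results table below: at step 0 the policy still equals the reference, so both rewards are exactly 0.0 and the loss is ln 2 ≈ 0.6931, matching the logged values.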

## Model description

More information needed

## Intended uses & limitations

More information needed
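
No usage guidance was shipped with this card. For illustration only, here is a minimal generation sketch, assuming the trained checkpoint is available locally and keeps the base model's Mistral chat template; the `model_path` below is a hypothetical placeholder, since the card only references local scratch paths rather than a published Hub id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local path to this run's output directory (placeholder, not from the card).
model_path = "mistral-dpo-lr-5.0e-7-beta-0.01"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain direct preference optimization in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```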

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TRL configuration follows the list):

- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
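
For readers mapping these values onto code, here is a minimal sketch of an equivalent TRL `DPOConfig`, assuming the alignment-handbook's TRL-based DPO trainer; `output_dir` and `bf16` are assumptions not stated in the card, and the totals of 64/32 arise from the per-device sizes times 8 GPUs:

```python
from trl import DPOConfig

training_args = DPOConfig(
    output_dir="mistral-dpo-lr-5.0e-7-beta-0.01",  # assumed output directory
    beta=0.01,                       # from the run name; not in the list above
    learning_rate=5e-7,
    per_device_train_batch_size=8,   # x 8 GPUs = total train batch size 64
    per_device_eval_batch_size=4,    # x 8 GPUs = total eval batch size 32
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    bf16=True,                       # assumed; mixed precision is not stated in the card
)
```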

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected All | Logits/chosen All | Logits/rejected Sum | Logits/chosen Sum | Logits/rejected Avg | Logits/chosen Avg | Gradient/inner Product | Gradient/nabla Chosen Logps | Gradient/nabla Rejected Logps | Gradient/correlation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | -339.9275 | -300.9012 | -2.8672 | -2.8605 | 7351.9551 | 7878.5537 | 19.8359 | 19.5574 | 86507520.0 | 16384.0 | 17152.0 | 0.2451 |
| 0.667 | 0.6803 | 100 | 0.4740 | -0.4546 | -1.1932 | 0.8036 | 0.7386 | -459.2464 | -346.3625 | -2.7774 | -2.7702 | 8023.3535 | 8554.5498 | 21.6078 | 21.0986 | 463470592.0 | 28288.0 | 37632.0 | 0.4004 |
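
As a rough inference from the logged values (not a figure reported with the card): step 100 at epoch 0.6803 implies about 100 / 0.6803 ≈ 147 optimizer steps per full epoch, i.e. roughly 147 × 64 ≈ 9.4k preference pairs in the training split at a total train batch size of 64.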

### Framework versions

- Transformers 4.45.0
- Pytorch 2.5.1+cu124
- Datasets 2.14.6
- Tokenizers 0.20.4