---
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
  - generated_from_trainer
model-index:
  - name: myBit-Llama2-jp-127M-test-1
    results: []
---

# myBit-Llama2-jp-127M-test-1

This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 10.6136
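For quick inspection, here is a minimal loading sketch. The repo id `HachiML/myBit-Llama2-jp-127M-test-1` is inferred from this card, and the checkpoint is assumed to load through the standard `AutoModelForCausalLM`/`AutoTokenizer` classes; a custom Bit-Llama architecture may additionally require `trust_remote_code=True`.

```python
# Minimal loading sketch; the repo id and loadability via the Auto classes
# are assumptions, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HachiML/myBit-Llama2-jp-127M-test-1"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # may need trust_remote_code=True

prompt = "こんにちは、"  # Japanese prompt, since the model name suggests a jp model
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```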

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an approximate `TrainingArguments` sketch follows the list):

- learning_rate: 0.0024
- train_batch_size: 48
- eval_batch_size: 48
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 500
- num_epochs: 1
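The training script itself is not published, so the following is only an approximate mapping of the list above onto 🤗 `TrainingArguments`; `output_dir` and the per-device interpretation of the batch sizes are assumptions.

```python
# Approximate TrainingArguments reconstruction of the hyperparameters above.
# output_dir is assumed; batch sizes are interpreted as per-device values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="myBit-Llama2-jp-127M-test-1",  # assumed
    learning_rate=0.0024,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="polynomial",
    warmup_steps=500,
    num_train_epochs=1,
)
```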

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.3834        | 0.04  | 200  | 6.5397          |
| 6.8679        | 0.07  | 400  | 9.8549          |
| 10.2542       | 0.11  | 600  | 10.3642         |
| 10.3959       | 0.14  | 800  | 10.4168         |
| 10.4303       | 0.18  | 1000 | 10.4429         |
| 10.4527       | 0.22  | 1200 | 10.4638         |
| 10.4744       | 0.25  | 1400 | 10.4837         |
| 10.4907       | 0.29  | 1600 | 10.4981         |
| 10.5032       | 0.32  | 1800 | 10.5069         |
| 10.5134       | 0.36  | 2000 | 10.5165         |
| 10.5208       | 0.4   | 2200 | 10.5264         |
| 10.5284       | 0.43  | 2400 | 10.5324         |
| 10.535        | 0.47  | 2600 | 10.5372         |
| 10.541        | 0.51  | 2800 | 10.5445         |
| 10.5472       | 0.54  | 3000 | 10.5498         |
| 10.5532       | 0.58  | 3200 | 10.5561         |
| 10.5588       | 0.61  | 3400 | 10.5614         |
| 10.5647       | 0.65  | 3600 | 10.5672         |
| 10.5698       | 0.69  | 3800 | 10.5727         |
| 10.5753       | 0.72  | 4000 | 10.5760         |
| 10.5809       | 0.76  | 4200 | 10.5834         |
| 10.5864       | 0.79  | 4400 | 10.5892         |
| 10.5919       | 0.83  | 4600 | 10.5946         |
| 10.5971       | 0.87  | 4800 | 10.5995         |
| 10.6027       | 0.9   | 5000 | 10.6047         |
| 10.6076       | 0.94  | 5200 | 10.6105         |
| 10.6126       | 0.97  | 5400 | 10.6136         |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2