metadata

license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
  - generated_from_trainer
model-index:
  - name: myBit-Llama2-jp-127M-test-1
    results: []

myBit-Llama2-jp-127M-test-1

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0. The training dataset was not recorded. At the final logged step (5400), it achieves the following result on the evaluation set:

Loss: 10.6136
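
To put the reported loss in context, it can be converted to perplexity and compared with the loss of a uniform guess over the vocabulary. The sketch below assumes the loss is a mean per-token cross-entropy in nats. The vocabulary size of 32,000 is an illustrative assumption only; this card does not state the tokenizer's actual vocabulary size.

```python
import math


def perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


def uniform_baseline_loss(vocab_size: int) -> float:
    """Cross-entropy of a model that gives every token equal probability."""
    return math.log(vocab_size)


EVAL_LOSS = 10.6136          # final evaluation loss reported in this card
ASSUMED_VOCAB_SIZE = 32_000  # assumption: not stated anywhere in this card

print(f"perplexity: {perplexity(EVAL_LOSS):.0f}")
print(f"uniform-guess loss: {uniform_baseline_loss(ASSUMED_VOCAB_SIZE):.4f}")
```

If the evaluation loss is at or above the uniform-guess value for the real vocabulary size, the checkpoint predicts tokens no better than random guessing would.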

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0024
train_batch_size: 48
eval_batch_size: 48
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: polynomial
lr_scheduler_warmup_steps: 500
num_epochs: 1
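
The learning-rate curve implied by these settings can be sketched in plain Python: linear warmup over 500 steps to 0.0024, followed by polynomial decay. The values of `total_steps`, `lr_end`, and `power` below are assumptions for illustration. The card does not record them; `total_steps` is only estimated from the training log, where step 5400 corresponds to epoch 0.97.

```python
def polynomial_lr(step, base_lr=0.0024, warmup_steps=500,
                  total_steps=5_600, lr_end=1e-7, power=1.0):
    """Learning rate at `step`: linear warmup, then polynomial decay to lr_end.

    total_steps, lr_end and power are assumed values, not taken from the card.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    if step > total_steps:
        # Past the end of the schedule: hold the final learning rate.
        return lr_end
    # Fraction of the decay phase that is still ahead.
    remaining = 1.0 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (base_lr - lr_end) * remaining ** power + lr_end


for step in (0, 250, 500, 3_000, 5_600):
    print(step, f"{polynomial_lr(step):.6f}")
```

With `power=1.0` the decay phase is linear; larger values make the rate fall faster early in the decay phase.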

Training results

Training Loss	Epoch	Step	Validation Loss
7.3834	0.04	200	6.5397
6.8679	0.07	400	9.8549
10.2542	0.11	600	10.3642
10.3959	0.14	800	10.4168
10.4303	0.18	1000	10.4429
10.4527	0.22	1200	10.4638
10.4744	0.25	1400	10.4837
10.4907	0.29	1600	10.4981
10.5032	0.32	1800	10.5069
10.5134	0.36	2000	10.5165
10.5208	0.4	2200	10.5264
10.5284	0.43	2400	10.5324
10.535	0.47	2600	10.5372
10.541	0.51	2800	10.5445
10.5472	0.54	3000	10.5498
10.5532	0.58	3200	10.5561
10.5588	0.61	3400	10.5614
10.5647	0.65	3600	10.5672
10.5698	0.69	3800	10.5727
10.5753	0.72	4000	10.5760
10.5809	0.76	4200	10.5834
10.5864	0.79	4400	10.5892
10.5919	0.83	4600	10.5946
10.5971	0.87	4800	10.5995
10.6027	0.9	5000	10.6047
10.6076	0.94	5200	10.6105
10.6126	0.97	5400	10.6136
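
The table can be scanned programmatically to find the logged step with the lowest validation loss. This is a minimal sketch using four rows copied from the table above (the first three and the last). The helper name `parse_rows` is hypothetical, and whitespace-separated columns are assumed.

```python
# Columns: training loss, epoch, step, validation loss (rows copied from the table).
LOG = """\
7.3834 0.04 200 6.5397
6.8679 0.07 400 9.8549
10.2542 0.11 600 10.3642
10.6126 0.97 5400 10.6136
"""


def parse_rows(text):
    """Parse whitespace-separated log rows into a list of dictionaries."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        train_loss, epoch, step, val_loss = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


rows = parse_rows(LOG)
best = min(rows, key=lambda r: r["val_loss"])
print(f"lowest validation loss {best['val_loss']} at step {best['step']}")
```

For these rows the lowest validation loss is 6.5397 at step 200, well below the final value of 10.6136 at step 5400.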

Framework versions

Transformers 4.38.2
PyTorch 2.1.0+cu121
Datasets 2.18.0
Tokenizers 0.15.2