---
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
  - generated_from_trainer
model-index:
  - name: BitNet-based-Llama2-jp-test
    results: []
---

# BitNet-based-Llama2-jp-test

This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 92.3872
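
The card does not yet document usage, so the following is only a minimal loading sketch. The repo id `HachiML/BitNet-based-Llama2-jp-test` and the need for `trust_remote_code=True` (for custom BitNet layers not in stock Transformers 4.36) are assumptions, not confirmed by this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is published under this repo id.
model_id = "HachiML/BitNet-based-Llama2-jp-test"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Assumption: BitNet layers ship as custom code with the repo, hence trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("こんにちは、", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```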

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 156
- eval_batch_size: 156
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 200
- num_epochs: 1
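
For reference, these settings map onto `TrainingArguments` roughly as sketched below. The `output_dir` and the evaluation cadence (eval every 100 steps, inferred from the results table) are assumptions, and the reported batch sizes are treated as per-device values for illustration; everything else is copied from the list above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="BitNet-based-Llama2-jp-test",  # assumption, not stated on the card
    learning_rate=5e-4,
    per_device_train_batch_size=156,  # assumption: card's batch size taken as per-device
    per_device_eval_batch_size=156,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=200,
    num_train_epochs=1,
    evaluation_strategy="steps",  # assumption, inferred from the per-step eval losses below
    eval_steps=100,
)
```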

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 92.3586       | 0.06  | 100  | 92.3876         |
| 92.3629       | 0.12  | 200  | 92.3877         |
| 92.3395       | 0.18  | 300  | 92.3753         |
| 92.3229       | 0.24  | 400  | 92.3346         |
| 92.3158       | 0.3   | 500  | 92.3378         |
| 92.3411       | 0.36  | 600  | 92.3068         |
| 92.3362       | 0.42  | 700  | 92.3086         |
| 92.3304       | 0.48  | 800  | 92.3751         |
| 92.3344       | 0.55  | 900  | 92.3510         |
| 92.3355       | 0.61  | 1000 | 92.3283         |
| 92.3628       | 0.67  | 1100 | 92.3356         |
| 92.337        | 0.73  | 1200 | 92.3693         |
| 92.3825       | 0.79  | 1300 | 92.3734         |
| 92.3569       | 0.85  | 1400 | 92.2878         |
| 92.3633       | 0.91  | 1500 | 92.3738         |
| 92.3392       | 0.97  | 1600 | 92.3872         |

### Framework versions

- Transformers 4.36.2
- Pytorch 2.2.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2