RylanSchaeffer/collapse_gemma-2-27b_hs2_replace_iter2_sftsd1

Generated from Trainer


collapse_gemma-2-27b_hs2_replace_iter2_sftsd1

This model is a fine-tuned version of google/gemma-2-27b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.1843
Num Input Tokens Seen: 3808768
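
The reported loss can be converted to a perplexity for easier comparison. The sketch below assumes the loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card. Under that assumption the value comes out at roughly 3.27.

```python
import math

# Final evaluation loss reported in the card.
eval_loss = 1.1843

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss = {eval_loss:.4f}, perplexity = {perplexity:.3f}")
```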

Model description

More information needed

Intended uses & limitations

More information needed
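
The card does not document usage. As a minimal sketch only, the checkpoint could presumably be loaded like any other Gemma 2 causal language model. The assumptions here are that the repository is accessible, that it ships tokenizer files (otherwise the base model's tokenizer can be passed instead), and that BF16 weights with automatic device placement are appropriate. `load_model`, `REPO_ID`, and `BASE_ID` are illustrative names, not part of the card.

```python
REPO_ID = "RylanSchaeffer/collapse_gemma-2-27b_hs2_replace_iter2_sftsd1"
BASE_ID = "google/gemma-2-27b"


def load_model(repo_id: str = REPO_ID, tokenizer_id: str = REPO_ID):
    """Load the tokenizer and model (hypothetical helper, not from the card)."""
    # Imported lazily so torch and transformers are only required when the
    # function is actually called.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # assumption: load in BF16
        device_map="auto",           # requires the accelerate package
    )
    return tokenizer, model
```

If the repository turns out not to include a tokenizer, `load_model(tokenizer_id=BASE_ID)` falls back to the base model's tokenizer. A 27B-parameter model in BF16 needs on the order of 54 GB for the weights alone, so multiple GPUs or offloading will usually be required.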

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 4
eval_batch_size: 16
seed: 1
gradient_accumulation_steps: 32
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
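
Two of the values above follow from the others, and a short sketch makes the arithmetic explicit. It assumes a single data-parallel process (the card lists no device count) and estimates the total number of optimizer steps from the last logged row of the training results (step 80 at epoch 0.9726). Both are inferences, not facts from the card. The schedule function mirrors the usual behaviour of a constant schedule with linear warmup: ramp from zero to the base rate, then hold.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 8e-06
train_batch_size = 4
gradient_accumulation_steps = 32
warmup_ratio = 0.05

# Assumption: one data-parallel process (not stated in the card).
num_processes = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_processes
)

# Assumption: total optimizer steps estimated from the last logged row
# of the results table (step 80 at epoch 0.9726).
estimated_total_steps = round(80 / 0.9726)
warmup_steps = math.ceil(estimated_total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear ramp from 0 to learning_rate over warmup_steps, then constant."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


print("total_train_batch_size:", total_train_batch_size)
print("estimated total steps:", estimated_total_steps)
print("warmup steps:", warmup_steps)
for step in (0, 1, 2, 5, 80):
    print(f"step {step:>2}: lr = {lr_at(step):.2e}")
```

With these assumptions the product reproduces the listed total_train_batch_size of 128, and warmup ends at about the same point as the first logged evaluation after step 0.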

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.1282	0
2.5155	0.0608	5	1.0474	236020
2.3221	0.1216	10	1.0643	471208
1.8872	0.1824	15	1.0913	707824
1.5782	0.2432	20	1.1446	943300
1.3696	0.3040	25	1.1695	1175352
1.1143	0.3647	30	1.1811	1412980
1.1623	0.4255	35	1.1684	1635940
1.235	0.4863	40	1.1777	1866292
1.1213	0.5471	45	1.1692	2096140
1.125	0.6079	50	1.1775	2327208
1.0627	0.6687	55	1.1690	2561988
0.9847	0.7295	60	1.1899	2784360
1.0474	0.7903	65	1.1640	3017608
0.9585	0.8511	70	1.1777	3249220
0.9715	0.9119	75	1.1700	3485472
1.0036	0.9726	80	1.1909	3722208
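
The table can be summarised with a few lines of code. The sketch below copies the rows as they appear above and reports where validation loss was lowest, how it behaved afterwards, and the average number of input tokens per optimizer step. It only restates the logged numbers and makes no claim about why the curves look the way they do.

```python
# (step, epoch, training_loss, validation_loss, input_tokens_seen)
# Copied from the training results table; None marks "No log".
rows = [
    (0, 0.0, None, 1.1282, 0),
    (5, 0.0608, 2.5155, 1.0474, 236020),
    (10, 0.1216, 2.3221, 1.0643, 471208),
    (15, 0.1824, 1.8872, 1.0913, 707824),
    (20, 0.2432, 1.5782, 1.1446, 943300),
    (25, 0.3040, 1.3696, 1.1695, 1175352),
    (30, 0.3647, 1.1143, 1.1811, 1412980),
    (35, 0.4255, 1.1623, 1.1684, 1635940),
    (40, 0.4863, 1.235, 1.1777, 1866292),
    (45, 0.5471, 1.1213, 1.1692, 2096140),
    (50, 0.6079, 1.125, 1.1775, 2327208),
    (55, 0.6687, 1.0627, 1.1690, 2561988),
    (60, 0.7295, 0.9847, 1.1899, 2784360),
    (65, 0.7903, 1.0474, 1.1640, 3017608),
    (70, 0.8511, 0.9585, 1.1777, 3249220),
    (75, 0.9119, 0.9715, 1.1700, 3485472),
    (80, 0.9726, 1.0036, 1.1909, 3722208),
]

# Row with the lowest validation loss.
best_step, _, _, best_val, _ = min(rows, key=lambda r: r[3])

# Range of validation loss from step 25 onward.
late_vals = [r[3] for r in rows if r[0] >= 25]
late_min, late_max = min(late_vals), max(late_vals)

# Average input tokens consumed per optimizer step, using the last row.
last_step, _, _, last_val, last_tokens = rows[-1]
tokens_per_step = last_tokens / last_step

print(f"lowest validation loss: {best_val} at step {best_step}")
print(f"validation loss from step 25 on: {late_min} to {late_max}")
print(f"validation loss at step 0: {rows[0][3]}, at step {last_step}: {last_val}")
print(f"average input tokens per step: {tokens_per_step:.1f}")
```

On these rows, validation loss is lowest at the first logged evaluation after step 0 and then stays in a narrow band above its starting value, while training loss keeps falling.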

Framework versions

Transformers 4.44.0
PyTorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
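
To compare a local environment against these versions, something like the following could be used. It is a sketch that relies only on the standard library. The distribution names are assumptions (in particular, PyTorch is published as `torch`), and `check_versions` is a hypothetical helper rather than anything provided by the listed libraries.

```python
from importlib import metadata

# Versions listed in the card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.44.0",
    "torch": "2.4.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def check_versions(expected):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, found {installed} ({status})")
```

A mismatch is not necessarily a problem. The check only shows how far an environment is from the one recorded at training time.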


Safetensors

Model size: 27.2B params
Tensor type: BF16
