Nhut/Llama3-20240602

Nhut committed on Jun 2, 2024

Commit 94c60b3 · verified · 1 parent: 460a6af

Model save

Files changed (2)

README.md +88 -0
adapter_model.safetensors +1 -1

README.md ADDED

@@ -0,0 +1,88 @@

+---
+license: llama3
+library_name: peft
+tags:
+- trl
+- sft
+- generated_from_trainer
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+datasets:
+- generator
+model-index:
+- name: Llama3-20240602
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Llama3-20240602
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.4100
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: constant
+- lr_scheduler_warmup_steps: 0.03
+- training_steps: 960
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| No log        | 0.1356 | 40   | 1.3411          |
+| No log        | 0.2712 | 80   | 1.3121          |
+| 1.335         | 0.4068 | 120  | 1.2957          |
+| 1.335         | 0.5424 | 160  | 1.2854          |
+| 1.258         | 0.6780 | 200  | 1.2772          |
+| 1.258         | 0.8136 | 240  | 1.2706          |
+| 1.258         | 0.9492 | 280  | 1.2642          |
+| 1.2379        | 1.0847 | 320  | 1.2746          |
+| 1.2379        | 1.2203 | 360  | 1.2682          |
+| 1.1301        | 1.3559 | 400  | 1.2697          |
+| 1.1301        | 1.4915 | 440  | 1.2713          |
+| 1.1301        | 1.6271 | 480  | 1.2671          |
+| 1.1256        | 1.7627 | 520  | 1.2633          |
+| 1.1256        | 1.8983 | 560  | 1.2620          |
+| 1.0987        | 2.0339 | 600  | 1.2888          |
+| 1.0987        | 2.1695 | 640  | 1.3127          |
+| 1.0987        | 2.3051 | 680  | 1.3148          |
+| 0.9445        | 2.4407 | 720  | 1.3093          |
+| 0.9445        | 2.5763 | 760  | 1.3086          |
+| 0.9553        | 2.7119 | 800  | 1.3095          |
+| 0.9553        | 2.8475 | 840  | 1.3029          |
+| 0.9553        | 2.9831 | 880  | 1.3066          |
+| 0.9298        | 3.1186 | 920  | 1.4147          |
+| 0.9298        | 3.2542 | 960  | 1.4100          |
+### Framework versions
+- PEFT 0.11.1
+- Transformers 4.41.2
+- Pytorch 2.3.0+cu121
+- Datasets 2.19.1
+- Tokenizers 0.19.1
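
Note on the training results in the README above: the card reports the loss at the final step (1.4100 at step 960), but the table shows lower validation losses earlier in the run. The sketch below is not part of the commit; it copies the (step, validation loss) pairs from the table and looks up the minimum, and it also estimates steps per epoch from the Step and Epoch columns. The names `EVAL_LOG`, `best_checkpoint`, and `steps_per_epoch` are hypothetical helpers introduced only for this illustration.

```python
# (step, validation loss) pairs copied from the "Training results" table
# in the README.md diff. Nothing here comes from the training code itself.
EVAL_LOG = [
    (40, 1.3411), (80, 1.3121), (120, 1.2957), (160, 1.2854),
    (200, 1.2772), (240, 1.2706), (280, 1.2642), (320, 1.2746),
    (360, 1.2682), (400, 1.2697), (440, 1.2713), (480, 1.2671),
    (520, 1.2633), (560, 1.2620), (600, 1.2888), (640, 1.3127),
    (680, 1.3148), (720, 1.3093), (760, 1.3086), (800, 1.3095),
    (840, 1.3029), (880, 1.3066), (920, 1.4147), (960, 1.4100),
]


def best_checkpoint(log):
    """Return the (step, validation_loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one (step, epoch) table row."""
    return round(step / epoch)


best_step, best_loss = best_checkpoint(EVAL_LOG)
final_step, final_loss = EVAL_LOG[-1]

print(f"best:  step {best_step}, validation loss {best_loss:.4f}")
print(f"final: step {final_step}, validation loss {final_loss:.4f}")
print(f"gap:   {final_loss - best_loss:+.4f}")
print(f"steps per epoch (approx.): {steps_per_epoch(960, 3.2542)}")
```

By the table, the lowest validation loss is 1.2620 at step 560 (epoch 1.8983), while the training loss keeps falling afterwards. The commit does not say whether an intermediate checkpoint was saved, so which step the uploaded `adapter_model.safetensors` corresponds to is left as an open question.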

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:73103cb1dfe9d451f24c2de84e4bd99f99b445dc332dc90c671cb48227d0e99f
+oid sha256:010b822cff637325f3a78c3c5b8a09c0602ed06b9890101b864fb9b3c56fa154
 size 2806378968
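
The `adapter_model.safetensors` change touches only the Git LFS pointer: the `oid` differs while `size` stays at 2806378968 bytes. A downloaded copy of the weights can be checked against that pointer by comparing its size and SHA-256 digest. The following is a minimal sketch using only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path in the commented-out call is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse Git LFS pointer text into its version, oid (hex digest), and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    digest = hashlib.sha256()
    # Read in chunks so a multi-gigabyte file is never held in memory at once.
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


# New pointer contents, copied from the "+" side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:010b822cff637325f3a78c3c5b8a09c0602ed06b9890101b864fb9b3c56fa154
size 2806378968
"""

pointer = parse_lfs_pointer(NEW_POINTER)

# Assumed local path; uncomment once the adapter weights have been downloaded.
# print(matches_pointer("adapter_model.safetensors", pointer))
```

Checking the size first is a cheap early exit before hashing roughly 2.8 GB of data.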