---
library_name: transformers
tags: []
---
# Model Card for llama3-8B LLaMA-Adapter (4-bit)
<!-- Provide a quick summary of what the model is/does. -->
Llama 3 8B fine-tuned with supervised fine-tuning using the LLaMA-Adapter PEFT method and 4-bit quantization.
## Model Details
Training configuration:

- `adapter_layers`: 30
- `adapter_len`: 10
- `gamma`: 0.85
- `batch_size_training`: 4
- `gradient_accumulation_steps`: 4
- `lr`: 0.0001
- `num_epochs`: 3
- `num_freeze_layers`: 1
- `optimizer`: AdamW
- `peft_method`: llama_adapter

Trainable params: 1,228,830 || all params: 8,031,490,078 || trainable%: 0.0153
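As a rough sketch of how a setup matching these hyperparameters could be reproduced with `transformers` and `peft` (the exact training script and dataset are not part of this card, and `meta-llama/Meta-Llama-3-8B` is assumed as the base model):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import AdaptionPromptConfig, get_peft_model

# Assumed base model; the card only states "llama3-8B".
base_model_id = "meta-llama/Meta-Llama-3-8B"

# 4-bit quantization, as stated in the summary above (compute dtype is an assumption).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LLaMA-Adapter (adaption prompt) config matching the hyperparameters listed above.
adapter_config = AdaptionPromptConfig(
    adapter_layers=30,
    adapter_len=10,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, adapter_config)
model.print_trainable_parameters()
# -> trainable params: 1,228,830 || all params: 8,031,490,078 || trainable%: 0.0153
```

The optimizer, learning rate, batch size, and gradient accumulation settings listed above would then be passed to whatever training loop or trainer is used; they are not shown here.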
### Training Results
- Average epoch time: 566 s
- Train loss: 0.41620415449142456
- Eval loss: 1.57061767578125
- Max CUDA memory allocated: 14 GB
- Max CUDA memory reserved: 16 GB
- Peak active CUDA memory: 14 GB
- Peak CPU memory during training: 4 GB
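To run inference, the trained adapter weights can be loaded on top of the 4-bit base model with PEFT. A minimal sketch, assuming `meta-llama/Meta-Llama-3-8B` as the base model and using a placeholder for this adapter repository:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "meta-llama/Meta-Llama-3-8B"  # assumed base model
adapter_id = "<this-repo-id>"                 # placeholder for this adapter repository

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the trained LLaMA-Adapter weights.
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

prompt = "Explain supervised fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```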