habanoz/TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1

habanoz committed on Nov 28, 2023

Commit 152436a • 1 Parent(s): 668aae2

Create README.md

Files changed (1)

README.md +34 -0

README.md ADDED

	@@ -0,0 +1,34 @@

+---
+license: apache-2.0
+datasets:
+- databricks/databricks-dolly-15k
+language:
+- en
+pipeline_tag: text-generation
+base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T
+---
+TinyLlama/TinyLlama-1.1B-intermediate-step-955k-token-2T fine-tuned on the databricks/databricks-dolly-15k dataset.
+Training took 1 hour on an 'ml.g5.xlarge' instance.
+```python
+hyperparameters = {
+  'num_train_epochs': 3,                            # number of training epochs
+  'per_device_train_batch_size': 6,                 # batch size for training
+  'gradient_accumulation_steps': 2,                 # number of update steps to accumulate
+  'gradient_checkpointing': True,                   # save memory but slower backward pass
+  'bf16': True,                                     # use bfloat16 precision
+  'tf32': True,                                     # use tf32 precision
+  'learning_rate': 2e-4,                            # learning rate
+  'max_grad_norm': 0.3,                             # Maximum norm (for gradient clipping)
+  'warmup_ratio': 0.03,                             # warmup ratio
+  'lr_scheduler_type': "constant",                  # learning rate scheduler
+  'save_strategy': "epoch",                         # save strategy for checkpoints
+  'logging_steps': 10,                              # log every x steps
+  'merge_adapters': True,                           # whether to merge LoRA into the model (needs more memory)
+  'use_flash_attn': True,                           # Whether to use Flash Attention
+}
+```
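
The batch-related settings above combine into an effective batch size, which in turn gives a rough count of optimizer steps. The sketch below works through that arithmetic. It is an estimate under stated assumptions, not a record of the actual run: `num_gpus` and `num_examples` are illustrative values that do not appear in the README (a single GPU per `ml.g5.xlarge`, and roughly 15,000 examples with one example per row and no sequence packing), and the helper functions are hypothetical names introduced here.

```python
import math

# Values copied from the hyperparameters dict in the README.
per_device_train_batch_size = 6
gradient_accumulation_steps = 2
num_train_epochs = 3

# Assumptions for illustration only (not stated in the README).
num_gpus = 1           # assumed: one GPU on the training instance
num_examples = 15_000  # assumed: approximate dataset size, no packing


def effective_batch_size(per_device: int, accumulation: int, gpus: int) -> int:
    """Number of examples that contribute to each optimizer update."""
    return per_device * accumulation * gpus


def optimizer_steps(examples: int, batch: int, epochs: int) -> int:
    """Rough number of optimizer updates over the whole run."""
    return math.ceil(examples / batch) * epochs


batch = effective_batch_size(
    per_device_train_batch_size, gradient_accumulation_steps, num_gpus
)
steps = optimizer_steps(num_examples, batch, num_train_epochs)

print(f"effective batch size: {batch}")
print(f"estimated optimizer steps: {steps}")
```

If the training script packs several examples into one sequence, or drops incomplete batches, the real step count will differ from this estimate.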