---
library_name: transformers
tags:
- PEFT
- mistral
- sft
- TensorBoard
- Safetensors
- trl
- generated_from_trainer
- 4-bit precision
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
pipeline_tag: question-answering
---
# Model Card for Model ID
<!-- Provide a quick summary of what the model is/does. -->
This model is fine-tuned for document question answering. It was trained on the [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
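Below is a minimal sketch of loading the adapter with PEFT and running a query. The repo ID is a placeholder (the actual model ID is not stated in this card), and the prompt template assumes the Alpaca instruction/input/response format commonly used with this dataset.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Placeholder repo ID -- the actual model ID is not stated in this card.
model_id = "your-username/your-model-id"

model = AutoPeftModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumed Alpaca-style prompt; adjust to the template actually used in training.
prompt = (
    "### Instruction:\nAnswer the question using the document below.\n\n"
    "### Input:\n<document text>\n\nWhat is the document about?\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```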
## Model Details
### Training hyperparameters
The following hyperparameters were used during training:
- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
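
These settings map onto a `transformers.TrainingArguments` configuration; a minimal sketch follows. The output directory and per-device batch size are assumptions, as they are not stated in this card.

```python
import torch
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above.
# `output_dir` and `per_device_train_batch_size` are assumptions.
# optim="adamw_8bit" requires the bitsandbytes package.
training_args = TrainingArguments(
    output_dir="outputs",                     # assumption
    per_device_train_batch_size=2,            # assumption
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),  # use fp16 only when bf16 is unavailable
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=1,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```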
### Framework versions
- PEFT 0.7.1
- Transformers 4.36.0
- Pytorch 2.0.0
- Datasets 2.16.1
- Tokenizers 0.15.0