---
library_name: transformers
tags:
- PEFT
- mistral
- sft
- TensorBoard
- Safetensors
- trl
- generated_from_trainer
- 4-bit precision
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
pipeline_tag: question-answering
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->
This model is fine-tuned for document question answering. It was trained on the [yahma/alpaca-cleaned](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) dataset.
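
Below is a minimal inference sketch. The card does not state the base checkpoint or this adapter's repository ID, so `base_model_id` and `adapter_id` are placeholders (the `mistral` tag only suggests a Mistral-family base); the Alpaca-style prompt mirrors the yahma/alpaca-cleaned format.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "mistralai/Mistral-7B-v0.1"     # placeholder: base model is not stated in this card
adapter_id = "your-username/your-adapter-repo"  # placeholder: replace with this adapter's repo ID

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16,
    device_map="auto",
)
# Attach the PEFT (LoRA) adapter produced by this fine-tune
model = PeftModel.from_pretrained(model, adapter_id)

# Alpaca-style prompt, matching the structure of yahma/alpaca-cleaned
prompt = (
    "Below is an instruction that describes a task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nAnswer the question using the document.\n\n"
    "### Input:\n<document text>\n\nQuestion: <your question>\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```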


## Model Details

### Training hyperparameters

The following hyperparameters were used during training:

- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
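
As a minimal sketch, these values map onto `transformers.TrainingArguments` roughly as follows (the `trl`/`sft` tags suggest a TRL `SFTTrainer` workflow); `output_dir` and `per_device_train_batch_size` are not stated in the card and are assumptions:

```python
import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # assumed; not stated in the card
    per_device_train_batch_size=2,   # assumed; not stated in the card
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    # Use bf16 where the GPU supports it, otherwise fall back to fp16
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=1,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```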

### Framework versions

- PEFT 0.7.1
- Transformers 4.36.0
- Pytorch 2.0.0
- Datasets 2.16.1
- Tokenizers 0.15.0