jquesada committed · Commit 6068948 · 1 Parent(s): 9ffeb20

Card update

Files changed (1): README.md (+76, -0)
---
license: apache-2.0
---

# Model Card for Model ID

This model is a fine-tune of a merge of models based on mistralai/Mistral-7B-v0.1.

## Model Details

### Model Description

The model was created by merging [viethq188/LeoScorpius-7B-Chat-DPO](https://huggingface.co/viethq188/LeoScorpius-7B-Chat-DPO) and [GreenNode/GreenNodeLM-7B-v1olet](https://huggingface.co/GreenNode/GreenNodeLM-7B-v1olet), followed by fine-tuning on the [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) dataset.

- **Developed by:** Ignos
- **Model type:** Mistral
- **License:** Apache-2.0

## Uses

The model aims for strong overall results on the Hugging Face leaderboard benchmarks, with a particular focus on improved reasoning.
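
As a usage sketch (not part of the original card), the merged checkpoint should load through the standard `transformers` API. The repository id below is a placeholder, since the card does not state the published model name.

```python
# Minimal inference sketch. "ignos/<model-name>" is a placeholder repo id,
# not a name stated in the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ignos/<model-name>"  # replace with the actual repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 compute dtype used in training
    device_map="auto",
)

prompt = "Explain the difference between merging models and fine-tuning them."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```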

## Bias, Risks, and Limitations

The model carries the same biases, risks, and limitations as its base models.

## Training Details

### Training Data

- [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus)

### Training Procedure

- Training with a QLoRA approach, then merging the trained adapter back into the base model (see the sketch below).
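
A minimal sketch of that merge step, assuming the adapter was trained with PEFT; both paths below are hypothetical, and the base checkpoint stands in for the LeoScorpius/GreenNodeLM merge described above.

```python
# Sketch of folding a QLoRA adapter back into its base model with PEFT.
# Both paths are hypothetical placeholders, not taken from the card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model in full precision (the merged LeoScorpius/GreenNodeLM
# checkpoint in this card's setup) rather than in 4-bit.
base = AutoModelForCausalLM.from_pretrained(
    "path/to/merged-base-model",
    torch_dtype=torch.bfloat16,
)

# Attach the trained QLoRA adapter, then fold its low-rank deltas
# into the base weights and drop the adapter wrappers.
model = PeftModel.from_pretrained(base, "path/to/qlora-adapter")
merged = model.merge_and_unload()
merged.save_pretrained("path/to/final-model")
```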

### Results

- Hugging Face evaluation pending

#### Summary

## Technical Specifications

### Model Architecture and Objective

- Models based on the Mistral architecture

### Compute Infrastructure

- Training on RunPod

#### Hardware

- 4 x NVIDIA RTX 4090
- 64 vCPU, 503 GB RAM

#### Software

- Mergekit (main)
- Axolotl 0.3.0

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
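
For convenience, the same values transcribed as a `transformers.BitsAndBytesConfig`; this is a direct restatement of the list above, with nothing added.

```python
# The quantization config above, expressed as a transformers object.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=False,
    load_in_4bit=True,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```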

### Framework versions

- PEFT 0.6.0