kevin009 commited on
Commit
d907b8e
·
verified ·
1 Parent(s): 55195ec

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +105 -13
README.md CHANGED
@@ -1,22 +1,114 @@
1
  ---
2
- base_model: unsloth/meta-llama-3.1-8b-instruct
3
- tags:
4
- - text-generation-inference
5
- - transformers
6
- - unsloth
7
- - llama
8
- - trl
9
  license: apache-2.0
10
  language:
11
  - en
 
 
 
 
 
 
 
 
12
  ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
13
 
14
- # Uploaded model
15
 
16
- - **Developed by:** kevin009
17
- - **License:** apache-2.0
18
- - **Finetuned from model :** unsloth/meta-llama-3.1-8b-instruct
19
 
20
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
 
1
  ---
 
 
 
 
 
 
 
2
  license: apache-2.0
3
  language:
4
  - en
5
+ base_model:
6
+ - meta-llama/Llama-3.1-8B-instruct
7
+ pipeline_tag: text-generation
8
+ tags:
9
+ - lora
10
+ - adapter
11
+ - writing
12
+ - CoT
13
  ---
14
+ # Merged-Llama-Adapters-317-320
15
+
16
+ A merged LoRA adapter combining four fine-tuned adapters (317-320) for the Llama-3.1-8B language model.
17
+
18
+ ## Model Details
19
+
20
+ - Base Model: meta-llama/Llama-3.1-8B-instruct
21
+ - Adaptation Method: Merged LoRA
22
+
23
+ ## Merger Configuration
24
+
25
+ ### Source Adapters
26
+
27
+ All source adapters share the following configuration:
28
+ - Rank (r): 16
29
+ - Alpha: 16
30
+ - Target Modules:
31
+ - q_proj (Query projection)
32
+ - k_proj (Key projection)
33
+ - v_proj (Value projection)
34
+ - o_proj (Output projection)
35
+ - up_proj (Upsampling projection)
36
+ - down_proj (Downsampling projection)
37
+ - gate_proj (Gate projection)
38
+
39
+ ### Merger Details
40
+
41
+ - Merger Method: Linear interpolation
42
+ - Merger Weights: Equal weights (0.25) for each adapter
43
+ - Combined Rank: 16 (maintained from source adapters)
44
+
45
+ ## Usage
46
+
47
+ This merged adapter must be used with the base Llama-3.1-8B-instruct model.
48
+
49
+ ## Limitations and Biases
50
+
51
+ - This merged adapter inherits limitations and biases from:
52
+ - The base Llama-3.1-8B-instruct model
53
+ - More baises from traning data as most of them were fiction work.
54
+ - The merging process may result in:
55
+ - Potential loss of specialized capabilities from individual adapters
56
+ - Averaged behavior across different adapter specializations
57
+ - Possible interference between adapter weights
58
+
59
+ ## Merging Process
60
+
61
+ The adapters were merged using the following approach:
62
+ 1. Linear interpolation of adapter weights
63
+ 2. Equal weighting (0.25) applied to each source adapter
64
+ 3. Preservation of original LoRA rank and architecture
65
+
66
+ ### Method Used
67
+
68
+ The adapters were merged using PEFT (Parameter-Efficient Fine-Tuning) library's weighted adapter combination feature. The process combines multiple LoRA adapters using linear interpolation with specified weights.
69
+
70
+
71
+ ### Key Parameters
72
+
73
+ - `combination_type="ties"`: Uses the TIES (Task Interference Edge Selection) method for combining adapters
74
+ - `density=0.2`: Controls the sparsity of the merged weights
75
+
76
+
77
+ ### Notes
78
+
79
+ - The order of loading adapters may affect the final result
80
+ - Equal weights were chosen to maintain balanced influence from each adapter
81
+ - The merged adapter maintains the same architecture and rank as the original adapters
82
+ - While this adapter merges multiple fine-tunes, each component was developed as part of independent research efforts to explore and language model capabilities as part of R&D process.
83
+
84
+
85
+ ## Datasets
86
+
87
+ - Not yet released, but should be released after evaluation has completed.
88
+ - Only 1k pairs example of revision task <input_text> + <style_guide> => <thinking> <-> </revised_text>
89
+
90
+ ### Use Cases
91
+
92
+ - This merged adapter can be used for a wide range of tasks, including but not limited to:
93
+ - Accessibility
94
+ - Revision & Editing
95
+ - instruction-following use with xml tags
96
+ - Thinking & reasoning with xml tag of <thinking> and </thinking>, if being asked i the instructions.
97
+
98
+
99
+ These Models not optimized for code, math, or other specialized tasks that need Perefence Optimization.
100
+
101
+ ## Why SFT Instead of RLHF/DPO?
102
+ - RLHF and DPO approaches often lead to vocabulary limitations and overfitting due to their optimization objectives
103
 
104
+ ## License
105
 
106
+ Licensed under Apache 2.0 License.
 
 
107
 
108
+ This merged adapter is part of independent individual research work. While the code is open-source under the Apache 2.0 license, please note:
109
 
110
+ - You are free to use, modify, and distribute this adapter following the Apache 2.0 license terms
111
+ - This work is provided "as is" without warranties or conditions of any kind
112
+ - This is an independent research project and not affiliated with any organization
113
+ - Attribution is appreciated but not required
114
+ - For full license details, see: https://www.apache.org/licenses/LICENSE-2.0