muzerai committed · Commit f5f6c15 · verified · Parent: a285fc9

Update README.md

Files changed (1): README.md (+133, -3)
README.md CHANGED
@@ -1,3 +1,133 @@
- ---
- license: cc-by-nc-4.0
- ---
---
license: cc-by-nc-4.0
base_model:
- kakaocorp/kanana-nano-2.1b-instruct
---
# **Directional Enhancement for Language Models: A Novel Approach to Specialization without Fine-Tuning**

## **Overview**

This project presents a methodology for enhancing specific capabilities of language models using the **Directional Enhancement** technique. **This approach does not introduce new knowledge into the model but amplifies its existing latent abilities.** While preserving the general capabilities of the language model, it significantly improves performance in specific domains such as creative writing, education, and technical documentation.

This is a creative-direction-enhanced version of [kakaocorp/kanana-nano-2.1b-instruct](https://huggingface.co/kakaocorp/kanana-nano-2.1b-instruct).

If `enhance.txt` is replaced with prompts from a different domain, the model's style shifts toward that domain. This test used only 95 instructions for the creative writing domain.

## **Technical Background**

### **Principle of Directional Enhancement**

This approach identifies a **specialization direction** in the representation space of the language model, associated with a specific capability, and enhances the model's attention weights in that direction:

1. Compute the difference in representation between **specialized prompts** (domain-specific) and **general prompts** within the model's hidden states.
2. Normalize this difference vector to obtain the **specialization direction**.
3. Enhance the model's **self-attention output projection weights (`o_proj`)** along this specialized direction.

**This method strengthens the model's intrinsic abilities rather than introducing completely new knowledge or patterns.** It functions similarly to how a lens amplifies a specific wavelength of light.

### **Computing Specialization Direction**

Unlike conventional fine-tuning, which modifies all weights in the model, this approach **identifies a targeted enhancement direction** by analyzing differences in activations across specialized and general inputs:

- **Specialized** prompts (`enhance.txt`) and **general** prompts (`normal.txt`) are fed into the model.
- The activations of a **chosen hidden layer** are extracted for both prompt types.
- The **mean hidden state vector** for the specialized prompts is computed and compared to the mean hidden state vector for the general prompts.
- Their difference represents the **specialization direction**, which is then **normalized** to create a unit vector (a sketch of this procedure follows the list).

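A minimal sketch of this extraction step, assuming the Hugging Face `transformers` API, one prompt per line in each file, and a Llama-style layer layout (`model.model.layers`); the helper `mean_hidden_state` is illustrative, not the repository's actual code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "kakaocorp/kanana-nano-2.1b-instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, output_hidden_states=True)
model.eval()

def mean_hidden_state(prompts, layer_idx):
    """Mean hidden state at layer_idx, averaged over tokens, then over prompts."""
    vecs = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        # out.hidden_states[layer_idx] has shape (1, seq_len, hidden_size)
        vecs.append(out.hidden_states[layer_idx].mean(dim=1).squeeze(0))
    return torch.stack(vecs).mean(dim=0)

specialized_prompts = open("enhance.txt", encoding="utf-8").read().splitlines()
general_prompts = open("normal.txt", encoding="utf-8").read().splitlines()

layer_idx = int(len(model.model.layers) * 0.6)  # default: 60% of total layers

specialized_mean = mean_hidden_state(specialized_prompts, layer_idx)
general_mean = mean_hidden_state(general_prompts, layer_idx)
```

The two mean vectors feed directly into the direction computation shown in the Core Algorithm below.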

### **Enhancing Model Weights**

Once the **specialization direction** is computed, it is applied to modify the model's **self-attention output projection weights (`o_proj`)** in a controlled manner:

1. The specialization direction is **projected** onto the weight matrix of each attention layer.
2. A **scaled enhancement factor** is applied to align the model's attention outputs more strongly with the specialization direction.
3. This process **amplifies** the model's responses in the desired direction without altering its fundamental structure (formalized below).

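In matrix form (notation introduced here for illustration; these symbols do not appear in the original script): writing $W$ for a layer's `o_proj` weight matrix, $d$ for the unit specialization direction, and $\alpha$ for the enhancement factor, the update in steps 1-3 is

$$
W' = W + \alpha \, (W d) \, d^{\top}
$$

so the component of each row of $W$ along $d$ is scaled by $1 + \alpha$, while all components orthogonal to $d$ are left unchanged.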

This targeted adjustment allows the model to **focus more on specific characteristics** (e.g., creativity, technical accuracy, formal tone) while maintaining general competency.

## **Comparison with Existing Methods**

| **Method** | **Features** |
|-----------------------------|-------------|
| **Traditional Fine-Tuning** | Updates the entire model's weights, requiring significant computational resources and extensive training data. Enables learning new knowledge and patterns. |
| **Lightweight Fine-Tuning (LoRA, etc.)** | Adds adaptive low-rank matrices to optimize fine-tuning. More efficient, but still requires training. |
| **Directional Enhancement (this method)** | Selectively **amplifies** the model's intrinsic capabilities by strengthening specialized output directions. Does not introduce new knowledge. |

## **Implementation Details**

### **Data Preparation**

Two types of datasets are used to define the specialization direction:

- **Specialized Dataset (`enhance.txt`)**: Contains prompts focused on the capability to be enhanced.
- **General Dataset (`normal.txt`)**: Contains diverse, neutral prompts that serve as a baseline.

The difference in activations between these two datasets defines the specialization direction, ensuring that the enhancement is aligned with the target capability while preserving the model's general functionality.
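For illustration, the two files might look like this (hypothetical contents; the one-prompt-per-line layout is an assumption, and the repository's actual files may differ):

```text
# enhance.txt (illustrative): prompts targeting the capability to enhance
Write a vivid opening scene for a mystery set in a coastal village.
Describe a thunderstorm from a lighthouse keeper's point of view.

# normal.txt (illustrative): diverse, neutral prompts
Summarize the main causes of inflation.
Explain how to organize a weekly schedule.
```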

### **Key Parameters**

- **`instructions`**: Number of instruction samples to process (default: 128)
- **`layer_idx`**: Index of the model layer at which the specialization direction is computed (default: 60% of the total layer count)
- **`enhancement_factor`**: Strength of the enhancement along the specialization direction (default: 1.5)

### **Core Algorithm**

```python
import torch

# Compute the specialization direction from the two mean hidden states
specialization_dir = specialized_mean - general_mean
specialization_dir = specialization_dir / specialization_dir.norm()

# Core of the weight-enhancement step: project each row of the o_proj
# weight matrix onto the specialization direction, then amplify that component
projection_scalars = torch.matmul(o_proj_weight, specialization_dir)
projection = torch.outer(projection_scalars, specialization_dir)
enhanced_weights = o_proj_weight + enhancement_factor * projection
```
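As a usage sketch, applying this update to every layer's `o_proj` might look like the following (the Llama-style module path `model.model.layers[i].self_attn.o_proj` is an assumption about this model's architecture, and `specialization_dir` comes from the step above):

```python
enhancement_factor = 1.5  # default from "Key Parameters"

with torch.no_grad():
    for layer in model.model.layers:
        W = layer.self_attn.o_proj.weight  # shape: (hidden_size, hidden_size)
        d = specialization_dir.to(W.dtype)
        scalars = torch.matmul(W, d)  # per-row projection onto d
        W.add_(enhancement_factor * torch.outer(scalars, d))  # amplify in place
```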

## **Performance and Results**

### **Improvements in Creative Writing Model**

Experiments with creative writing models demonstrate **significant qualitative improvements**:

- **Enhanced Descriptive Ability**: More vivid and detailed descriptions with richer sensory language.
- **Improved Character Development**: Clearer character traits and more distinct personalities.
- **Enhanced Dialogue Generation**: More natural and engaging conversational exchanges.
- **Stronger Story Structuring**: Improved narrative flow and coherence.
- **Increased Emotional Depth**: Greater emotional nuance and expressiveness.

## **Applications**

This technique can be applied to various specialized models:

- **Creative Writing Models**: Optimized for novel writing, poetry, and storytelling.
- **Educational Content Models**: Tailored for clear, structured, and pedagogical explanations.
- **Technical Documentation Models**: Enhanced for structured and precise documentation.
- **Business Communication Models**: Specialized for professional and formal business writing.
- **Medical/Scientific Models**: Improved for detailed and accurate scientific explanations.

## **Limitations and Future Improvements**

### **Current Limitations**

- **Interpretability of Specialization Directions**: It is difficult to determine precisely which specific abilities are being enhanced.
- **Single-Direction Specialization**: Currently enhances only one specific capability at a time.
- **Control Over Enhancement Level**: The optimal enhancement factor is determined empirically.
- **No New Knowledge Acquisition**: Cannot introduce entirely new knowledge beyond what the model already possesses.
- **Dependence on Existing Abilities**: If the model lacks fundamental knowledge in a domain, the enhancement effects are limited.

### **Future Directions**

- **Multi-Directional Enhancement**: Developing techniques to enhance multiple capabilities simultaneously.
- **Automatic Tuning**: Implementing an automated method for selecting the optimal enhancement factor.
- **Interpretability of Specialization**: Researching better semantic analysis of specialization directions.
- **User-Personalized Specialization**: Customizing specialization directions based on user preferences.
- **Hybrid Approach**: Combining **directional enhancement** with lightweight fine-tuning to enable both ability enhancement and new knowledge learning.

## **Conclusion**

The **Directional Enhancement** technique provides an efficient way to strengthen specific capabilities of language models **without requiring full retraining or additional training data**. While it does not introduce new knowledge, it **amplifies latent abilities** with minimal computational cost. This method offers a practical approach for developing AI models tailored to specialized domains.

## **References & Citations**

This methodology was inspired by the following studies:

- **Representation Engineering: A Top-Down Approach to AI Transparency**
- **The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets**
- **Finding Directions in GAN's Latent Space**