---
license: cc-by-nc-4.0
base_model:
- kakaocorp/kanana-nano-2.1b-instruct
---

# **Directional Enhancement for Language Models: A Novel Approach to Specialization without Fine-Tuning**

## **Overview**

This project presents a methodology for enhancing specific capabilities of language models using the **Directional Enhancement** technique. **This approach does not introduce new knowledge into the model but amplifies its existing latent abilities.** While preserving the general capabilities of the language model, it significantly improves performance in specific domains such as creative writing, education, and technical documentation.

This model is a creative-direction-enhanced version of [kakaocorp/kanana-nano-2.1b-instruct](https://huggingface.co/kakaocorp/kanana-nano-2.1b-instruct).

If `enhance.txt` is swapped for prompts from a different domain, the model's style can be steered toward that domain instead. This experiment used only 95 instructions for the creative writing domain.

## **Technical Background**

### **Principle of Directional Enhancement**

This approach identifies a **specialization direction** in the representation space of the language model, associated with a specific capability, and enhances the model's attention weights in that direction.

1. Compute the difference in representation between **specialized prompts** (domain-specific) and **general prompts** within the model's hidden states.
2. Normalize this difference vector to obtain the **specialization direction**.
3. Enhance the model's **self-attention output projection weights (`o_proj`)** along this specialized direction.

**This method strengthens the model's intrinsic abilities rather than introducing completely new knowledge or patterns.** It functions similarly to how a lens amplifies a specific wavelength of light.

### **Computing Specialization Direction**

Unlike conventional fine-tuning, which modifies all weights in the model, this approach **identifies a targeted enhancement direction** by analyzing differences in activations across specialized and general inputs.

- A set of **specialized** prompts (`enhance.txt`) and **general** prompts (`normal.txt`) are fed into the model.
- The activations of a **chosen hidden layer** are extracted for both prompt types.
- The **mean hidden state vector** for specialized prompts is computed and compared to the mean hidden state vector for general prompts.
- Their difference represents the **specialization direction**, which is then **normalized** to create a unit vector (see the sketch after this list).

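The extraction script itself is not reproduced in this README, but a minimal sketch of this step could look as follows. The `mean_hidden_state` helper, the one-prompt-per-line file format, and the last-token pooling choice are illustrative assumptions, not the released implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kakaocorp/kanana-nano-2.1b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, output_hidden_states=True)
model.eval()

def mean_hidden_state(prompts, layer_idx):
    """Mean of the chosen layer's last-token hidden state across prompts."""
    states = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        # hidden_states[layer_idx] has shape [batch, seq_len, hidden]
        states.append(out.hidden_states[layer_idx][0, -1, :])
    return torch.stack(states).mean(dim=0)

layer_idx = int(model.config.num_hidden_layers * 0.6)  # README default: 60% of layers

specialized_prompts = open("enhance.txt", encoding="utf-8").read().splitlines()
general_prompts = open("normal.txt", encoding="utf-8").read().splitlines()

specialized_mean = mean_hidden_state(specialized_prompts, layer_idx)
general_mean = mean_hidden_state(general_prompts, layer_idx)

# Normalized difference = the specialization direction (a unit vector)
specialization_dir = specialized_mean - general_mean
specialization_dir = specialization_dir / specialization_dir.norm()
```
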
### **Enhancing Model Weights**

Once the **specialization direction** is computed, it is applied to modify the model's **self-attention output projection weights (`o_proj`)** in a controlled manner:

1. The specialization direction is **projected** onto the weight matrix of each attention layer.
2. A **scaled enhancement factor** is applied to align the model's attention outputs more strongly with the specialization direction.
3. This process **amplifies** the model's responses in the desired direction without altering its fundamental structure.

This targeted adjustment allows the model to **focus more on specific characteristics** (e.g., creativity, technical accuracy, formal tone) while maintaining general competency.

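In matrix form, steps 1 and 2 amount to the rank-one update `W <- W + factor * (W d) d^T` (matching the core algorithm shown later): each weight row's component along the unit direction `d` is scaled by `(1 + factor)`, while everything orthogonal to `d` passes through unchanged. A small self-contained check of that property, with toy sizes rather than real model weights:

```python
import torch

torch.manual_seed(0)
hidden = 8
W = torch.randn(hidden, hidden)   # stand-in for an o_proj weight matrix
d = torch.randn(hidden)
d = d / d.norm()                  # unit specialization direction
factor = 1.5

# Rank-one enhancement update
W_enhanced = W + factor * torch.outer(W @ d, d)

# The component along d is amplified by (1 + factor)...
print(torch.allclose(W_enhanced @ d, (1 + factor) * (W @ d), atol=1e-6))   # True

# ...while directions orthogonal to d are left untouched.
q = torch.linalg.qr(torch.stack([d, torch.randn(hidden)], dim=1)).Q
d_orth = q[:, 1]                  # unit vector orthogonal to d
print(torch.allclose(W_enhanced @ d_orth, W @ d_orth, atol=1e-6))          # True
```
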
## **Comparison with Existing Methods**

| **Method** | **Features** |
|-----------------------------|-------------|
| **Traditional Fine-Tuning** | Updates the entire model's weights, requiring significant computational resources and extensive training data. Enables learning new knowledge and patterns. |
| **Lightweight Fine-Tuning (LoRA, etc.)** | Adds adaptive low-rank matrices to optimize fine-tuning. More efficient, but still requires training. |
| **Directional Enhancement (this method)** | Selectively **amplifies** the model's intrinsic capabilities by strengthening specialized output directions. Does not introduce new knowledge. |

## **Implementation Details**

### **Data Preparation**

Two types of datasets are used to define the specialization direction:

- **Specialized Dataset (`enhance.txt`)**: Contains prompts focused on the capability to be enhanced.
- **General Dataset (`normal.txt`)**: Contains diverse, neutral prompts to serve as a baseline.

The difference in activations between these two datasets defines the specialization direction, ensuring that the enhancement is aligned with the target capability while preserving the model's general functionality.

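The exact file format is not documented in this README; the sketches here assume plain text with one prompt per line. Purely illustrative (made-up) entries:

```text
enhance.txt (one creative prompt per line):
  Write a short story about a lighthouse keeper who finds a message in a bottle.
  Describe a night market using all five senses.

normal.txt (one neutral prompt per line):
  Summarize the main causes of inflation.
  Explain the steps for sorting a list of numbers.
```
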
### **Key Parameters**

- **`instructions`**: Number of instruction samples to process (default: 128).
- **`layer_idx`**: Index of the model layer where the specialization direction is computed (default: 60% of the total layer count).
- **`enhancement_factor`**: Strength of the enhancement along the specialization direction (default: 1.5).

### **Core Algorithm**

```python
import torch

# Compute the specialization direction: the normalized difference between the
# mean specialized and mean general hidden-state vectors.
specialization_dir = specialized_mean - general_mean
specialization_dir = specialization_dir / specialization_dir.norm()

# Core part of the weight enhancement algorithm. Despite its name,
# `attn_output` here holds the o_proj weight matrix being modified; each
# weight row's component along the direction is scaled by (1 + enhancement_factor).
projection_scalars = torch.matmul(attn_output, specialization_dir)
projection = torch.outer(projection_scalars, specialization_dir)
enhanced_weights = attn_output + enhancement_factor * projection
```

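Putting the pieces together, a minimal end-to-end sketch might apply this update to every decoder layer's `o_proj`. It assumes the `model`, `tokenizer`, and `specialization_dir` from the earlier sketch, a Llama-style module layout (`model.model.layers[i].self_attn.o_proj`), and an illustrative output directory; the released script may differ:

```python
import torch

enhancement_factor = 1.5

with torch.no_grad():
    for layer in model.model.layers:
        W = layer.self_attn.o_proj.weight                       # o_proj weight matrix
        d = specialization_dir.to(device=W.device, dtype=W.dtype)
        # Rank-one update: amplify each row's component along the direction
        W.copy_(W + enhancement_factor * torch.outer(W @ d, d))

# Illustrative output path
model.save_pretrained("kanana-nano-2.1b-creative-enhanced")
tokenizer.save_pretrained("kanana-nano-2.1b-creative-enhanced")
```

No gradients or training data are involved; the whole pass is a handful of matrix products per layer.
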
## **Performance and Results**

### **Improvements in Creative Writing Model**

Experiments with creative writing models demonstrate **significant qualitative improvements**:

- **Enhanced Descriptive Ability**: More vivid and detailed descriptions with richer sensory language.
- **Improved Character Development**: Clearer character traits and more distinct personalities.
- **Enhanced Dialogue Generation**: More natural and engaging conversational exchanges.
- **Stronger Story Structuring**: Improved narrative flow and coherence.
- **Increased Emotional Depth**: Greater emotional nuance and expressiveness.

## **Applications**

This technique can be applied to various specialized models:

- **Creative Writing Models**: Optimized for novel writing, poetry, and storytelling.
- **Educational Content Models**: Tailored for clear, structured, and pedagogical explanations.
- **Technical Documentation Models**: Enhanced for structured and precise documentation.
- **Business Communication Models**: Specialized for professional and formal business writing.
- **Medical/Scientific Models**: Improved for detailed and accurate scientific explanations.

## **Limitations and Future Improvements**

### **Current Limitations**

- **Interpretability of Specialization Directions**: Difficult to precisely determine which specific abilities are being enhanced.
- **Single-Direction Specialization**: Currently enhances only one specific capability at a time.
- **Control Over Enhancement Level**: The optimal enhancement factor must be determined empirically.
- **No New Knowledge Acquisition**: Cannot introduce entirely new knowledge beyond what the model already possesses.
- **Dependence on Existing Abilities**: If the model lacks fundamental knowledge in a domain, the enhancement effects are limited.

### **Future Directions**

- **Multi-Directional Enhancement**: Developing techniques to enhance multiple capabilities simultaneously.
- **Automatic Tuning**: Implementing an automated method for selecting the optimal enhancement factor.
- **Interpretability of Specialization**: Researching better semantic analysis of specialization directions.
- **User-Personalized Specialization**: Customizing specialization directions based on user preferences.
- **Hybrid Approach**: Combining **directional enhancement** with lightweight fine-tuning to enable both ability enhancement and new knowledge learning.

## **Conclusion**

The **Directional Enhancement** technique provides an efficient way to strengthen specific capabilities of language models **without requiring full retraining or additional training data**. While it does not introduce new knowledge, it **amplifies latent abilities** with minimal computational cost. This method offers a practical approach for developing AI models tailored to specialized domains.

## **References & Citations**

This methodology was inspired by the following studies:

- **Representation Engineering: A Top-Down Approach to AI Transparency**
- **The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets**
- **Finding Directions in GAN's Latent Space**