stefanhex-apollo committed
Commit 621ed53 · verified · Parent: 4c461d1

Update README.md

Files changed (1)
  1. README.md +81 -4
README.md CHANGED
@@ -8,6 +8,11 @@ tags: []
 This is a gpt2-small model with LayerNorm fine-tuned out.
 
 The model was fine-tuned on OpenWebText for ~500M tokens (1000 iterations of batch size ~488 at 1024 context length) while gradually disabling LayerNorm layers.
+ For details see [here](https://www.lesswrong.com/posts/THzcKKQd4oWkg4dSP/you-can-remove-gpt2-s-layernorm-by-fine-tuning-for-an-hour) and the upcoming paper.
+
+ Available versions:
+ * v2 (default): Trained for 1000 iterations in a single training run
+ * v1: Trained for 900 iterations, with multiple rounds of interrupting training, modifying LNs, and resuming
 
 The model is a `GPT2LMHeadModel` (to avoid requiring `trust_remote_code`) which technically contains LayerNorm blocks.
 However, the epsilon values are all set to 1e12 so that the LayerNorm has no effect. The LN scale is set to 1e6 (to counter the 1e12 epsilon), and the bias to 0.
@@ -15,8 +20,80 @@ The final LayerNorm also has 1e12 as epsilon, but non-unity weights and biases.
 thus the LN parameters cannot be folded into that matrix. You can completely remove all LNs by simply replacing `ln_1` and `ln_2` modules with identities, and replacing
 `ln_f` with modifications to the unembed matrix and unembed bias.
 
- Available versions:
- * v2 (default): Trained for 1000 iterations in a single training run
- * v1: Trained for 900 iterations, with multiple rounds of interrupting training, modifying LNs, and resuming
+ ## TransformerLens loading code
+ ```python
+ import torch
+ from transformers import GPT2LMHeadModel
+ from transformer_lens import HookedTransformer
+
+ model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN").to("cpu")
+ hooked_model = HookedTransformer.from_pretrained("gpt2", hf_model=model, fold_ln=False, center_unembed=False).to("cpu")
+ # Kill the LayerNorms because TransformerLens overwrites eps
+ for block in hooked_model.blocks:
+     block.ln1.eps = 1e12
+     block.ln2.eps = 1e12
+ hooked_model.ln_final.eps = 1e12
+ ```
+
+ Or with LNs properly replaced by identities:
+ ```python
+ import torch
+ from transformers import GPT2LMHeadModel
+ from transformer_lens import HookedTransformer
+
+ model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN").to("cpu")
+
+ # Undo my hacky LayerNorm removal
+ for block in model.transformer.h:
+     block.ln_1.weight.data = block.ln_1.weight.data / 1e6
+     block.ln_1.eps = 1e-5
+     block.ln_2.weight.data = block.ln_2.weight.data / 1e6
+     block.ln_2.eps = 1e-5
+ model.transformer.ln_f.weight.data = model.transformer.ln_f.weight.data / 1e6
+ model.transformer.ln_f.eps = 1e-5
+
+ # Properly replace LayerNorms by Identities
+ class HookedTransformerNoLN(HookedTransformer):
+     def removeLN(self):
+         for i in range(len(self.blocks)):
+             self.blocks[i].ln1 = torch.nn.Identity()
+             self.blocks[i].ln2 = torch.nn.Identity()
+         self.ln_final = torch.nn.Identity()
+
+ hooked_model = HookedTransformerNoLN.from_pretrained("gpt2", hf_model=model, fold_ln=True, center_unembed=False).to("cpu")
+ hooked_model.removeLN()
+ ```
+
+ ## NNSight loading code
+ Copy-pasted from [Logan Riggs' comment](https://www.lesswrong.com/posts/THzcKKQd4oWkg4dSP/you-can-remove-gpt2-s-layernorm-by-fine-tuning-for-an-hour?commentId=Gcq8wic9WmdnqM2Fm), based on code by Caden.
+ ```python
+ import torch
+ from transformers import GPT2LMHeadModel
+ from transformer_lens import HookedTransformer
+ from nnsight.models.UnifiedTransformer import UnifiedTransformer
+
+
+ model = GPT2LMHeadModel.from_pretrained("apollo-research/gpt2_noLN").to("cpu")
+
+ # Undo my hacky LayerNorm removal
+ for block in model.transformer.h:
+     block.ln_1.weight.data = block.ln_1.weight.data / 1e6
+     block.ln_1.eps = 1e-5
+     block.ln_2.weight.data = block.ln_2.weight.data / 1e6
+     block.ln_2.eps = 1e-5
+ model.transformer.ln_f.weight.data = model.transformer.ln_f.weight.data / 1e6
+ model.transformer.ln_f.eps = 1e-5
+
+ # Properly replace LayerNorms by Identities
+ def removeLN(transformer_lens_model):
+     for i in range(len(transformer_lens_model.blocks)):
+         transformer_lens_model.blocks[i].ln1 = torch.nn.Identity()
+         transformer_lens_model.blocks[i].ln2 = torch.nn.Identity()
+     transformer_lens_model.ln_final = torch.nn.Identity()
+
+ hooked_model = HookedTransformer.from_pretrained("gpt2", hf_model=model, fold_ln=True, center_unembed=False).to("cpu")
+ removeLN(hooked_model)
 
- The training script will be published shortly.
+ model_nnsight = UnifiedTransformer(model="gpt2", hf_model=model, fold_ln=True, center_unembed=False).to("cpu")
+ removeLN(model_nnsight)
+ ```
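
The README states that `ln_f` can be replaced by "modifications to the unembed matrix and unembed bias". The sketch below illustrates that algebra with random tensors of gpt2-small shape; it is not the repo's conversion code (TransformerLens performs an equivalent fold when loading with `fold_ln=True`), and all names here are stand-ins.

```python
import torch

# Toy check of the ln_f-folding algebra, with random stand-ins:
# g ~ ln_f.weight / 1e6, b ~ ln_f.bias, W_U ~ the unembed matrix (lm_head.weight.T).
d_model, d_vocab = 768, 50257
x = torch.randn(d_model)             # a final residual-stream vector
g = torch.randn(d_model)             # ln_f scale (after dividing out the 1e6 factor)
b = torch.randn(d_model)             # ln_f bias
W_U = torch.randn(d_model, d_vocab)  # unembed matrix

# With eps = 1e12, sqrt(var + eps) is ~1e6, which the 1e6 scale factor cancels,
# so ln_f reduces to the affine map g * (x - mean(x)) + b.
logits_with_ln = (g * (x - x.mean()) + b) @ W_U

# Fold that affine map into the unembed: scale the d_model rows by g,
# subtract each column's mean (absorbing the mean subtraction), and add a bias.
W_folded = g[:, None] * W_U
W_folded = W_folded - W_folded.mean(dim=0, keepdim=True)
b_folded = b @ W_U

logits_folded = x @ W_folded + b_folded
print(torch.allclose(logits_with_ln, logits_folded, atol=1e-3))  # True up to float32 noise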
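
After loading with either TransformerLens snippet above, a quick smoke test can confirm the LayerNorm-free model still behaves like a language model. The prompt and the expectation of a typical GPT-2-small-scale loss are illustrative choices, not from the repo.

```python
# Assumes `hooked_model` was built by one of the TransformerLens snippets above.
prompt = "The quick brown fox jumps over the lazy dog."

# Mean next-token cross-entropy loss on the prompt.
loss = hooked_model(prompt, return_type="loss")
print(f"loss: {loss.item():.3f}")

# Short continuation to eyeball that generation still works without LayerNorm.
print(hooked_model.generate(prompt, max_new_tokens=20))
```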