The abliteration script ([link](https://github.com/IlyaGusev/saiga/blob/main/scripts/abliterate.py)) is based on code from the blog post and relies heavily on [TransformerLens](https://github.com/TransformerLensOrg/TransformerLens). The only major difference from the code used for Llama is [scaling the embedding layer back](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/pretrained/weight_conversions/gemma.py#L13).
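Gemma multiplies its token embeddings by `sqrt(d_model)`, and TransformerLens folds that factor into `W_E` when it loads the checkpoint, so the factor has to be divided back out before exporting the weights to the Hugging Face format. A minimal sketch of that unscaling step (the function name is illustrative, not taken from the script):

```python
import torch

def unscale_gemma_embeddings(w_e: torch.Tensor, d_model: int) -> torch.Tensor:
    """Undo the sqrt(d_model) factor TransformerLens folds into Gemma's W_E.

    Apply this to the embedding matrix before saving the abliterated model
    back into the Hugging Face checkpoint format.
    """
    return w_e / (d_model ** 0.5)
```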
Orthogonalization **did not** produce the same results as regular interventions, since RMSNorm layers are applied before activations are merged into the residual stream. However, the final model still seems to be uncensored.
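Weight orthogonalization projects the refusal direction out of every matrix that writes into the residual stream (embeddings, attention output, MLP output), instead of subtracting the direction from activations at runtime. A minimal sketch, assuming the direction lives in the residual-stream basis and the weight's first dimension indexes residual channels (names here are illustrative, not from the script):

```python
import torch

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    # weight: (d_model, d_in) matrix that writes into the residual stream.
    # Subtract the rank-1 projection onto the unit refusal direction r:
    #   W <- W - r r^T W
    # so the layer's output can no longer move along r.
    r = direction / direction.norm()
    return weight - torch.outer(r, r @ weight)
```

For matrices stored the other way round (e.g. an embedding table of shape `(vocab, d_model)`), the same projection is applied along the `d_model` axis instead.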
## Examples: