namgoodfire
committed on
Update README.md
README.md
CHANGED
@@ -15,8 +15,8 @@ tags:
 
 The Goodfire SAE (Sparse Autoencoder) for [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
 is an interpreter model designed to analyze and understand
-the model's internal representations. This SAE model is trained specifically on layer
-Llama 3.
+the model's internal representations. This SAE model is trained specifically on layer 19 of
+Llama 3.1 8B and achieves an L0 count of 91, enabling the decomposition of complex neural activations
 into interpretable features. The model is optimized for interpretability tasks and model steering applications,
 allowing researchers and developers to gain insights into the model's internal processing and behavior patterns.
 As an open-source tool, it serves as a foundation for advancing interpretability research and enhancing control
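For readers unfamiliar with the "L0 count" that the added line reports: it is commonly understood as the average number of SAE features that are nonzero for a given input. The sketch below illustrates that idea only. It assumes a plain ReLU encoder, and the weights, sizes, and function names are hypothetical toy values, not the Goodfire SAE's actual architecture or API.

```python
# Toy illustration of an SAE "L0 count": the mean number of nonzero
# feature activations per input. All weights and sizes are hypothetical.

def relu(x):
    return x if x > 0.0 else 0.0

def encode(activation, weights, biases):
    """Map one activation vector to feature activations with a ReLU encoder."""
    return [
        relu(sum(w * a for w, a in zip(row, activation)) + b)
        for row, b in zip(weights, biases)
    ]

def l0_count(feature_batches):
    """Average number of nonzero features per encoded input."""
    counts = [sum(1 for f in feats if f != 0.0) for feats in feature_batches]
    return sum(counts) / len(counts)

# Hypothetical 2-dimensional activations and a 4-feature dictionary.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
biases = [0.0, 0.0, 0.0, 0.0]
activations = [[1.0, 2.0], [-1.0, 0.5], [0.0, -3.0]]

features = [encode(a, weights, biases) for a in activations]
average_l0 = l0_count(features)
```

In this toy setup the three inputs activate 2, 2, and 1 features respectively. An L0 of 91, as stated in the diff, would mean that on average 91 features of the full dictionary are active per token.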