tim-lawson committed
Commit f707dbf
Parent(s): 07c2361

Push model using huggingface_hub.

- README.md +47 -0
- config.json +20 -0
- model.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,47 @@
+---
+language: en
+library_name: mlsae
+license: mit
+tags:
+- arxiv:2409.04185
+- model_hub_mixin
+- pytorch_model_hub_mixin
+---
+
+# Model Card for tim-lawson/sae-pythia-410m-deduped-x64-k32-tfm-layers-21
+
+A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream activation
+vectors from [EleutherAI/pythia-410m-deduped](https://huggingface.co/EleutherAI/pythia-410m-deduped) with an
+expansion factor of R = 64 and sparsity k = 32, over 1 billion
+tokens from [monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
+
+
+This model is a PyTorch Lightning MLSAETransformer module, which includes the underlying
+transformer.
+
+
+### Model Sources
+
+- **Repository:** <https://github.com/tim-lawson/mlsae>
+- **Paper:** <https://arxiv.org/abs/2409.04185>
+- **Weights & Biases:** <https://wandb.ai/timlawson-/mlsae>
+
+## Citation
+
+**BibTeX:**
+
+```bibtex
+@misc{lawson_residual_2024,
+  title = {Residual {{Stream Analysis}} with {{Multi-Layer SAEs}}},
+  author = {Lawson, Tim and Farnik, Lucy and Houghton, Conor and Aitchison, Laurence},
+  year = {2024},
+  month = oct,
+  number = {arXiv:2409.04185},
+  eprint = {2409.04185},
+  primaryclass = {cs},
+  publisher = {arXiv},
+  doi = {10.48550/arXiv.2409.04185},
+  urldate = {2024-10-08},
+  archiveprefix = {arXiv}
+}
+```
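Since the model card tags `pytorch_model_hub_mixin`, the checkpoint should load with huggingface_hub's standard `from_pretrained`. A minimal sketch, assuming the `mlsae` package (from the repository above) exposes the `MLSAETransformer` class at `mlsae.model` — the exact import path is not shown in this commit:

```python
# Sketch: load this checkpoint via PyTorchModelHubMixin's from_pretrained.
# The import path below is an assumption; check the mlsae repository.
from mlsae.model import MLSAETransformer

model = MLSAETransformer.from_pretrained(
    "tim-lawson/sae-pythia-410m-deduped-x64-k32-tfm-layers-21"
)
model.eval()  # inference mode; the module includes the underlying transformer
```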
config.json
ADDED
@@ -0,0 +1,20 @@
+{
+  "accumulate_grad_batches": 64,
+  "auxk": 256,
+  "auxk_coef": 0.03125,
+  "batch_size": 1,
+  "dead_steps_threshold": null,
+  "dead_threshold": 0.001,
+  "dead_tokens_threshold": 10000000,
+  "expansion_factor": 64,
+  "k": 32,
+  "layers": [
+    21
+  ],
+  "lr": 0.0001,
+  "max_length": 2048,
+  "model_name": "EleutherAI/pythia-410m-deduped",
+  "skip_special_tokens": true,
+  "standardize": true,
+  "tuned_lens": false
+}
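For orientation, these hyperparameters imply the autoencoder's width: with expansion factor R = 64 over the residual stream of pythia-410m-deduped (hidden size d_model = 1024), the dictionary has 65,536 latents, of which k = 32 are active per token. A small sketch of that arithmetic, assuming the n_latents = R × d_model convention from the paper:

```python
import json

# Assumption: n_latents = expansion_factor * d_model, per the MLSAE paper.
with open("config.json") as f:
    config = json.load(f)

d_model = 1024  # hidden size of EleutherAI/pythia-410m-deduped
n_latents = config["expansion_factor"] * d_model  # 64 * 1024 = 65536
print(f"{n_latents} latents, k={config['k']} active per token")
```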
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a7b1bf8515b07ebaf27ada1df39a8510adbd0525700dc812b1122583352794b3
+size 2158251016