tim-lawson/mlsae-pythia-160m-deduped-x64-k32-tfm

tim-lawson committed on Nov 17, 2024

Commit 8439b50 · verified · 1 parent: 395a1b6

Push model using huggingface_hub.

Files changed (3)

README.md +5 -18
config.json +4 -3
model.safetensors +1 -1

README.md CHANGED

@@ -3,23 +3,10 @@ language: en
 library_name: mlsae
 license: mit
 tags:
-  - model_hub_mixin
-  - pytorch_model_hub_mixin
-datasets:
-  - monology/pile-uncopyrighted
 ---
-# mlsae-pythia-160m-deduped-x64-k32-tfm
-A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream
-activation vectors from every layer of
-[EleutherAI/pythia-160m-deduped](https://huggingface.co/EleutherAI/pythia-160m-deduped)
-with an expansion factor of 64 and k = 32, over 1 billion tokens from
-[monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
-This model includes the underlying transformer.
-For more details, see:
-- Paper: <https://arxiv.org/abs/2409.04185>
-- GitHub repository: <https://github.com/tim-lawson/mlsae>
-- Weights & Biases project: <https://wandb.ai/timlawson-/mlsae>

 library_name: mlsae
 license: mit
 tags:
+- model_hub_mixin
+- pytorch_model_hub_mixin
 ---
+This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
+- Library: https://github.com/tim-lawson/mlsae
+- Docs: [More Information Needed]

config.json CHANGED

@@ -1,6 +1,5 @@
 {
   "accumulate_grad_batches": 64,
-  "autoencoder": null,
   "auxk": 256,
   "auxk_coef": 0.03125,
   "batch_size": 1,
@@ -9,11 +8,13 @@
   "dead_tokens_threshold": 10000000,
   "expansion_factor": 64,
   "k": 32,
-  "layers": null,
   "lr": 0.0001,
   "max_length": 2048,
   "model_name": "EleutherAI/pythia-160m-deduped",
   "skip_special_tokens": true,
   "standardize": true,
-  "transformer": null
 }

 {
   "accumulate_grad_batches": 64,
   "auxk": 256,
   "auxk_coef": 0.03125,
   "batch_size": 1,
   "dead_tokens_threshold": 10000000,
   "expansion_factor": 64,
   "k": 32,
+  "layers": [
+    0
+  ],
   "lr": 0.0001,
   "max_length": 2048,
   "model_name": "EleutherAI/pythia-160m-deduped",
   "skip_special_tokens": true,
   "standardize": true,
+  "tuned_lens": false
 }
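
The key-level effect of the config.json change can be summarized with a short sketch. It uses only the keys that differ in the visible hunks (unchanged keys and the lines elided between the two hunks are left out), and the names `old`, `new`, and `diff_keys` are illustrative, not part of mlsae.

```python
import json

# Assumption: only the keys that differ in the visible hunks are included;
# unchanged keys and keys elided between the hunks are omitted.
old = json.loads('{"autoencoder": null, "layers": null, "transformer": null}')
new = json.loads('{"layers": [0], "tuned_lens": false}')


def diff_keys(before: dict, after: dict) -> dict:
    """Classify top-level keys as removed, added, or changed in value."""
    shared = before.keys() & after.keys()
    return {
        "removed": sorted(before.keys() - after.keys()),
        "added": sorted(after.keys() - before.keys()),
        "changed": sorted(k for k in shared if before[k] != after[k]),
    }


summary = diff_keys(old, new)
print(summary)
```

Read this way, the commit drops `autoencoder` and `transformer`, adds `tuned_lens`, and changes `layers` from `null` to `[0]`.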

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:652833f14442dcd780c647324391df1f54b50e63d044ce7ff9a70d12ddaf14e8
 size 951304624

 version https://git-lfs.github.com/spec/v1
+oid sha256:ea3db6cdc86c4175f155f5381bec1bf8d2d1be353c81159126565001c1accd74
 size 951304624
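
Only the `oid` differs between the two pointers; the size is identical, so file size alone cannot tell the two versions apart. A minimal sketch for checking a downloaded file against the new pointer follows. The pointer layout (`version`, `oid sha256:...`, `size`) is taken from the lines above; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, as is any local file path passed to them.

```python
import hashlib

# The new pointer, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ea3db6cdc86c4175f155f5381bec1bf8d2d1be353c81159126565001c1accd74
size 951304624
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a file of roughly 1 GB is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
```

With a local copy of the weights, `matches_pointer("model.safetensors", pointer)` would report whether that copy corresponds to this commit rather than its parent.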