Pythia 160M, Mamba 130M, and RWKV 169M models trained on OpenWebText for 4000 steps (context window: 1024 tokens; effective batch size: 512), with 6 seeds per architecture.
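The training setup stated above implies a rough token budget per run. The following is a minimal sketch of that arithmetic. It assumes that the effective batch size counts sequences per optimizer step and that every sequence fills the full 1024-token window. Neither assumption is confirmed by the description, so the result is best read as an upper bound. The function and constant names are illustrative only.

```python
# Rough token budget implied by the stated training setup.
# Assumptions (not confirmed by the model description):
#   - the effective batch size counts sequences per optimizer step
#   - every sequence is packed to the full context window

STEPS = 4000                 # training steps per run (from the description)
EFFECTIVE_BATCH_SIZE = 512   # assumed: sequences per optimizer step
CONTEXT_WINDOW = 1024        # tokens per sequence


def tokens_per_step(batch_size: int, context_window: int) -> int:
    """Tokens consumed by one optimizer step under the assumptions above."""
    return batch_size * context_window


def total_tokens(steps: int, batch_size: int, context_window: int) -> int:
    """Tokens consumed over a whole training run."""
    return steps * tokens_per_step(batch_size, context_window)


per_step = tokens_per_step(EFFECTIVE_BATCH_SIZE, CONTEXT_WINDOW)
per_run = total_tokens(STEPS, EFFECTIVE_BATCH_SIZE, CONTEXT_WINDOW)

print(f"{per_step:,} tokens per step")  # 524,288
print(f"{per_run:,} tokens per run")    # 2,097,152,000
```

Under these assumptions each run sees about 2.1 billion tokens, the same for every architecture and seed.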
James Michaelov
jmichaelov
models (18)

jmichaelov/parc-mamba-seed2 • 0.1B • 4
jmichaelov/parc-mamba-seed3 • 0.1B • 4
jmichaelov/parc-pythia-seed1 • 0.2B • 4
jmichaelov/parc-mamba-seed0 • 0.1B • 4
jmichaelov/parc-rwkv-seed2 • 0.2B • 2
jmichaelov/parc-rwkv-seed3 • 0.2B • 1
jmichaelov/parc-pythia-seed2 • 0.2B • 5
jmichaelov/parc-mamba-seed1 • 0.1B • 5
jmichaelov/parc-mamba-seed4 • 0.1B • 4
jmichaelov/parc-rwkv-seed1 • 0.2B • 1

datasets (13)

jmichaelov/bhs • 22k • 96
jmichaelov/blimp_nl • 8.4k • 148
jmichaelov/lm_syneval • 158k • 62
jmichaelov/inverse_scaling_prize-hindsight_neglect • 315 • 3
jmichaelov/inverse_scaling_prize-memo_trap • 936 • 5
jmichaelov/inverse_scaling_prize-neqa • 300 • 7
jmichaelov/inverse_scaling_prize-redefine • 1.24k • 2
jmichaelov/inverse_scaling_prize-into_the_unknown • 1.82k • 2
jmichaelov/inverse_scaling_prize-modus_tollens • 1.24k • 2
jmichaelov/inverse_scaling_prize-pattern_matching_suppression • 1.43k • 3