# LoliCore 1B
This is a very small MoE (Mixture of Experts) model that I will use to experiment with different MLP settings. In this repo in particular, I used a Jump module (one that passes the hidden state directly to the next layer) to test whether it works in an MoE.
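
The card doesn't spell out how the Jump module is wired into the MoE block, so below is a minimal PyTorch sketch of one plausible reading: the Jump path is an identity expert that the router can select, letting a token skip the MLP for that layer. The class names (`JumpExpert`, `MoEWithJump`), the expert count, and the top-1 routing are all illustrative assumptions, not the model's actual code.

```python
import torch
import torch.nn as nn


class JumpExpert(nn.Module):
    """Hypothetical 'Jump' expert: returns the hidden state unchanged,
    so a token routed here effectively skips this layer's MLP."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x


class MoEWithJump(nn.Module):
    """Toy MoE MLP where one routing choice is the Jump (identity) path."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 4):
        super().__init__()
        # Regular feed-forward experts, plus one Jump expert appended at the end.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(d_model, d_ff),
                    nn.GELU(),
                    nn.Linear(d_ff, d_model),
                )
                for _ in range(n_experts)
            ]
            + [JumpExpert()]
        )
        # Router scores every expert, including the Jump path.
        self.router = nn.Linear(d_model, n_experts + 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); route each token to its top-1 expert.
        logits = self.router(x)                # (B, S, n_experts + 1)
        weights = logits.softmax(dim=-1)
        top_w, top_idx = weights.max(dim=-1)   # top-1 routing per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                # tokens routed to expert i
            if mask.any():
                out[mask] = top_w[mask].unsqueeze(-1) * expert(x[mask])
        return out


# Quick shape check of the sketch.
layer = MoEWithJump(d_model=256, d_ff=1024)
y = layer(torch.randn(2, 16, 256))  # -> (2, 16, 256)
```

Under this reading, the Jump expert acts like a learned per-token layer skip: the router decides when a token's hidden state should pass through untouched instead of going through a feed-forward expert.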