Arsh LLM

This is my first model, Arsh LLM. Arsh LLM, as I named it, is designed simply to generate human-like output. It uses a mixture of experts (MoE) architecture, so only part of the network is active for each token; this keeps resource use low and is especially useful for bigger models. Arsh LLM is not an expert in any particular domain, because its training data was much smaller than what the big models are trained on. Activated parameters: 662M.
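To make the MoE idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is only an illustration of the general technique, not Arsh LLM's actual code: the names `make_expert` and `moe_forward`, the 8 experts, the dimension of 4, and `top_k=2` are all assumptions chosen for the example.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a short list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    # Hypothetical expert: a single dim x dim linear map with random weights.
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, router, experts, top_k=2):
    # One router score per expert: dot product of its router row with x.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router]
    # Keep only the top_k highest-scoring experts; the rest are never run.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the scores of the chosen experts into mixing weights.
    gates = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


# Illustrative sizes only (assumed, not the real model configuration).
rng = random.Random(0)
dim, num_experts = 4, 8
experts = [make_expert(dim, rng) for _ in range(num_experts)]
router = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]

output, active = moe_forward([0.5, -1.0, 0.25, 2.0], router, experts, top_k=2)
print(f"active experts: {active} of {num_experts}")
```

In this sketch all 8 experts exist, but only 2 run for a given input. That gap between total and active experts is what an "activated parameters" figure describes.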

Technical information

First, we built a large tokenizer. Then, for pretraining, we used the open-source gpt3_xl, GPT-Neo, and Pile code to train the model. Post-training was completed on a private, high-quality dataset.

Everything is open source

The model weights are fully downloadable; just check out my GitHub account. You are free to use the idea, the model, and anything else you want. Arsh LLM is mostly a research project!

License

The model is licensed under MIT. You are not required to mention the name Arsh LLM if you use it in your own model!