huihui-ai committed on
Commit b875760 · verified · 1 parent: 8b579d6

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -16,6 +16,7 @@ tags:
  ## Model Overview
  Huihui-MoE-1B-A0.6B is a **Mixture of Experts (MoE)** language model developed by **huihui.ai**, built upon the **[Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)** base model. It enhances the standard Transformer architecture by replacing MLP layers with MoE layers, each containing 3 experts, to achieve high performance with efficient inference. The model is designed for natural language processing tasks, including text generation, question answering, and conversational applications.
 
+ This version does not support Ollama: because tie_word_embeddings=True, no separate lm_head weights are saved in the checkpoint, so Ollama cannot load it. If Ollama support is required, please use the latest version [huihui-ai/Huihui-MoE-1.2B-A0.6B](https://huggingface.co/huihui-ai/Huihui-MoE-1.2B-A0.6B).
 
  - **Architecture**: Qwen3MoeForCausalLM model with 3 experts per layer (num_experts=3), activating 1 expert per token (num_experts_per_tok=1).
  - **Total Parameters**: ~1.1 billion (1B)
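
A minimal sketch of how the tied-embeddings note above can be checked, assuming the repo id `huihui-ai/Huihui-MoE-1B-A0.6B` (not stated in this diff) and the standard transformers `AutoConfig`/`AutoModelForCausalLM` API; it is an illustration, not part of the commit:

```python
# Sketch only: the repo id below is an assumption for this model card.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "huihui-ai/Huihui-MoE-1B-A0.6B"  # assumed repo id

# The config should report tied word embeddings and the MoE settings
# described above (3 experts per layer, 1 active per token).
config = AutoConfig.from_pretrained(model_id)
print(config.tie_word_embeddings)   # expected: True
print(config.num_experts)           # expected: 3
print(config.num_experts_per_tok)   # expected: 1

# With tied embeddings, the checkpoint stores no separate lm_head tensor;
# transformers rebuilds lm_head by sharing the embedding weights, so loading
# and generation still work here even though, per the note above, Ollama
# cannot use this checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_id)
shared = model.lm_head.weight.data_ptr() == model.get_input_embeddings().weight.data_ptr()
print(shared)                       # expected: True
```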