# MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
**Paper or resources for more information:**
[[Paper](https://huggingface.co/papers/2410.07348)] [[Code](https://github.com/SkyworkAI/MoE-plus-plus)]
## ⚡ Overview
We introduce three types of zero-computation experts: the zero expert, copy expert, and constant expert, which correspond to discard, skip, and replace operations, respectively. Moreover, we leverage gating residuals, enabling each token to consider the pathway taken in the previous layer when selecting the appropriate experts.
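To make the three zero-computation experts and the gating residual concrete, here is a minimal PyTorch sketch. The class names and interfaces below are illustrative assumptions for exposition, not the repository's actual implementation; see the [Code] link above for the official release.

```python
import torch
import torch.nn as nn

class ZeroExpert(nn.Module):
    """Discard: the token contributes nothing (all-zero output)."""
    def forward(self, x):
        return torch.zeros_like(x)

class CopyExpert(nn.Module):
    """Skip: the token passes through unchanged."""
    def forward(self, x):
        return x

class ConstantExpert(nn.Module):
    """Replace: the token is substituted with a learned constant vector."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.const = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, x):
        # Broadcast the constant vector to the shape of the input tokens.
        return self.const.expand_as(x)

class GatingWithResidual(nn.Module):
    """Router whose logits add a residual from the previous layer's routing
    scores, so each token's expert choice can depend on the pathway it took
    in the previous layer."""
    def __init__(self, hidden_dim, num_experts):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)

    def forward(self, x, prev_logits=None):
        logits = self.gate(x)              # (batch, seq, num_experts)
        if prev_logits is not None:
            logits = logits + prev_logits  # gating residual from previous layer
        return logits
```

Because the zero, copy, and constant experts involve little or no computation, routing a share of tokens to them reduces the average per-token cost relative to a standard MoE layer in which every selected expert is a full FFN.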