
Quasar-40B

Quasar-40B is a high-performance language model built on a renamed and enhanced version of Kwai-Klear's Klear-46B-A2.5B-Base. It uses a Mixture of Experts (MoE) Transformer architecture pretrained on 22 trillion tokens. Quasar-40B introduces several key changes: more experts are activated per token, a Positional Memory Bank is added, and the caching mechanism is removed to cut memory overhead.

About Quasar-40B

Quasar-40B is designed for state-of-the-art natural language processing with enhanced efficiency and scalability. Building on the Klear-46B-A2.5B-Base model, it addresses limitations and introduces advanced features:

  • Increased Experts per Token: The MoE router activates more experts for each token, improving specialization and overall quality (a minimal routing sketch follows this list).
  • Positional Memory Bank: Improves context retention on long-sequence tasks, helping coherence and accuracy.
  • Removed Caching: Eliminates caching to reduce memory overhead and speed up inference.
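
To illustrate what "more experts per token" means in practice, here is a minimal top-k MoE routing sketch in PyTorch. It shows the general technique only; the expert count, hidden sizes, and k value below are placeholders, not Quasar-40B's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k Mixture-of-Experts layer (illustrative only)."""

    def __init__(self, d_model=512, d_ff=1024, n_experts=32, k=4):
        super().__init__()
        self.k = k  # experts activated per token; raising k means "more experts per token"
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(x)                      # (batch, seq, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)   # pick the k best experts per token
        weights = F.softmax(weights, dim=-1)         # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):                   # naive loop; real kernels batch this
            for e in range(len(self.experts)):
                mask = idx[..., slot] == e           # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * self.experts[e](x[mask])
        return out

# Increasing k routes each token through more experts, at extra compute cost.
layer = TopKMoE(n_experts=32, k=4)
y = layer(torch.randn(2, 16, 512))
print(y.shape)  # torch.Size([2, 16, 512])
```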

These enhancements make Quasar-40B ideal for research, production deployment, and fine-tuning for specialized tasks.

Model Details

  • Base Model: Kwai-Klear/Klear-46B-A2.5B-Base
  • Architecture: Mixture of Experts (MoE) Transformer
  • Model Type: Quasar
  • Pretraining Data: 22 trillion tokens from diverse text sources
  • Key Enhancements:
    • Increased number of experts per token
    • Positional Memory Bank for improved context retention
    • Removed caching for faster inference and lower memory usage
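
Usage

A minimal loading sketch using the Hugging Face transformers library. The repository id below is a placeholder (this card does not state one), and the trust_remote_code and use_cache=False arguments are assumptions based on the custom MoE architecture and the removed caching described above, not confirmed requirements.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/Quasar-40B"  # placeholder: substitute the actual repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,   # checkpoint tensors are BF16/F32
    device_map="auto",
    trust_remote_code=True,       # assumption: custom MoE architecture may need remote code
)

prompt = "Explain mixture-of-experts routing in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    use_cache=False,  # assumption: the card states caching was removed
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```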

Vision-language (VL) support is coming soon.

Model size: 46B parameters (Safetensors checkpoint; BF16 and F32 tensors)