Tri-7B-Base

Introduction

We present Tri-7B-Base, a foundation language model that serves as the pre-trained base for our Tri-7B model family. It was pre-trained with an emphasis on computational efficiency and is intended as a strong starting point for downstream fine-tuning and adaptation.

Key Features

  • Foundation Architecture: Transformer decoder with RoPE, SwiGLU, and RMSNorm (see Model Specifications below)
  • Multilingual Foundation: Pre-trained on diverse Korean, English, and Japanese data
  • Efficient Training: Training methodology optimized for computational efficiency

Model Specifications

Tri-7B-Base

  • Type: Causal Language Model
  • Training Stage: Pre-training
  • Architecture: Transformer Decoder with RoPE, SwiGLU, RMSNorm
  • Number of Parameters: 7.76B
  • Number of Layers: 32
  • Number of Attention Heads: 32
  • Context Length: 4,096
  • Vocab Size: 128,128
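
The specifications above describe a standard causal language model checkpoint. The following is a minimal loading sketch, assuming the weights are published on the Hugging Face Hub as trillionlabs/Tri-7B-Base and load through the standard transformers auto classes; the expected config values come from the list above.

```python
# Minimal loading sketch. Assumptions: the checkpoint is hosted on the
# Hugging Face Hub as "trillionlabs/Tri-7B-Base" and is compatible with
# the standard transformers auto classes (device_map="auto" also
# requires the accelerate package).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trillionlabs/Tri-7B-Base"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 7B-class weights are a good fit for BF16
    device_map="auto",           # place layers across available devices
)

# Sanity-check the loaded config against the published specifications.
print(model.config.num_hidden_layers)    # expected: 32
print(model.config.num_attention_heads)  # expected: 32
print(model.config.vocab_size)           # expected: 128128
```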

Use Cases

As a base model, Tri-7B-Base is designed to serve as a foundation for various downstream applications:

  • Fine-tuning: Adapt to specific domains or tasks
  • Instruction Tuning: Create chat or assistant models
  • Domain Specialization: Customize for specific industries or use cases
  • Research: Explore model behaviors and capabilities
  • Language Generation: General text completion and generation tasks (see the completion sketch after this list)
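
For plain text completion, the sketch below uses the transformers pipeline API. The Hub id and the sampling settings are illustrative assumptions, not settings recommended by the model authors.

```python
# Text-completion sketch. Assumptions: the checkpoint is hosted on the
# Hugging Face Hub as "trillionlabs/Tri-7B-Base"; the sampling
# parameters are illustrative defaults, not author-recommended values.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="trillionlabs/Tri-7B-Base",  # assumed Hub id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# As a base model, Tri-7B-Base continues text rather than following
# instructions, so prompt with a passage to complete.
prompt = "The three primary colors are"
result = generator(prompt, max_new_tokens=64, do_sample=True,
                   temperature=0.7, top_p=0.9)
print(result[0]["generated_text"])
```

Because it is not instruction-tuned, question- or command-style prompts will often be continued rather than answered; for chat or assistant behavior, fine-tune first or use an instruction-tuned variant.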

Limitations

  • Base Model Nature: This is a pre-trained base model without instruction tuning or alignment. For chat or assistant capabilities, consider fine-tuned variants.
  • Language Support: The model is optimized for English, Korean, and Japanese. Usage with other languages may result in degraded performance.
  • Knowledge Cutoff: The model's information is limited to data available up to February 2025.
  • Generation Quality: As a base model, outputs may require post-processing or filtering for production use cases.

License

This model is licensed under the Apache License 2.0.

Contact

For inquiries, please contact: [email protected]
