SnowflakeCore-G0-Release-2.5

This is the initial release in the SnowflakeCore series of language models, trained on the DialogMLM-50K dataset with a focus on memory-efficient training.

SUPPORT ME

You can support me via Ko-fi, or try my Vast.ai template!

Model details

  • Architecture: SnowflakeCore
  • Hidden size: 768
  • Number of attention heads: 12
  • Number of layers: 8
  • Feed-forward dimension: 1536
  • Maximum sequence length: 768
  • Vocabulary size: 30524
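As a sanity check, the hyperparameters above roughly account for the reported model size. The following is a back-of-the-envelope estimate only, assuming a standard transformer parameterization (learned position embeddings, fused QKV projection, tied input/output embeddings, biases on the linear layers); the actual code may differ in small details.

```python
# Rough parameter count from the hyperparameters in the model card.
hidden = 768
layers = 8
ffn = 1536
max_len = 768
vocab = 30524

token_emb = vocab * hidden                # tied with the output layer
pos_emb = max_len * hidden                # assumed learned positions

per_layer = (
    hidden * 3 * hidden + 3 * hidden      # fused QKV projection (+ bias)
    + hidden * hidden + hidden            # attention output projection
    + hidden * ffn + ffn                  # feed-forward up projection
    + ffn * hidden + hidden               # feed-forward down projection
    + 2 * 2 * hidden                      # two pre-norm LayerNorms
)

total = token_emb + pos_emb + layers * per_layer
print(f"{total / 1e6:.1f}M parameters")   # lands close to the reported 61.9M
```

The estimate lands within about 0.1M of the reported 61.9M parameters, consistent with the weight tying noted below (the output layer adds no parameters of its own).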

HuggingFace Transformers Compatibility

This model is fully compatible with the HuggingFace Transformers library. You can load it using:

from transformers import AutoConfig, AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
model = AutoModel.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")

Memory Optimization Techniques

  • Mixed precision training
  • Gradient accumulation (8 steps)
  • Fused QKV projection
  • Pre-norm architecture
  • Weight tying between embedding and output layers
  • Half-precision model storage
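To illustrate one of these techniques, a fused QKV projection computes the query, key, and value projections with a single matrix multiply instead of three, reducing kernel launches and improving memory locality. The sketch below uses NumPy and the card's dimensions (768 hidden, 12 heads); it is a minimal illustration, not the model's actual implementation.

```python
# Fused QKV sketch: one matmul produces Q, K and V together, then the
# result is split -- instead of three separate projections.
import numpy as np

hidden, heads = 768, 12
head_dim = hidden // heads        # 64
seq_len = 4                       # illustrative sequence length

rng = np.random.default_rng(0)
x = rng.standard_normal((seq_len, hidden))
w_qkv = rng.standard_normal((hidden, 3 * hidden)) * 0.02  # fused weight

qkv = x @ w_qkv                   # single projection: (seq_len, 3 * hidden)
q, k, v = np.split(qkv, 3, axis=-1)

# Reshape to (heads, seq_len, head_dim) for per-head attention.
q_heads = q.reshape(seq_len, heads, head_dim).transpose(1, 0, 2)
print(q_heads.shape)              # (12, 4, 64)
```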

The model weights are stored in both PyTorch (.bin) and safetensors formats: safetensors offers improved security and loading efficiency, while the .bin file preserves compatibility with older tooling.

Model size

  • 61.9M parameters
  • Tensor type: F16
