SnowflakeCore-G0-Release-2.5

This is the initial release in the SnowflakeCore series of language models, trained on the DialogMLM-50K dataset with a focus on memory-efficient training.

SUPPORT ME

You can support me via Ko-fi, or try my Vast.ai template!

Model details

  • Architecture: SnowflakeCore
  • Hidden size: 768
  • Number of attention heads: 12
  • Number of layers: 8
  • Feed-forward dimension: 1536
  • Maximum sequence length: 768
  • Vocabulary size: 30524
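As a sanity check, the hyperparameters above roughly account for the reported model size. The following is a back-of-the-envelope estimate only, assuming a standard transformer parameterization (learned position embeddings, fused QKV projection, tied input/output embeddings, biases on the linear layers); the actual code may differ in small details.

```python
# Rough parameter count from the hyperparameters in the model card.
hidden = 768
layers = 8
ffn = 1536
max_len = 768
vocab = 30524

token_emb = vocab * hidden                # tied with the output layer
pos_emb = max_len * hidden                # assumed learned positions

per_layer = (
    hidden * 3 * hidden + 3 * hidden      # fused QKV projection (+ bias)
    + hidden * hidden + hidden            # attention output projection
    + hidden * ffn + ffn                  # feed-forward up projection
    + ffn * hidden + hidden               # feed-forward down projection
    + 2 * 2 * hidden                      # two pre-norm LayerNorms
)

total = token_emb + pos_emb + layers * per_layer
print(f"{total / 1e6:.1f}M parameters")   # lands close to the reported 61.9M
```

The estimate lands within about 0.1M of the reported 61.9M parameters, consistent with the weight tying noted below (the output layer adds no parameters of its own).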

HuggingFace Transformers Compatibility

This model is fully compatible with the HuggingFace Transformers library. You can load it using:

from transformers import AutoConfig, AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
model = AutoModel.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")

Memory Optimization Techniques

  • Mixed precision training
  • Gradient accumulation (8 steps)
  • Fused QKV projection
  • Pre-norm architecture
  • Weight tying between embedding and output layers
  • Half-precision model storage
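To illustrate one of these techniques, a fused QKV projection computes the query, key, and value projections with a single matrix multiply instead of three, reducing kernel launches and improving memory locality. The sketch below uses NumPy and the card's dimensions (768 hidden, 12 heads); it is a minimal illustration, not the model's actual implementation.

```python
# Fused QKV sketch: one matmul produces Q, K and V together, then the
# result is split -- instead of three separate projections.
import numpy as np

hidden, heads = 768, 12
head_dim = hidden // heads        # 64
seq_len = 4                       # illustrative sequence length

rng = np.random.default_rng(0)
x = rng.standard_normal((seq_len, hidden))
w_qkv = rng.standard_normal((hidden, 3 * hidden)) * 0.02  # fused weight

qkv = x @ w_qkv                   # single projection: (seq_len, 3 * hidden)
q, k, v = np.split(qkv, 3, axis=-1)

# Reshape to (heads, seq_len, head_dim) for per-head attention.
q_heads = q.reshape(seq_len, heads, head_dim).transpose(1, 0, 2)
print(q_heads.shape)              # (12, 4, 64)
```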

The model weights are stored in both PyTorch (.bin) and safetensors formats: safetensors offers improved security and loading efficiency, while the .bin file preserves compatibility with older tooling.

Model size

  • 61.9M parameters
  • Tensor type: F16
