---
license: apache-2.0
datasets:
  - FlameF0X/DialogMLM-50K
language:
  - en
new_version: FlameF0X/SnowflakeCore-G0-Release-2.5-SFT
pipeline_tag: text-generation
library_name: transformers
---

# SnowflakeCore-G0-Release-2.5

This is the initial release of the SnowflakeCore series of language models, trained on the DialogMLM-50K dataset with an emphasis on memory-efficient training.

## Support me

You can support me via Ko-fi, or try my Vast.ai template!

## Model details

- Architecture: SnowflakeCore
- Hidden size: 768
- Number of attention heads: 12
- Number of layers: 8
- Feed-forward dimension: 1536
- Maximum sequence length: 768
- Vocabulary size: 30524
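
As a quick sanity check, these values should be visible on the loaded configuration. Here is a minimal sketch, assuming the config exposes the usual Transformers attribute names (`hidden_size`, `num_attention_heads`, and so on); a custom architecture may name them differently, so inspect `config` directly if these don't match:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")

# Attribute names below follow common Transformers conventions and are
# assumptions; print(config) to see the actual fields for this architecture.
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
print(config.num_hidden_layers)    # expected: 8
print(config.vocab_size)           # expected: 30524
```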

## Hugging Face Transformers Compatibility

This model is fully compatible with the Hugging Face Transformers library and can be loaded with:

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Load the tokenizer, configuration, and weights from the Hub
tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
model = AutoModel.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
```
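
Since the model is tagged for text generation, a typical next step is to call `generate`. A minimal sketch, with the caveat that `AutoModel` returns a bare backbone for some architectures; if `generate` is unavailable on the returned object, loading through `AutoModelForCausalLM` (where the architecture supports it) is the alternative:

```python
import torch

prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding with a small budget of new tokens; sampling settings
# (temperature, top_p) can be tuned for more varied output.
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```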

## Memory Optimization Techniques

- Mixed precision training
- Gradient accumulation (8 steps)
- Fused QKV projection (illustrated in the sketch below)
- Pre-norm architecture
- Weight tying between embedding and output layers
- Half-precision model storage
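
To make the architectural items concrete, here is a minimal PyTorch sketch of a pre-norm attention block with a fused QKV projection, using the hidden size and head count listed above. This illustrates the general technique, not SnowflakeCore's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreNormFusedAttention(nn.Module):
    """Illustrative pre-norm attention with fused QKV (not the actual SnowflakeCore code)."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 12):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.norm = nn.LayerNorm(hidden_size)                # pre-norm: normalize before attention
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)   # one fused matmul instead of three
        self.out = nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, hidden = x.shape
        h = self.norm(x)
        q, k, v = self.qkv(h).chunk(3, dim=-1)               # split the fused projection
        # reshape to (batch, heads, seq, head_dim) for scaled dot-product attention
        q, k, v = (t.view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        attn = attn.transpose(1, 2).reshape(batch, seq_len, hidden)
        return x + self.out(attn)                            # residual connection

block = PreNormFusedAttention()
print(block(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```

Fusing the three projections into a single linear layer trades three smaller matmuls for one larger one, which reduces kernel-launch overhead and intermediate allocations. Mixed precision and gradient accumulation follow the standard `torch.amp` training patterns.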

The model weights are stored in both PyTorch (.bin) and safetensors formats for improved security, loading efficiency, and compatibility.
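
Because the published weights are half precision and a safetensors copy is available, it can make sense to keep that dtype at load time. A minimal sketch, assuming a recent Transformers version (`torch_dtype` and `use_safetensors` are standard `from_pretrained` arguments):

```python
import torch
from transformers import AutoModel

# Prefer the safetensors weights and keep them in float16 to halve memory use.
model = AutoModel.from_pretrained(
    "FlameF0X/SnowflakeCore-G0-Release-2.5",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
```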