---
license: apache-2.0
datasets:
- FlameF0X/DialogMLM-50K
language:
- en
new_version: FlameF0X/SnowflakeCore-G0-Release-2.5-SFT
pipeline_tag: text-generation
library_name: transformers
---
# SnowflakeCore-G0-Release-2.5
This is the initial release of the SnowflakeCore series of language models, trained on the DialogMLM-50K dataset with optimized memory usage.
## SUPPORT ME
You can support me via Ko-fi, or try my Vast.ai template!
## Model details
- Architecture: SnowflakeCore
- Hidden size: 768
- Number of attention heads: 12
- Number of layers: 8
- Feed-forward dimension: 1536
- Maximum sequence length: 768
- Vocabulary size: 30524
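
For a rough sense of scale, here is a back-of-the-envelope parameter estimate derived from the numbers above. It is a minimal sketch that assumes a standard GPT-style decoder block (fused QKV, two-matrix feed-forward, learned positional embeddings, tied input/output embeddings, biases and layer norms ignored); the actual SnowflakeCore layout may differ in the details.

```python
# Rough parameter estimate from the hyperparameters listed above.
# Assumes a standard GPT-style decoder block; the real architecture may differ.
hidden = 768
layers = 8
ffn = 1536
vocab = 30524
max_seq = 768

embedding = vocab * hidden                      # token embeddings (tied with output head)
positions = max_seq * hidden                    # learned positional embeddings (assumption)
attn = hidden * 3 * hidden + hidden * hidden    # fused QKV + output projection
ffwd = 2 * hidden * ffn                         # up- and down-projection matrices
per_layer = attn + ffwd                         # biases and layer norms ignored

total = embedding + positions + layers * per_layer
print(f"~{total / 1e6:.0f}M parameters")        # roughly 62M under these assumptions
```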
## HuggingFace Transformers Compatibility
This model is fully compatible with the HuggingFace Transformers library. You can load it using:
```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
model = AutoModel.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
```
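
A quick smoke test, continuing from the snippet above, might look like the sketch below. It only runs a forward pass with the base model; the output attribute name (`last_hidden_state`) is an assumption and may differ for a custom architecture.

```python
import torch

prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")

# Forward pass with the model loaded above (no gradients needed for inspection).
with torch.no_grad():
    outputs = model(**inputs)

# Standard HF models expose final hidden states as `last_hidden_state`;
# fall back to printing the raw output object if the attribute differs.
hidden = getattr(outputs, "last_hidden_state", None)
print(hidden.shape if hidden is not None else outputs)
```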
## Memory Optimization Techniques
- Mixed precision training
- Gradient accumulation (8 steps)
- Fused QKV projection
- Pre-norm architecture
- Weight tying between embedding and output layers
- Half-precision model storage
The model weights are stored in both PyTorch (.bin) and safetensors formats for improved security, loading efficiency, and compatibility. A short sketch of the mixed-precision and gradient-accumulation setup follows.
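
This is a minimal PyTorch sketch of the first two techniques listed above (mixed precision and 8-step gradient accumulation), not the actual training code: the model, optimizer, and data here are toy placeholders, and a CUDA device is assumed.

```python
import torch
from torch import nn

# Toy stand-in model and data; the real SnowflakeCore model and a DialogMLM-50K
# dataloader would go here. Requires a CUDA device for autocast/GradScaler.
model = nn.Linear(16, 16).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
data = [torch.randn(4, 16).cuda() for _ in range(32)]

accum_steps = 8                         # gradient accumulation, as listed above
scaler = torch.cuda.amp.GradScaler()    # loss scaling for mixed-precision training

optimizer.zero_grad()
for step, batch in enumerate(data):
    with torch.cuda.amp.autocast():     # run the forward pass in reduced precision
        loss = model(batch).pow(2).mean() / accum_steps  # scale loss by accumulation factor
    scaler.scale(loss).backward()       # accumulate scaled gradients
    if (step + 1) % accum_steps == 0:   # optimizer step every 8 micro-batches
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
```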