---
license: apache-2.0
datasets:
  - FlameF0X/DialogMLM-50K
language:
  - en
new_version: FlameF0X/SnowflakeCore-G0-Release-2.5-SFT
pipeline_tag: text-generation
library_name: transformers
---

# SnowflakeCore-G0-Release-2.5

This is the initial release of the SnowflakeCore series of language models, trained on the DialogMLM-50K dataset with an emphasis on memory-efficient training.

## Support me

You can support me via Ko-fi, or try my Vast.ai template!

## Model details

- Architecture: SnowflakeCore
- Hidden size: 768
- Number of attention heads: 12
- Number of layers: 8
- Feed-forward dimension: 1536
- Maximum sequence length: 768
- Vocabulary size: 30524
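
As a quick sanity check, these values should be visible on the loaded configuration. Here is a minimal sketch, assuming the config exposes the usual Transformers attribute names (`hidden_size`, `num_attention_heads`, and so on); a custom architecture may name them differently, so inspect `config` directly if these don't match:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")

# Attribute names below follow common Transformers conventions and are
# assumptions; print(config) to see the actual fields for this architecture.
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12
print(config.num_hidden_layers)    # expected: 8
print(config.vocab_size)           # expected: 30524
```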

## Hugging Face Transformers Compatibility

This model is fully compatible with the Hugging Face Transformers library and can be loaded with:

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Load the tokenizer, configuration, and weights from the Hub
tokenizer = AutoTokenizer.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
config = AutoConfig.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
model = AutoModel.from_pretrained("FlameF0X/SnowflakeCore-G0-Release-2.5")
```
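
Since the model is tagged for text generation, a typical next step is to call `generate`. A minimal sketch, with the caveat that `AutoModel` returns a bare backbone for some architectures; if `generate` is unavailable on the returned object, loading through `AutoModelForCausalLM` (where the architecture supports it) is the alternative:

```python
import torch

prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding with a small budget of new tokens; sampling settings
# (temperature, top_p) can be tuned for more varied output.
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```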

## Memory Optimization Techniques

- Mixed precision training
- Gradient accumulation (8 steps)
- Fused QKV projection (illustrated in the sketch below)
- Pre-norm architecture
- Weight tying between embedding and output layers
- Half-precision model storage
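
To make the architectural items concrete, here is a minimal PyTorch sketch of a pre-norm attention block with a fused QKV projection, using the hidden size and head count listed above. This illustrates the general technique, not SnowflakeCore's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreNormFusedAttention(nn.Module):
    """Illustrative pre-norm attention with fused QKV (not the actual SnowflakeCore code)."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 12):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.norm = nn.LayerNorm(hidden_size)                # pre-norm: normalize before attention
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)   # one fused matmul instead of three
        self.out = nn.Linear(hidden_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, hidden = x.shape
        h = self.norm(x)
        q, k, v = self.qkv(h).chunk(3, dim=-1)               # split the fused projection
        # reshape to (batch, heads, seq, head_dim) for scaled dot-product attention
        q, k, v = (t.view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        attn = attn.transpose(1, 2).reshape(batch, seq_len, hidden)
        return x + self.out(attn)                            # residual connection

block = PreNormFusedAttention()
print(block(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```

Fusing the three projections into a single linear layer trades three smaller matmuls for one larger one, which reduces kernel-launch overhead and intermediate allocations. Mixed precision and gradient accumulation follow the standard `torch.amp` training patterns.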

The model weights are stored in both PyTorch (.bin) and safetensors formats for improved security, loading efficiency, and compatibility.
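
Because the published weights are half precision and a safetensors copy is available, it can make sense to keep that dtype at load time. A minimal sketch, assuming a recent Transformers version (`torch_dtype` and `use_safetensors` are standard `from_pretrained` arguments):

```python
import torch
from transformers import AutoModel

# Prefer the safetensors weights and keep them in float16 to halve memory use.
model = AutoModel.from_pretrained(
    "FlameF0X/SnowflakeCore-G0-Release-2.5",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
```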