llm.c checkpoint: GPT-2 774M

This is a HF/safetensors conversion of the llm.c checkpoint from a 774M-parameter GPT-2 model trained on 150B tokens of FineWeb. The weights are stored in bfloat16.
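Because the checkpoint is in standard HF/safetensors format, it should load with the stock `transformers` GPT-2 classes. A minimal usage sketch (the prompt and generation settings below are illustrative, not from the original card):

```python
# Load the converted checkpoint via Hugging Face transformers.
# Assumes the repo follows the standard GPT-2 layout, so the Auto classes
# resolve to GPT2Tokenizer / GPT2LMHeadModel.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mdouglas/llmc-gpt2-774M-150B"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Illustrative sampling run; adjust decoding parameters to taste.
inputs = tokenizer("The history of computing begins with", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```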

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.

See the discussion on the llm.c GitHub repository for more information.
