# QuantFactory/llama-161M-100B-GGUF
This is a quantized version of abacaj/llama-161M-100B, created using llama.cpp.
## Model Description
Trained on 100B tokens.
- Learning rate: 1e-3
- Weight decay: 0.1
- WSD scheduler with a 10% decay phase
- Data mix: 80% code, 10% natural language, 10% instruction data
- Dataset decontaminated against popular benchmarks, following bigcode
- Trained on 8× RTX 3090s for ~110 hours
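The WSD (warmup-stable-decay) schedule listed above can be sketched as a simple step-to-LR function. This is a minimal sketch, not the training code: the card only specifies the 1e-3 peak learning rate and a 10% decay phase, so the warmup fraction and the linear shape of warmup/decay are assumptions, and `wsd_lr` is a hypothetical name.

```python
def wsd_lr(step, total_steps, peak_lr=1e-3, warmup_frac=0.01, decay_frac=0.10):
    """Warmup-Stable-Decay learning-rate schedule (sketch).

    Linear warmup to peak_lr, constant plateau, then linear decay to 0
    over the final decay_frac of training (10% here, per the model card).
    warmup_frac is an assumption; the card does not specify it.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1 - decay_frac))
    if step < warmup_steps:
        # linear warmup
        return peak_lr * step / max(warmup_steps, 1)
    if step < decay_start:
        # stable plateau at the peak learning rate
        return peak_lr
    # linear decay to zero over the final decay_frac of steps
    return peak_lr * (total_steps - step) / max(total_steps - decay_start, 1)
```

For example, with 1000 total steps the LR sits at 1e-3 for steps 10–899, then falls linearly to 0 over the last 100 steps.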
This is a base pretrained model and requires further fine-tuning to be useful.
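The benchmark decontamination mentioned above can be illustrated with a minimal n-gram overlap check. This is a simplified sketch, not the actual bigcode pipeline (which has its own matching rules and thresholds); the function names and the choice of 10-gram token overlap are assumptions for illustration.

```python
def ngrams(text, n=10):
    """Set of n-grams of whitespace tokens in text (sketch tokenizer)."""
    toks = text.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(doc, benchmark_texts, n=10):
    """True if doc shares any token n-gram with any benchmark text."""
    bench = set().union(*(ngrams(t, n) for t in benchmark_texts))
    return bool(ngrams(doc, n) & bench)
```

A training document that reproduces a long enough span of a benchmark prompt or solution would be flagged and dropped before pretraining.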
## Model Details
| openai/openai_humaneval (greedy) | mbpp (greedy) |
|---|---|
| 9.2% | 9.8% |