QuantFactory/llama-161M-100B-GGUF

This is a quantized version of abacaj/llama-161M-100B, created using llama.cpp.

Model Description

Trained on 100B tokens.

  • Learning rate (LR): 1e-3
  • Weight decay (wd): 0.1
  • WSD scheduler with 10% decay
  • Data mix: 80% code, 10% natural language (NL), 10% instruction data
  • Dataset decontaminated against popular benchmarks, following bigcode
  • 8x 3090s, ~110 hours
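
The WSD (warmup-stable-decay) schedule named above can be sketched as follows. This is a minimal illustration, not the training code used for this model: only the 1e-3 peak LR and the 10% decay fraction come from the list above. The warmup length, the linear shape of the decay, the final LR of 0, and the function name `wsd_lr` are all assumptions made for the example.

```python
def wsd_lr(step, total_steps, peak_lr=1e-3, warmup_steps=1000,
           decay_frac=0.10, min_lr=0.0):
    """Learning rate at `step` under a warmup-stable-decay schedule.

    peak_lr and decay_frac follow the model card; warmup_steps, min_lr,
    and the linear decay shape are assumptions for illustration only.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")

    # The last `decay_frac` of training is the decay phase.
    decay_steps = round(total_steps * decay_frac)
    decay_start = total_steps - decay_steps

    if step < warmup_steps:
        # Warmup: ramp linearly from near zero up to the peak LR.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable: hold the peak LR constant.
        return peak_lr

    # Decay: move linearly from peak_lr down to min_lr.
    progress = min((step - decay_start) / max(1, decay_steps), 1.0)
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    total = 10_000  # hypothetical step count, chosen only for the demo
    for s in (0, 999, 5_000, 9_000, 9_500, 10_000):
        print(s, wsd_lr(s, total))
```

With these assumed settings the LR stays at its peak for most of the run and only drops during the final 10% of steps. A long constant phase is what allows a WSD run to be extended or branched before the decay begins.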

This is a base pretrained model and requires further fine-tuning to be useful.

Model Details

Benchmark                           Score
openai/openai_humaneval (greedy)    9.2%
mbpp (greedy)                       9.8%

  • Format: GGUF
  • Model size: 162M params
  • Architecture: llama
  • Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
