Pico LM
Pre-training 11M-parameter models on 29k, 5M, 10M, 20M, and 205M rows of the Dolma dataset using pico-lm.
Simple benchmark tool for running predefined prompts through all checkpoints of a model.
```bash
python benchmark.py [model_name] [options]

# Benchmark all checkpoints of a model
python benchmark.py pico-decoder-tiny-dolma5M-v1

# Specify a custom output directory
python benchmark.py pico-decoder-tiny-dolma5M-v1 --output my_results/

# Use a custom prompts file
python benchmark.py pico-decoder-tiny-dolma5M-v1 --prompts my_prompts.json
```
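For orientation, here is a minimal sketch of what a checkpoint sweep like this can look like. It assumes the checkpoints are published as `step_*` branches of the model repo on the Hugging Face Hub; the repo id and generation settings are illustrative, not the script's actual internals:

```python
# Hedged sketch: run one prompt through every step_* checkpoint branch.
# Assumes checkpoints are stored as Hub branches named step_<N>.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "pico-lm/pico-decoder-tiny-dolma5M-v1"  # assumed repo id; adjust to the real namespace

# Discover checkpoint branches and sort them by training step.
refs = list_repo_refs(repo_id)
step_branches = sorted(
    (b.name for b in refs.branches if b.name.startswith("step_")),
    key=lambda name: int(name.split("_")[1]),
)

tokenizer = AutoTokenizer.from_pretrained(repo_id)
for branch in step_branches:
    # Load the model weights as of this checkpoint.
    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=branch)
    inputs = tokenizer("Hello, how are you?", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=40)
    print(branch, tokenizer.decode(output[0], skip_special_tokens=True))
```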
Prompts are stored in prompts.json as a simple array of strings:
```json
[
  "Hello, how are you?",
  "Complete this story: Once upon a time",
  "What is the capital of France?"
]
```
To add prompts, simply edit prompts.json and append new strings to the array.
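If you want to sanity-check the file before a run, a quick load in Python (purely illustrative; benchmark.py does its own loading) looks like:

```python
import json

# prompts.json should parse to a flat list of strings.
with open("prompts.json") as f:
    prompts = json.load(f)

assert isinstance(prompts, list) and all(isinstance(p, str) for p in prompts)
print(f"Loaded {len(prompts)} prompts")
```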
The script discovers and benchmarks all step_* checkpoints automatically. Results are saved as markdown files in the results/ directory:
```
results/
├── pico-decoder-tiny-dolma5M-v1_benchmark_20250101_120000.md
├── pico-decoder-tiny-dolma29k-v3_benchmark_20250101_130000.md
└── ...
```
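The filenames follow a `<model>_benchmark_<timestamp>.md` pattern. A sketch of how such a path could be generated (an assumption based on the example filenames above, not the script's exact code):

```python
from datetime import datetime
from pathlib import Path

def results_path(model_name: str, output_dir: str = "results") -> Path:
    # Timestamp format matches the example filenames: YYYYMMDD_HHMMSS.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    return out / f"{model_name}_benchmark_{stamp}.md"

print(results_path("pico-decoder-tiny-dolma5M-v1"))
```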