Lucy-128k-dq68-mlx

175 tok/sec on an M4 Mac

Performance evaluation:

benchmark      acc   acc_norm  stderr
arc_challenge  0.33  0.36      0.014
arc_easy       0.45  0.38      0.009
boolq          0.62  0.62      0.008
hellaswag      0.44  0.52      0.004
openbookqa     0.21  0.38      0.021
piqa           0.70  0.69      0.010
winogrande     0.55  0.55      0.013

Performance evaluation of the source model (Menlo/Lucy-128k):

benchmark      acc   acc_norm  stderr
arc_challenge  0.34  0.35      0.013
arc_easy       0.46  0.39      0.010
boolq          0.62  0.62      0.008
hellaswag      0.44  0.53      0.004
openbookqa     0.23  0.39      0.021
piqa           0.70  0.69      0.010
winogrande     0.56  0.55      0.013
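
These are lm-evaluation-harness-style metrics (acc, acc_norm, and the standard error of acc). A minimal sketch of how such numbers can be reproduced with mlx-lm's bundled evaluation entry point; the entry-point flags are an assumption here, so verify with mlx_lm.evaluate --help:

# a sketch, assuming mlx-lm's lm-eval wrapper is installed and these task names match
mlx_lm.evaluate --model nightmedia/Lucy-128k-dq68-mlx \
    --tasks arc_challenge arc_easy boolq hellaswag openbookqa piqa winogrande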

This model, Lucy-128k-dq68-mlx, was converted to MLX format from Menlo/Lucy-128k using mlx-lm version 0.26.0.
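
The conversion itself is typically a single mlx-lm command. A minimal sketch of the general flow; the dq68 suffix suggests a mixed 6-/8-bit quantization recipe, so the exact quantization flags used for this repo are an assumption:

# general conversion flow with quantization enabled, not the exact dq68 recipe
mlx_lm.convert --hf-path Menlo/Lucy-128k --mlx-path Lucy-128k-dq68-mlx -q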

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Load the quantized model and its tokenizer from the Hugging Face Hub
model, tokenizer = load("nightmedia/Lucy-128k-dq68-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
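
For incremental output, mlx-lm also exposes a streaming API. A minimal sketch, assuming a recent mlx-lm (0.26.x) where stream_generate yields chunks carrying a .text field; max_tokens=256 is an arbitrary choice:

from mlx_lm import load, stream_generate

model, tokenizer = load("nightmedia/Lucy-128k-dq68-mlx")

# Print tokens as they arrive instead of waiting for the full completion
for chunk in stream_generate(model, tokenizer, prompt="hello", max_tokens=256):
    print(chunk.text, end="", flush=True)

Note that generate(..., verbose=True) prints generation throughput, which is where a figure like the 175 tok/sec above is typically read off.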
Safetensors: 431M params, tensor types BF16 · U32

Model tree for nightmedia/Lucy-128k-dq68-mlx: Qwen/Qwen3-1.7B → finetuned as Menlo/Lucy-128k → quantized as this model (one of 20 quantizations).
