Model Card for Model ID

Llama-3.2-1B-bnb-4bit fine-tuned on the OpenO1-SFT dataset to provide thoughtful reasoning while remaining fast.

Model Details

Model Description

This model is a fine-tune of Llama-3.2-1B-bnb-4bit on the OpenO1-SFT dataset, intended to produce thoughtful reasoning while remaining fast. Training ran for 100 steps (max_steps=100, with num_train_epochs=1 also set) using rank-stabilized LoRA (rsLoRA).

Developed by: CYFARE ( https://cyfare.net/ | https://github.com/cyfare/ )
Model type: Large Language Model (LLM)
Finetuned from model: Llama-3.2-1B-bnb-4bit

Training Details

Training Settings

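    from unsloth import FastLanguageModel

    # LoRA adapter settings. The enclosing call is assumed from the keyword
    # names, which match Unsloth's FastLanguageModel.get_peft_model.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,  # LoRA rank
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        lora_alpha=16,
        lora_dropout=0,
        bias="none",
        use_gradient_checkpointing="unsloth",
        random_state=3407,
        use_rslora=True,  # rank-stabilized LoRA scaling
        loftq_config=None,
    )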

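With use_rslora=True, the low-rank update is scaled by lora_alpha / sqrt(r) rather than the standard lora_alpha / r. The sketch below shows what that means for r=16 and lora_alpha=16 as listed in the settings. The helper lora_scaling is hypothetical and written only for illustration; it is not part of any library.

```python
import math


def lora_scaling(lora_alpha: float, r: int, use_rslora: bool) -> float:
    """Return the multiplier applied to the low-rank update B @ A.

    Hypothetical helper for illustration only.
    """
    if use_rslora:
        # Rank-stabilized LoRA: divide by the square root of the rank.
        return lora_alpha / math.sqrt(r)
    # Standard LoRA: divide by the rank itself.
    return lora_alpha / r


# Values taken from the training settings in this card.
standard = lora_scaling(lora_alpha=16, r=16, use_rslora=False)
rank_stabilized = lora_scaling(lora_alpha=16, r=16, use_rslora=True)

print(f"standard LoRA scaling: {standard}")
print(f"rsLoRA scaling:        {rank_stabilized}")
```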
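    from transformers import DataCollatorForSeq2Seq, TrainingArguments
    from trl import SFTTrainer
    from unsloth import is_bfloat16_supported

    # Trainer settings. The enclosing SFTTrainer call is assumed from the
    # keyword names (dataset_text_field, packing).
    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=max_seq_length,
        data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer),
        dataset_num_proc=2,
        packing=False,
        args=TrainingArguments(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,  # 2 x 4 = 8 sequences per optimizer step per device
            warmup_steps=10,
            num_train_epochs=1,
            max_steps=100,  # a positive max_steps takes precedence over num_train_epochs
            learning_rate=2e-4,
            fp16=not is_bfloat16_supported(),  # fall back to fp16 when bf16 is unavailable
            bf16=is_bfloat16_supported(),
            logging_steps=1,
            optim="adamw_8bit",
            weight_decay=0.01,
            lr_scheduler_type="linear",
            seed=3407,
            output_dir="outputs",
            report_to="none",
        ),
    )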

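The settings above imply an effective batch size and a learning-rate curve that are easy to work out by hand. The sketch below does that arithmetic in plain Python. It assumes a single training device (the card does not state the device count), and linear_lr is a hypothetical helper that follows the usual linear warmup and linear decay rule rather than calling the trainer's scheduler.

```python
# Values copied from the training settings in this card.
PER_DEVICE_BATCH = 2
GRAD_ACCUM = 4
MAX_STEPS = 100
WARMUP_STEPS = 10
PEAK_LR = 2e-4

# Assumption: one device. The card does not say how many were used.
NUM_DEVICES = 1

# Sequences contributing to each optimizer step, and over the whole run
# (packing=False, so one training example is one sequence).
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
sequences_seen = effective_batch * MAX_STEPS


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Linear warmup from 0 to PEAK_LR over WARMUP_STEPS, then linear decay
    to 0 at MAX_STEPS.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, MAX_STEPS - step)
    return PEAK_LR * remaining / (MAX_STEPS - WARMUP_STEPS)


print(f"effective batch size: {effective_batch}")
print(f"sequences seen in {MAX_STEPS} steps: {sequences_seen}")
for step in (0, 5, 10, 55, 100):
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```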
Model Card Authors

CYFARE ( https://cyfare.net/ | https://github.com/cyfare/ )

Model Card Contact