---
base_model: unsloth/Llama-3.2-1B-bnb-4bit
library_name: peft
---
# Model Card for OpenO1_LLAMA-3.2-1B
<!-- Provide a quick summary of what the model is/does. -->
Llama-3.2-1B-bnb-4bit fine-tuned on the OpenO1-SFT dataset to give thoughtful, reasoned answers while staying fast.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
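This is a LoRA adapter (PEFT) for `unsloth/Llama-3.2-1B-bnb-4bit`, fine-tuned on the OpenO1-SFT dataset to give thoughtful, reasoned answers while staying fast. Training ran for 100 steps (`max_steps=100`, with `num_train_epochs=1` also set) using rank-stabilized LoRA (rsLoRA).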
- **Developed by:** CYFARE ( https://cyfare.net/ | https://github.com/cyfare/ )
- **Model type:** Large Language Model (LLM)
- **Finetuned from model:** Llama 3.2 1B, 4-bit (`unsloth/Llama-3.2-1B-bnb-4bit`)
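
Because this repository holds a PEFT adapter rather than full model weights, it has to be loaded on top of the base model. The following is a minimal sketch, not a tested recipe: the adapter repo id `CYFARE/OpenO1_LLAMA-3.2-1B` is an assumption, the helper names `load_model` and `generate` are hypothetical, and the pre-quantized 4-bit base model is expected to need `bitsandbytes` and a CUDA GPU. The prompt format used during training is not documented here, so the prompt is passed through as plain text.

```python
BASE_ID = "unsloth/Llama-3.2-1B-bnb-4bit"   # base model from this card's metadata
ADAPTER_ID = "CYFARE/OpenO1_LLAMA-3.2-1B"   # assumed repo id; adjust to the real adapter location


def load_model(adapter_id: str = ADAPTER_ID, base_id: str = BASE_ID):
    """Load the base model referenced by the adapter config and attach the LoRA weights."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")
    # The tokenizer is taken from the base model, since the adapter repo may not ship one.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Run one plain-text prompt through the model and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `model, tokenizer = load_model()` followed by `generate(model, tokenizer, "Why is the sky blue?")`.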
## Training Details
### Training Settings
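```python
from unsloth import FastLanguageModel

# LoRA adapter settings. The get_peft_model wrapper is an assumption inferred
# from the Unsloth-style arguments; `model` is the already-loaded base model.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=True,  # rank-stabilized LoRA scaling
    loftq_config=None,
)
```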
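
To make the effect of `use_rslora=True` concrete, the sketch below compares the two adapter scaling factors for the rank and alpha listed above. Standard LoRA scales the adapter update by `lora_alpha / r`, while rsLoRA scales it by `lora_alpha / sqrt(r)`. The function name `lora_scaling` is hypothetical and only illustrates the arithmetic.

```python
import math


def lora_scaling(lora_alpha: float, r: int, use_rslora: bool) -> float:
    """Return the factor applied to the low-rank update B @ A."""
    if use_rslora:
        # Rank-stabilized LoRA: divide by the square root of the rank.
        return lora_alpha / math.sqrt(r)
    # Standard LoRA: divide by the rank itself.
    return lora_alpha / r


r, lora_alpha = 16, 16  # values from the settings above
standard = lora_scaling(lora_alpha, r, use_rslora=False)
rslora = lora_scaling(lora_alpha, r, use_rslora=True)
print(f"standard LoRA scaling: {standard}, rsLoRA scaling: {rslora}")
```

With `r=16` and `lora_alpha=16`, the adapter update is scaled by 4.0 under rsLoRA instead of 1.0 under standard LoRA.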
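```python
from unsloth import is_bfloat16_supported
from transformers import DataCollatorForSeq2Seq, TrainingArguments
from trl import SFTTrainer

# Trainer settings. The SFTTrainer wrapper is an assumption inferred from the
# argument names; `model`, `tokenizer`, `dataset`, and `max_seq_length` come
# from earlier in the training script.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer),
    dataset_num_proc=2,
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=10,
        num_train_epochs=1,
        max_steps=100,  # takes precedence over num_train_epochs
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
        report_to="none",
    ),
)
```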
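
The settings above imply a fairly small training budget. The sketch below works out the effective batch size, the number of sequences seen, and the learning rate at a few steps under linear warmup and decay. It assumes a single GPU (`num_devices = 1`), which this card does not state, and `linear_schedule_lr` is a hypothetical helper that mirrors the shape of the linear schedule with warmup rather than calling the trainer itself.

```python
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
max_steps = 100
warmup_steps = 10
learning_rate = 2e-4
num_devices = 1  # assumption: single-GPU training

# Sequences contributing to each optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
# Total sequences processed over the whole run.
sequences_seen = effective_batch * max_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0.0, (max_steps - step) / max(1, max_steps - warmup_steps))
    return learning_rate * remaining


print(f"effective batch size: {effective_batch}")
print(f"sequences seen: {sequences_seen}")
for step in (0, 5, 10, 55, 100):
    print(f"step {step:3d}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the run covers 800 training sequences, so it is a short fine-tune rather than a full pass over the dataset.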
## Model Card Authors
CYFARE ( https://cyfare.net/ | https://github.com/cyfare/ )
## Model Card Contact
CYFARE ( https://cyfare.net/ | https://github.com/cyfare/ )
### Framework versions
- PEFT 0.14.0