---
license: llama3.2
library_name: peft
base_model: meta-llama/Llama-3.2-1B
tags:
- llama
- lora
- qlora
- fine-tuned
- robotics
- task-planning
- construction
- dart-llm
language:
- en
pipeline_tag: text-generation
---
# Llama 3.2 1B - DART LLM Robot Task Planning (QLoRA Fine-tuned)
This model is a QLoRA fine-tuned version of **meta-llama/Llama-3.2-1B** specialized for **robot task planning** in construction environments.
The model converts natural language commands into structured task sequences for construction robots such as excavators and dump trucks.
## Model Details
- **Base Model**: meta-llama/Llama-3.2-1B
- **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
- **LoRA Rank**: 16-32 (optimized per model size)
- **LoRA Alpha**: 16-32
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Dataset**: YongdongWang/dart_llm_tasks_pretty
- **Training Domain**: Construction robotics
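The snippet below is a minimal sketch of how a QLoRA adapter with the settings listed above is typically assembled using `bitsandbytes` and `peft`. The specific rank, alpha, and dropout values are assumed picks from the stated ranges, not the exact values used for this checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA adapters on the attention and MLP projections listed above
lora_config = LoraConfig(
    r=16,              # assumed; card lists 16-32
    lora_alpha=16,     # assumed; card lists 16-32
    lora_dropout=0.05, # assumed value, not stated in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```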
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.2-1b-lora-qlora-dart-llm")
model.eval()

# Generate robot task sequence
instruction = "Deploy Excavator 1 to Soil Area 1 for excavation"
prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
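If you prefer to deploy without the `peft` dependency, the adapter can optionally be folded into the base weights using the standard `peft` merge API (output directory name is illustrative):

```python
# Optional: merge the LoRA weights into the base model for standalone use
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama-3.2-1b-dart-llm-merged")
tokenizer.save_pretrained("llama-3.2-1b-dart-llm-merged")
```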
## Training Details
- **Training Data**: DART LLM Tasks - Robot command and task planning dataset
- **Domain**: Construction robotics (excavators, dump trucks, soil/rock areas)
- **Training Epochs**: 6-12 (optimized per model size)
- **Batch Size**: 1 (with gradient accumulation)
- **Learning Rate**: 1e-4 to 3e-4 (scaled by model size)
- **Optimizer**: paged_adamw_8bit or adamw_torch
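As a rough reconstruction of the training configuration, the `transformers.TrainingArguments` would look roughly like the sketch below; single values are assumed picks from the ranges listed above, and the actual run (typically driven by a PEFT-aware trainer such as TRL's `SFTTrainer`) may differ.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.2-1b-qlora-dart-llm",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,   # assumed; card only says "with gradient accumulation"
    learning_rate=2e-4,              # assumed; card lists 1e-4 to 3e-4
    num_train_epochs=8,              # assumed; card lists 6-12
    optim="paged_adamw_8bit",
    fp16=True,
    logging_steps=10,
    save_strategy="epoch",
)
```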
## Capabilities
- **Multi-robot coordination**: Handles multiple excavators and dump trucks within a single plan
- **Task dependencies**: Generates task sequences with explicit dependency ordering
- **Spatial reasoning**: Understands soil areas, rock areas, puddles, and navigation targets
- **Action planning**: Converts commands into structured JSON task definitions
## Example Output
The model generates structured task sequences in JSON format for robot execution:
```json
{
"tasks": [
{
"instruction_function": {
"dependencies": [],
"name": "target_area_for_specific_robots",
"object_keywords": ["soil_area_1"],
"robot_ids": ["robot_excavator_01"],
"robot_type": null
},
"task": "target_area_for_specific_robots_1"
}
]
}
```
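Downstream code typically extracts this JSON block from the raw completion before dispatching tasks. The helper below is a minimal sketch that assumes the prompt template from the Usage section, where the JSON follows the `### Response:` marker; the function name and parsing strategy are illustrative, not part of the released tooling.

```python
import json

def extract_task_json(generated_text: str) -> dict:
    """Pull the JSON task sequence out of the model's raw completion."""
    # Keep only the text after the response marker from the prompt template
    response_part = generated_text.split("### Response:")[-1].strip()
    # Keep everything from the first '{' to the last '}' in case the model
    # emits any leading or trailing text around the JSON object
    start = response_part.find("{")
    end = response_part.rfind("}") + 1
    return json.loads(response_part[start:end])

tasks = extract_task_json(response)["tasks"]
for task in tasks:
    print(task["task"], task["instruction_function"]["name"])
```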
## Limitations
This model is specifically trained for construction robotics scenarios and may not generalize to other domains without additional fine-tuning.
## Citation
```bibtex
@misc{llama_3.2_1b_lora_qlora_dart_llm,
title={Llama 3.2 1B Fine-tuned with QLoRA for DART LLM Tasks},
author={YongdongWang},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/YongdongWang/llama-3.2-1b-lora-qlora-dart-llm}
}
```
## Model Card Authors
YongdongWang
## Model Card Contact
For questions or issues, please open an issue in the repository or contact the model author.