---
license: llama3.1
library_name: peft
base_model: meta-llama/Llama-3.1-8B
tags:
- llama
- lora
- qlora
- fine-tuned
- robotics
- task-planning
- construction
- dart-llm
language:
- en
pipeline_tag: text-generation
---

# Llama 3.1 8B - DART LLM Robot Task Planning (QLoRA Fine-tuned)

This model is a QLoRA fine-tuned version of **meta-llama/Llama-3.1-8B** specialized for **robot task planning** in construction environments.

The model converts natural language commands into structured task sequences for construction robots, including excavators and dump trucks.

## Model Details

- **Base Model**: meta-llama/Llama-3.1-8B
- **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
- **LoRA Rank**: 16-32 (optimized per model size)
- **LoRA Alpha**: 16-32
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Dataset**: YongdongWang/dart_llm_tasks_pretty
- **Training Domain**: Construction robotics
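
For reference, an adapter configuration consistent with the settings above might look like the sketch below. The rank, alpha, and dropout values are illustrative picks from the listed 16-32 range, not the exact published training config:

```python
from peft import LoraConfig, TaskType

# Illustrative LoRA adapter configuration matching the details listed above
# (actual training values may differ within the stated 16-32 range).
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,               # LoRA rank (16-32 per the model details)
    lora_alpha=16,      # LoRA scaling factor
    lora_dropout=0.05,  # assumed; not specified in this card
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```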

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load tokenizer and base model
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-lora-qlora-dart-llm")

# Generate robot task sequence
instruction = "Deploy Excavator 1 to Soil Area 1 for excavation"
prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
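
The decoded string includes the prompt, so you will usually want to keep only the text after the `### Response:` marker and parse it as JSON. A minimal sketch, assuming the model emits a complete JSON object like the example further below:

```python
import json

# Keep only the generated portion after the response marker (assumes the
# prompt template used above) and parse it into a Python dict.
response_text = response.split("### Response:")[-1].strip()
try:
    task_plan = json.loads(response_text)
    for task in task_plan.get("tasks", []):
        print(task["task"], "->", task["instruction_function"]["name"])
except json.JSONDecodeError:
    print("Model output was not valid JSON:\n", response_text)
```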

## Training Details

- **Training Data**: DART LLM Tasks - Robot command and task planning dataset
- **Domain**: Construction robotics (excavators, dump trucks, soil/rock areas)
- **Training Epochs**: 6-12 (optimized per model size)
- **Batch Size**: 1 (with gradient accumulation)
- **Learning Rate**: 1e-4 to 3e-4 (scaled by model size)
- **Optimizer**: paged_adamw_8bit or adamw_torch
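
A training setup consistent with these details would load the base model in 4-bit NF4 precision and fine-tune a LoRA adapter on top of it. The sketch below is hedged: the hyperparameters are illustrative values chosen from the ranges above, not the exact published configuration, and the gradient accumulation factor is assumed.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantization for QLoRA-style training (illustrative settings).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

# Hyperparameters picked from the ranges listed above; actual values may differ.
training_args = TrainingArguments(
    output_dir="./qlora-dart-llm",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # assumed accumulation factor
    learning_rate=2e-4,
    num_train_epochs=6,
    optim="paged_adamw_8bit",
    fp16=True,
)
```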

## Capabilities

- **Multi-robot coordination**: Handle multiple excavators and dump trucks
- **Task dependencies**: Generate proper task sequences with dependencies
- **Spatial reasoning**: Understand soil areas, rock areas, puddles, and navigation
- **Action planning**: Convert commands to structured JSON task definitions

## Example Output

The model generates structured task sequences in JSON format for robot execution:

```json
{
  "tasks": [
    {
      "instruction_function": {
        "dependencies": [],
        "name": "target_area_for_specific_robots",
        "object_keywords": ["soil_area_1"],
        "robot_ids": ["robot_excavator_01"],
        "robot_type": null
      },
      "task": "target_area_for_specific_robots_1"
    }
  ]
}
```
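
Because each `instruction_function` carries a `dependencies` list that refers to other task names, a downstream executor can order tasks with a simple topological pass. The sketch below works over the schema shown above; the executor interface itself is hypothetical:

```python
# Order tasks so each one runs only after its dependencies (schema as above).
def execution_order(task_plan):
    tasks = {t["task"]: t for t in task_plan["tasks"]}
    ordered, resolved = [], set()
    while len(ordered) < len(tasks):
        progressed = False
        for name, task in tasks.items():
            deps = task["instruction_function"]["dependencies"]
            if name not in resolved and all(d in resolved for d in deps):
                ordered.append(task)
                resolved.add(name)
                progressed = True
        if not progressed:
            raise ValueError("Cyclic or missing dependency in task plan")
    return ordered
```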

## Limitations

This model is specifically trained for construction robotics scenarios and may not generalize to other domains without additional fine-tuning.

## Citation

```bibtex
@misc{llama_3.1_8b_lora_qlora_dart_llm,
  title={Llama 3.1 8B Fine-tuned with QLoRA for DART LLM Tasks},
  author={YongdongWang},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/YongdongWang/llama-3.1-8b-lora-qlora-dart-llm}
}
```

## Model Card Authors

YongdongWang

## Model Card Contact

For questions or issues, please open an issue in the repository or contact the model author.