metadata
language: en
license: mit
library_name: peft
base_model: llava-hf/llava-1.5-7b-hf
tags:
  - robotics
  - vision-language
  - task-detection
  - llava
datasets:
  - synthetic-data

Model Card for Unsolvable Robotic Task Detection

Model Details

  • Purpose: Detects when robotic tasks are impossible to complete
  • Base Model: LLaVA v1.5 7B
  • Developed by: Duke University
  • Type: Vision-Language Model

Use Cases

  • Identifying unsolvable robotic tasks in real time
  • Explaining why tasks cannot be completed
  • Supporting safe human-robot interaction
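
For illustration, a query to the model could be phrased with the conversation template documented for the base llava-hf/llava-1.5-7b-hf checkpoint (USER: <image>, the question, then ASSISTANT:). This is only a sketch: the function name, the question wording, and the example task are hypothetical, and this card does not state the prompt used during fine-tuning.

```python
def build_prompt(task: str) -> str:
    """Format a feasibility question in the LLaVA-1.5 chat template.

    The question wording is a hypothetical example, not the authors' prompt.
    """
    question = (
        f"The robot is asked to: {task} "
        "Can this task be completed? If not, explain why."
    )
    # "<image>" marks where the processor substitutes the image features.
    return f"USER: <image>\n{question} ASSISTANT:"


# Hypothetical task description, for illustration only.
prompt = build_prompt("Pick up the red mug on the table.")
```

The resulting string would be passed, together with the image, to the processor of the base model before generation.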

Training Data

  • 4,920 synthetic images with question-answer pairs
  • Covers five categories: Status Conflicts, Item Absences, Logical Contradictions, Ambiguous Tasks, and Ethical Constraints
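
When labeling or filtering samples, the five categories could be represented as an enumeration. The following is a minimal sketch; the class names, field names, and helper function are hypothetical and are not taken from the released data or code.

```python
from dataclasses import dataclass
from enum import Enum


class UnsolvableCategory(Enum):
    """The five categories named in this card (identifiers are hypothetical)."""
    STATUS_CONFLICT = "Status Conflicts"
    ITEM_ABSENCE = "Item Absences"
    LOGICAL_CONTRADICTION = "Logical Contradictions"
    AMBIGUOUS_TASK = "Ambiguous Tasks"
    ETHICAL_CONSTRAINT = "Ethical Constraints"


@dataclass
class Sample:
    """One image with a question-answer pair (hypothetical schema)."""
    image_path: str
    question: str
    answer: str
    category: UnsolvableCategory


def count_by_category(samples):
    """Return a dict mapping every category to its number of samples."""
    counts = {category: 0 for category in UnsolvableCategory}
    for sample in samples:
        counts[sample.category] += 1
    return counts
```

A per-category count like this is one simple way to check that a split covers all five categories before training or evaluation.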

Performance

  • Success rate on SDXL synthetic data: 78.05%
  • Success rate on simulator synthetic data: 81.00%

Limitations

  • Works only with tasks similar to those in the training data
  • Requires human oversight
  • May not catch novel types of impossible tasks

Getting Started

# Basic configuration
config = {
    "USE_LORA": True,       # enable LoRA adapters
    "LORA_R": 8,            # LoRA rank
    "LORA_ALPHA": 8,        # LoRA alpha (scaling numerator)
    "MODEL_MAX_LEN": 1024,  # maximum sequence length
}
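
As a rough sketch of how these settings might be consumed, the helper below translates the dictionary into keyword arguments named after the r and lora_alpha parameters of peft.LoraConfig, and computes the standard LoRA scaling factor alpha / r. The helper names lora_kwargs and lora_scaling are hypothetical, and this card does not state how the authors' training script reads these keys.

```python
config = {
    "USE_LORA": True,
    "LORA_R": 8,
    "LORA_ALPHA": 8,
    "MODEL_MAX_LEN": 1024,
}


def lora_kwargs(cfg):
    """Map the card's config keys to LoRA keyword arguments (hypothetical helper)."""
    if not cfg.get("USE_LORA", False):
        return None  # LoRA disabled: nothing to pass to an adapter config
    # "r" and "lora_alpha" match parameter names of peft.LoraConfig.
    return {"r": cfg["LORA_R"], "lora_alpha": cfg["LORA_ALPHA"]}


def lora_scaling(cfg):
    """Standard LoRA scaling: the low-rank update is multiplied by alpha / r."""
    return cfg["LORA_ALPHA"] / cfg["LORA_R"]


kwargs = lora_kwargs(config)
scaling = lora_scaling(config)
```

With rank and alpha both set to 8, the scaling factor is 1.0, so the low-rank update is applied without additional rescaling.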

Contact

{yixuan.yang,yueqian.lin}@duke.edu