YongdongWang committed
Commit 7b85b83 · verified · 1 Parent(s): d6c53f6

Upload fine-tuned Llama 3.1 8B QLoRA model

README.md CHANGED
@@ -1,65 +1,75 @@
-
  ---
- library_name: peft
- base_model: meta-llama/Llama-3.1-8B
- tags:
- - llama
- - lora
- - qlora
- - fine-tuned
- license: llama3.1
- language:
- - en
- pipeline_tag: text-generation
- ---
-
- # Llama 3.1 8B - Robot Task Planning (QLoRA Fine-tuned)
-
- This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) specialized for **robot task planning** using QLoRA (4-bit quantization + LoRA).
-
- The model converts natural language robot commands into structured task sequences for construction robots including excavators and dump trucks.
-
- ## Model Details
-
- - **Base Model**: meta-llama/Llama-3.1-8B
- - **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
- - **LoRA Rank**: 16
- - **LoRA Alpha**: 32
- - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
-
- ## Usage
-
- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
- from peft import PeftModel
-
- # Load tokenizer and base model
- tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
- base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
-
- # Load LoRA adapter
- model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")
-
- # Generate robot task sequence
- command = "Deploy Excavator 1 to Soil Area 1 for excavation"
- inputs = tokenizer(command, return_tensors="pt")
- outputs = model.generate(**inputs, max_new_tokens=5120)
- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
- print(response)
-
- Training Details
-
- Training Data: DART LLM Tasks - Robot command and task planning dataset
- Domain: Construction robotics (excavators, dump trucks, soil/rock areas)
- Training Epochs: 5
- Batch Size: 16 (with gradient accumulation)
- Learning Rate: 2e-4
- Optimizer: paged_adamw_8bit
-
- Capabilities
-
- Multi-robot coordination: Handle multiple excavators and dump trucks
- Task dependencies: Generate proper task sequences with dependencies
- Spatial reasoning: Understand soil areas, rock areas, puddles, and navigation
- Action planning: Convert commands to structured JSON task definitions
-

  ---
+ library_name: peft
+ base_model: meta-llama/Llama-3.1-8B
+ tags:
+ - llama
+ - lora
+ - qlora
+ - fine-tuned
+ - robotics
+ - task-planning
+ - construction
+ license: llama3.1
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+
+ # Llama 3.1 8B - Robot Task Planning (QLoRA Fine-tuned)
+
+ This model is a QLoRA fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) specialized for **robot task planning** in construction environments.
+
+ The model converts natural language commands into structured task sequences for construction robots including excavators and dump trucks.
+
+ ## Model Details
+
+ - **Base Model**: meta-llama/Llama-3.1-8B
+ - **Fine-tuning Method**: QLoRA (4-bit quantization + LoRA)
+ - **LoRA Rank**: 16
+ - **LoRA Alpha**: 32
+ - **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from peft import PeftModel
+
+ # Load tokenizer and base model
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
+ base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
+
+ # Load LoRA adapter
+ model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")
+
+ # Generate robot task sequence
+ command = "Deploy Excavator 1 to Soil Area 1 for excavation"
+ inputs = tokenizer(command, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=512)
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ## Training Details
+
+ - **Training Data**: DART LLM Tasks - Robot command and task planning dataset
+ - **Domain**: Construction robotics (excavators, dump trucks, soil/rock areas)
+ - **Training Epochs**: 5
+ - **Batch Size**: 16 (with gradient accumulation)
+ - **Learning Rate**: 2e-4
+ - **Optimizer**: paged_adamw_8bit
+
+ ## Capabilities
+
+ - **Multi-robot coordination**: Handle multiple excavators and dump trucks
+ - **Task dependencies**: Generate proper task sequences with dependencies
+ - **Spatial reasoning**: Understand soil areas, rock areas, puddles, and navigation
+ - **Action planning**: Convert commands to structured JSON task definitions
+
+ ## Example Output
+
+ The model generates structured task sequences in JSON format for robot execution.
+
+ ## Limitations
+
+ This model is specifically trained for construction robotics scenarios and may not generalize to other domains without additional fine-tuning.
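The usage snippet in the updated card loads the base model in full precision, while the card describes QLoRA (4-bit quantization + LoRA) fine-tuning. A minimal sketch of loading the base model in 4-bit before attaching the adapter is shown below; the `BitsAndBytesConfig` values and device settings are assumptions for illustration, not values recorded in this commit.

```python
# Hedged sketch: load the base model in 4-bit before attaching the adapter.
# The quantization settings below are assumed defaults, not taken from this repo.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # assumed: match the 4-bit QLoRA setup
    bnb_4bit_quant_type="nf4",              # assumed: common QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
    bnb_4bit_use_double_quant=True,         # assumed
)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "YongdongWang/llama-3.1-8b-dart-qlora")

command = "Deploy Excavator 1 to Soil Area 1 for excavation"
inputs = tokenizer(command, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```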
adapter_config.json CHANGED
@@ -25,12 +25,12 @@
  "revision": null,
  "target_modules": [
  "k_proj",
- "down_proj",
- "gate_proj",
  "v_proj",
  "o_proj",
- "up_proj",
- "q_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,

  "revision": null,
  "target_modules": [
  "k_proj",
+ "up_proj",
  "v_proj",
+ "gate_proj",
  "o_proj",
+ "q_proj",
+ "down_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:686aa6705c3b268bfc62872bc6a20ee3f78af943123a851082ea5a8beecf764b
  size 167832240

  version https://git-lfs.github.com/spec/v1
+ oid sha256:0db982df9cb0f4027019591b4dbed63879790c994db46e7854dc9c28a18915fd
  size 167832240
checkpoint-24/adapter_config.json CHANGED
@@ -25,12 +25,12 @@
  "revision": null,
  "target_modules": [
  "k_proj",
- "down_proj",
- "gate_proj",
  "v_proj",
  "o_proj",
- "up_proj",
- "q_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,

  "revision": null,
  "target_modules": [
  "k_proj",
+ "up_proj",
  "v_proj",
+ "gate_proj",
  "o_proj",
+ "q_proj",
+ "down_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
checkpoint-24/adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ff80ada358f7d45551b795de60e3448a3be3b37ca434a09529d49259e740e104
  size 167832240

  version https://git-lfs.github.com/spec/v1
+ oid sha256:147edd45999f9fd86edaa853b1974dff9b4b86f3dc23bbf9aebbc796ef8e4782
  size 167832240
checkpoint-24/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6b5ab0f12eed125e6b567965fc0df61f90b689c01a7cf9d5b32828e25f5b9211
  size 85728532

  version https://git-lfs.github.com/spec/v1
+ oid sha256:bb5da80ca3998ac6feb84f67496a67e5dc7c7b0dc77a7c882b36663fd205e22c
  size 85728532
checkpoint-24/trainer_state.json CHANGED
@@ -1,6 +1,6 @@
  {
  "best_global_step": 24,
- "best_metric": 0.026073265820741653,
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-24",
  "epoch": 4.0,
  "eval_steps": 500,
@@ -11,62 +11,62 @@
  "log_history": [
  {
  "epoch": 0.8791208791208791,
- "grad_norm": 0.5106796026229858,
  "learning_rate": 0.00019594929736144976,
- "loss": 0.6176,
  "step": 5
  },
  {
  "epoch": 1.0,
- "eval_loss": 0.15218934416770935,
- "eval_runtime": 2.7083,
- "eval_samples_per_second": 4.062,
- "eval_steps_per_second": 4.062,
  "step": 6
  },
  {
  "epoch": 1.7032967032967035,
- "grad_norm": 0.4140273630619049,
  "learning_rate": 0.00015406408174555976,
- "loss": 0.1118,
  "step": 10
  },
  {
  "epoch": 2.0,
- "eval_loss": 0.04035872593522072,
- "eval_runtime": 2.7366,
- "eval_samples_per_second": 4.02,
- "eval_steps_per_second": 4.02,
  "step": 12
  },
  {
  "epoch": 2.5274725274725274,
- "grad_norm": 0.19475381076335907,
  "learning_rate": 8.57685161726715e-05,
- "loss": 0.0205,
  "step": 15
  },
  {
  "epoch": 3.0,
- "eval_loss": 0.029018037021160126,
- "eval_runtime": 2.6384,
- "eval_samples_per_second": 4.169,
- "eval_steps_per_second": 4.169,
  "step": 18
  },
  {
  "epoch": 3.3516483516483517,
- "grad_norm": 0.0900561586022377,
  "learning_rate": 2.4425042564574184e-05,
- "loss": 0.0216,
  "step": 20
  },
  {
  "epoch": 4.0,
- "eval_loss": 0.026073265820741653,
- "eval_runtime": 2.7519,
- "eval_samples_per_second": 3.997,
- "eval_steps_per_second": 3.997,
  "step": 24
  }
  ],

  {
  "best_global_step": 24,
+ "best_metric": 0.02759450115263462,
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-24",
  "epoch": 4.0,
  "eval_steps": 500,

  "log_history": [
  {
  "epoch": 0.8791208791208791,
+ "grad_norm": 0.5126284956932068,
  "learning_rate": 0.00019594929736144976,
+ "loss": 0.6177,
  "step": 5
  },
  {
  "epoch": 1.0,
+ "eval_loss": 0.13462547957897186,
+ "eval_runtime": 1.9882,
+ "eval_samples_per_second": 5.533,
+ "eval_steps_per_second": 5.533,
  "step": 6
  },
  {
  "epoch": 1.7032967032967035,
+ "grad_norm": 4.223966598510742,
  "learning_rate": 0.00015406408174555976,
+ "loss": 0.113,
  "step": 10
  },
  {
  "epoch": 2.0,
+ "eval_loss": 0.041809309273958206,
+ "eval_runtime": 1.9899,
+ "eval_samples_per_second": 5.528,
+ "eval_steps_per_second": 5.528,
  "step": 12
  },
  {
  "epoch": 2.5274725274725274,
+ "grad_norm": 0.15816885232925415,
  "learning_rate": 8.57685161726715e-05,
+ "loss": 0.0253,
  "step": 15
  },
  {
  "epoch": 3.0,
+ "eval_loss": 0.029711483046412468,
+ "eval_runtime": 2.0074,
+ "eval_samples_per_second": 5.48,
+ "eval_steps_per_second": 5.48,
  "step": 18
  },
  {
  "epoch": 3.3516483516483517,
+ "grad_norm": 0.10277726501226425,
  "learning_rate": 2.4425042564574184e-05,
+ "loss": 0.0245,
  "step": 20
  },
  {
  "epoch": 4.0,
+ "eval_loss": 0.02759450115263462,
+ "eval_runtime": 1.9936,
+ "eval_samples_per_second": 5.518,
+ "eval_steps_per_second": 5.518,
  "step": 24
  }
  ],
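The trainer_state.json changes above only update logged metric values (for this checkpoint, `best_metric` moves from about 0.0261 to about 0.0276). A small sketch of pulling the eval-loss curve out of such a file, assuming a local copy of the checkpoint directory:

```python
# Hedged sketch: read the eval_loss curve from a trainer_state.json like the
# one above. The local path is an assumption.
import json

with open("checkpoint-24/trainer_state.json") as f:
    state = json.load(f)

print("best_metric:", state["best_metric"])
for entry in state["log_history"]:
    if "eval_loss" in entry:
        print(f"epoch {entry['epoch']:.2f}: eval_loss {entry['eval_loss']:.4f}")
```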
checkpoint-24/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5c790f6569290a09d32f8a5b1e2904334b6f58a1c2321c61a41455b1cc5658b8
  size 5432

  version https://git-lfs.github.com/spec/v1
+ oid sha256:2c7263e7d5295b74152f3cc3f85217bffc73db5e4dd4a80e6aeb76028cabe1f6
  size 5432
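training_args.bin stores the serialized TrainingArguments (only its LFS pointer changes in this commit). A sketch of arguments consistent with the hyperparameters stated in the model card is shown below; the per-device batch size / gradient-accumulation split and logging settings are assumptions, not values read from the binary.

```python
# Hedged sketch: TrainingArguments consistent with the card's stated setup
# (5 epochs, effective batch size 16 via gradient accumulation, lr 2e-4,
# paged_adamw_8bit). Batch split and logging cadence are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs/llama3.1-8b-lora-qlora-dart-llm",  # path seen in trainer_state.json
    num_train_epochs=5,
    per_device_train_batch_size=4,   # assumed split of the effective batch size
    gradient_accumulation_steps=4,   # assumed: 4 x 4 = 16 effective
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    logging_steps=5,                 # consistent with the log_history steps above
)
```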
checkpoint-25/adapter_config.json CHANGED
@@ -25,12 +25,12 @@
  "revision": null,
  "target_modules": [
  "k_proj",
- "down_proj",
- "gate_proj",
  "v_proj",
  "o_proj",
- "up_proj",
- "q_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,

  "revision": null,
  "target_modules": [
  "k_proj",
+ "up_proj",
  "v_proj",
+ "gate_proj",
  "o_proj",
+ "q_proj",
+ "down_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
checkpoint-25/adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:686aa6705c3b268bfc62872bc6a20ee3f78af943123a851082ea5a8beecf764b
  size 167832240

  version https://git-lfs.github.com/spec/v1
+ oid sha256:0db982df9cb0f4027019591b4dbed63879790c994db46e7854dc9c28a18915fd
  size 167832240
checkpoint-25/optimizer.pt CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0b2d76bcc892f6aa50acb4b2f920ced51c29d48a2e262a4e75f95ad24696d858
  size 85728532

  version https://git-lfs.github.com/spec/v1
+ oid sha256:ca872ede0234294689c0eb7b7851107caef12b43187810d77b7973fa2d241b01
  size 85728532
checkpoint-25/trainer_state.json CHANGED
@@ -1,6 +1,6 @@
  {
  "best_global_step": 25,
- "best_metric": 0.02604079246520996,
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-25",
  "epoch": 4.175824175824176,
  "eval_steps": 500,
@@ -11,77 +11,77 @@
  "log_history": [
  {
  "epoch": 0.8791208791208791,
- "grad_norm": 0.5106796026229858,
  "learning_rate": 0.00019594929736144976,
- "loss": 0.6176,
  "step": 5
  },
  {
  "epoch": 1.0,
- "eval_loss": 0.15218934416770935,
- "eval_runtime": 2.7083,
- "eval_samples_per_second": 4.062,
- "eval_steps_per_second": 4.062,
  "step": 6
  },
  {
  "epoch": 1.7032967032967035,
- "grad_norm": 0.4140273630619049,
  "learning_rate": 0.00015406408174555976,
- "loss": 0.1118,
  "step": 10
  },
  {
  "epoch": 2.0,
- "eval_loss": 0.04035872593522072,
- "eval_runtime": 2.7366,
- "eval_samples_per_second": 4.02,
- "eval_steps_per_second": 4.02,
  "step": 12
  },
  {
  "epoch": 2.5274725274725274,
- "grad_norm": 0.19475381076335907,
  "learning_rate": 8.57685161726715e-05,
- "loss": 0.0205,
  "step": 15
  },
  {
  "epoch": 3.0,
- "eval_loss": 0.029018037021160126,
- "eval_runtime": 2.6384,
- "eval_samples_per_second": 4.169,
- "eval_steps_per_second": 4.169,
  "step": 18
  },
  {
  "epoch": 3.3516483516483517,
- "grad_norm": 0.0900561586022377,
  "learning_rate": 2.4425042564574184e-05,
- "loss": 0.0216,
  "step": 20
  },
  {
  "epoch": 4.0,
- "eval_loss": 0.026073265820741653,
- "eval_runtime": 2.7519,
- "eval_samples_per_second": 3.997,
- "eval_steps_per_second": 3.997,
  "step": 24
  },
  {
  "epoch": 4.175824175824176,
- "grad_norm": 0.09460670500993729,
  "learning_rate": 0.0,
- "loss": 0.0157,
  "step": 25
  },
  {
  "epoch": 4.175824175824176,
- "eval_loss": 0.02604079246520996,
- "eval_runtime": 2.8313,
- "eval_samples_per_second": 3.885,
- "eval_steps_per_second": 3.885,
  "step": 25
  }
  ],

  {
  "best_global_step": 25,
+ "best_metric": 0.0275871679186821,
  "best_model_checkpoint": "./outputs/llama3.1-8b-lora-qlora-dart-llm/checkpoint-25",
  "epoch": 4.175824175824176,
  "eval_steps": 500,

  "log_history": [
  {
  "epoch": 0.8791208791208791,
+ "grad_norm": 0.5126284956932068,
  "learning_rate": 0.00019594929736144976,
+ "loss": 0.6177,
  "step": 5
  },
  {
  "epoch": 1.0,
+ "eval_loss": 0.13462547957897186,
+ "eval_runtime": 1.9882,
+ "eval_samples_per_second": 5.533,
+ "eval_steps_per_second": 5.533,
  "step": 6
  },
  {
  "epoch": 1.7032967032967035,
+ "grad_norm": 4.223966598510742,
  "learning_rate": 0.00015406408174555976,
+ "loss": 0.113,
  "step": 10
  },
  {
  "epoch": 2.0,
+ "eval_loss": 0.041809309273958206,
+ "eval_runtime": 1.9899,
+ "eval_samples_per_second": 5.528,
+ "eval_steps_per_second": 5.528,
  "step": 12
  },
  {
  "epoch": 2.5274725274725274,
+ "grad_norm": 0.15816885232925415,
  "learning_rate": 8.57685161726715e-05,
+ "loss": 0.0253,
  "step": 15
  },
  {
  "epoch": 3.0,
+ "eval_loss": 0.029711483046412468,
+ "eval_runtime": 2.0074,
+ "eval_samples_per_second": 5.48,
+ "eval_steps_per_second": 5.48,
  "step": 18
  },
  {
  "epoch": 3.3516483516483517,
+ "grad_norm": 0.10277726501226425,
  "learning_rate": 2.4425042564574184e-05,
+ "loss": 0.0245,
  "step": 20
  },
  {
  "epoch": 4.0,
+ "eval_loss": 0.02759450115263462,
+ "eval_runtime": 1.9936,
+ "eval_samples_per_second": 5.518,
+ "eval_steps_per_second": 5.518,
  "step": 24
  },
  {
  "epoch": 4.175824175824176,
+ "grad_norm": 0.10169998556375504,
  "learning_rate": 0.0,
+ "loss": 0.0178,
  "step": 25
  },
  {
  "epoch": 4.175824175824176,
+ "eval_loss": 0.0275871679186821,
+ "eval_runtime": 1.9861,
+ "eval_samples_per_second": 5.539,
+ "eval_steps_per_second": 5.539,
  "step": 25
  }
  ],
checkpoint-25/training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5c790f6569290a09d32f8a5b1e2904334b6f58a1c2321c61a41455b1cc5658b8
  size 5432

  version https://git-lfs.github.com/spec/v1
+ oid sha256:2c7263e7d5295b74152f3cc3f85217bffc73db5e4dd4a80e6aeb76028cabe1f6
  size 5432
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5c790f6569290a09d32f8a5b1e2904334b6f58a1c2321c61a41455b1cc5658b8
  size 5432

  version https://git-lfs.github.com/spec/v1
+ oid sha256:2c7263e7d5295b74152f3cc3f85217bffc73db5e4dd4a80e6aeb76028cabe1f6
  size 5432