Files changed (1)
  1. README.md +105 -91
README.md CHANGED
@@ -1,91 +1,105 @@
- ---
- library_name: peft
- license: other
- base_model: Qwen/Qwen2.5-3B
- tags:
- - generated_from_trainer
- datasets:
- - menheraorg/vericava-posts
- model-index:
- - name: model-out
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.7.0`
- ```yaml
- base_model: Qwen/Qwen2.5-3B
-
- load_in_8bit: true
- adapter: lora
-
- lora_r: 8
- lora_alpha: 16
- lora_dropout: 0.05
- lora_target_modules:
- - q_proj
- - v_proj
-
- # Training settings
- gradient_accumulation_steps: 1
- micro_batch_size: 1
- num_epochs: 3
- learning_rate: 0.0005
- gpu_memory_limit: 7.75GiB
-
- chat_template: alpaca
- # Your dataset
- datasets:
- - path: vericava-alpaca.jsonl
-   type: alpaca
-
- ```
-
- </details><br>
-
- # model-out
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) on the vericava-alpaca.jsonl dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0005
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - optimizer: Use OptimizerNames.ADAMW_HF with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 100
- - num_epochs: 3.0
-
- ### Training results
-
-
-
- ### Framework versions
-
- - PEFT 0.14.0
- - Transformers 4.48.3
- - Pytorch 2.6.0+cu124
- - Datasets 3.2.0
- - Tokenizers 0.21.1
+ ---
+ library_name: peft
+ license: other
+ base_model: Qwen/Qwen2.5-3B
+ tags:
+ - generated_from_trainer
+ datasets:
+ - menheraorg/vericava-posts
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ model-index:
+ - name: model-out
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
+ <details><summary>See axolotl config</summary>
+
+ axolotl version: `0.7.0`
+ ```yaml
+ base_model: Qwen/Qwen2.5-3B
+
+ load_in_8bit: true
+ adapter: lora
+
+ lora_r: 8
+ lora_alpha: 16
+ lora_dropout: 0.05
+ lora_target_modules:
+ - q_proj
+ - v_proj
+
+ # Training settings
+ gradient_accumulation_steps: 1
+ micro_batch_size: 1
+ num_epochs: 3
+ learning_rate: 0.0005
+ gpu_memory_limit: 7.75GiB
+
+ chat_template: alpaca
+ # Your dataset
+ datasets:
+ - path: vericava-alpaca.jsonl
+   type: alpaca
+
+ ```
+
+ </details><br>
+
+ # model-out
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) on the vericava-alpaca.jsonl dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0005
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - optimizer: Use OptimizerNames.ADAMW_HF with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 100
+ - num_epochs: 3.0
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - PEFT 0.14.0
+ - Transformers 4.48.3
+ - Pytorch 2.6.0+cu124
+ - Datasets 3.2.0
+ - Tokenizers 0.21.1
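The card itself ships no usage snippet. As a minimal sketch, the LoRA adapter produced by this axolotl run could be loaded on top of Qwen/Qwen2.5-3B with PEFT and queried with an Alpaca-style prompt, matching `chat_template: alpaca` in the config. The adapter path and the example instruction below are placeholders, not values from the card.

```python
# Hedged sketch: load the Qwen2.5-3B base model, attach the LoRA adapter from
# this training run, and generate with an Alpaca-formatted prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "Qwen/Qwen2.5-3B"
ADAPTER_PATH = "path/to/model-out"  # placeholder: local axolotl output dir or Hub repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER_PATH)
model.eval()

# The config sets `chat_template: alpaca`, so prompts follow the Alpaca layout.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a short post about autumn.\n\n"  # placeholder instruction
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```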