---
base_model: google/gemma-7b
library_name: peft
license: gemma
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: gemma7bit-lora-sql
  results: []
---


# gemma7bit-lora-sql

This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co/google/gemma-7b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 40.8546

## Model description

This repository holds a LoRA adapter (trained with PEFT and TRL's SFT trainer) on top of [google/gemma-7b](https://huggingface.co/google/gemma-7b). Judging by the model name, it appears to target text-to-SQL generation, but the training dataset and task are not documented here.

## Intended uses & limitations

More information needed
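
Pending a fuller description, the sketch below shows one way to load the adapter for inference with Transformers and PEFT. The adapter repo id, the prompt, and the generation settings are placeholders; adjust them to match where this adapter is actually published and how it was prompted during training.

```python
# Minimal inference sketch: load google/gemma-7b and attach this LoRA adapter with PEFT.
# `adapter_id` is a placeholder; replace it with the actual repository id or local path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-7b"
adapter_id = "your-username/gemma7bit-lora-sql"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA weights
model.eval()

prompt = "Write a SQL query that returns all customers located in Berlin."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```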

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0003
- train_batch_size: 1
- eval_batch_size: 8
- seed: 1399
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 2
- training_steps: 500
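
For reference, these settings map onto `transformers.TrainingArguments` roughly as in the sketch below. The `output_dir` is a placeholder, and the LoRA-specific settings (rank, alpha, target modules) are not recorded in this card, so they are omitted.

```python
# Approximate reconstruction of the training configuration listed above.
# The LoRA adapter settings are not documented in this card and are intentionally omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma7bit-lora-sql",  # placeholder output directory
    learning_rate=3e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=1399,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=2,
    max_steps=500,
)
```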

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 47.4397       | 0.0000 | 2    | 112.0961        |
| 54.9563       | 0.0001 | 4    | 113.0320        |
| 43.0701       | 0.0001 | 6    | 105.0883        |
| 29.3374       | 0.0001 | 8    | 93.7564         |
| 24.013        | 0.0001 | 10   | 70.5026         |
| 5.7244        | 0.0002 | 12   | 70.3644         |
| 6.7112        | 0.0002 | 14   | 69.0918         |
| 5.139         | 0.0002 | 16   | 67.7594         |
| 5.658         | 0.0002 | 18   | 64.8925         |
| 3.348         | 0.0003 | 20   | 62.9086         |
| 3.0009        | 0.0003 | 22   | 54.9081         |
| 3.1078        | 0.0003 | 24   | 47.0123         |
| 2.9829        | 0.0003 | 26   | 44.8515         |
| 2.4287        | 0.0004 | 28   | 42.1563         |
| 2.1561        | 0.0004 | 30   | 39.5831         |
| 2.3805        | 0.0004 | 32   | 37.8210         |
| 4.199         | 0.0004 | 34   | 36.5321         |
| 4.2891        | 0.0005 | 36   | 35.5581         |
| 2.8376        | 0.0005 | 38   | 35.1185         |
| 2.4216        | 0.0005 | 40   | 35.1674         |
| 2.2408        | 0.0005 | 42   | 34.9562         |
| 3.4941        | 0.0006 | 44   | 35.2440         |
| 3.4866        | 0.0006 | 46   | 34.5079         |
| 2.2815        | 0.0006 | 48   | 34.1046         |
| 2.2584        | 0.0006 | 50   | 34.0249         |
| 2.7932        | 0.0007 | 52   | 34.8069         |
| 2.8995        | 0.0007 | 54   | 35.0606         |
| 3.3107        | 0.0007 | 56   | 35.8230         |
| 3.0793        | 0.0007 | 58   | 36.0362         |
| 4.5829        | 0.0008 | 60   | 34.8489         |
| 2.6841        | 0.0008 | 62   | 33.6494         |
| 3.5738        | 0.0008 | 64   | 32.4676         |
| 2.955         | 0.0008 | 66   | 31.9876         |
| 2.1847        | 0.0009 | 68   | 31.4324         |
| 3.5749        | 0.0009 | 70   | 31.4434         |
| 2.0652        | 0.0009 | 72   | 31.6449         |
| 1.9506        | 0.0009 | 74   | 31.8311         |
| 2.6852        | 0.0010 | 76   | 32.0123         |
| 1.8463        | 0.0010 | 78   | 32.2012         |
| 2.4999        | 0.0010 | 80   | 32.4074         |
| 1.7525        | 0.0010 | 82   | 32.5013         |
| 1.865         | 0.0011 | 84   | 32.7458         |
| 2.5512        | 0.0011 | 86   | 32.9542         |
| 2.041         | 0.0011 | 88   | 33.7792         |
| 3.4588        | 0.0011 | 90   | 33.5860         |
| 2.2258        | 0.0012 | 92   | 33.9242         |
| 2.1416        | 0.0012 | 94   | 34.2110         |
| 1.9904        | 0.0012 | 96   | 34.1852         |
| 1.9793        | 0.0012 | 98   | 34.1257         |
| 3.3329        | 0.0013 | 100  | 34.2512         |
| 2.6011        | 0.0013 | 102  | 34.4635         |
| 2.4212        | 0.0013 | 104  | 34.5869         |
| 1.941         | 0.0014 | 106  | 34.7022         |
| 2.4623        | 0.0014 | 108  | 34.9359         |
| 2.4267        | 0.0014 | 110  | 35.1085         |
| 1.7913        | 0.0014 | 112  | 35.1962         |
| 1.6845        | 0.0015 | 114  | 35.5859         |
| 3.0888        | 0.0015 | 116  | 35.8237         |
| 3.4959        | 0.0015 | 118  | 35.4403         |
| 2.5661        | 0.0015 | 120  | 35.3171         |
| 2.4044        | 0.0016 | 122  | 35.1409         |
| 3.1554        | 0.0016 | 124  | 35.0385         |
| 2.0637        | 0.0016 | 126  | 35.4118         |
| 5.6131        | 0.0016 | 128  | 35.2343         |
| 3.0214        | 0.0017 | 130  | 35.9148         |
| 1.771         | 0.0017 | 132  | 36.5919         |
| 2.4126        | 0.0017 | 134  | 36.8129         |
| 2.5102        | 0.0017 | 136  | 36.6166         |
| 6.5612        | 0.0018 | 138  | 36.9545         |
| 2.1154        | 0.0018 | 140  | 36.8204         |
| 2.533         | 0.0018 | 142  | 36.5374         |
| 1.7012        | 0.0018 | 144  | 36.6904         |
| 2.2287        | 0.0019 | 146  | 36.1521         |
| 4.2646        | 0.0019 | 148  | 36.1889         |
| 1.8624        | 0.0019 | 150  | 36.5876         |
| 1.9946        | 0.0019 | 152  | 36.6302         |
| 2.124         | 0.0020 | 154  | 36.6274         |
| 3.01          | 0.0020 | 156  | 36.6652         |
| 1.928         | 0.0020 | 158  | 37.0886         |
| 2.6035        | 0.0020 | 160  | 37.2648         |
| 2.2572        | 0.0021 | 162  | 37.4929         |
| 1.5284        | 0.0021 | 164  | 37.7779         |
| 1.1103        | 0.0021 | 166  | 37.9401         |
| 2.4597        | 0.0021 | 168  | 37.7270         |
| 2.4846        | 0.0022 | 170  | 37.4224         |
| 2.6234        | 0.0022 | 172  | 36.6518         |
| 2.4765        | 0.0022 | 174  | 36.2149         |
| 2.0448        | 0.0022 | 176  | 35.9293         |
| 2.2736        | 0.0023 | 178  | 35.5881         |
| 2.7181        | 0.0023 | 180  | 35.3821         |
| 1.9195        | 0.0023 | 182  | 35.2214         |
| 2.9274        | 0.0023 | 184  | 35.0837         |
| 3.191         | 0.0024 | 186  | 35.1131         |
| 2.6804        | 0.0024 | 188  | 35.1649         |
| 1.5547        | 0.0024 | 190  | 35.3133         |
| 2.2601        | 0.0024 | 192  | 35.6737         |
| 2.5229        | 0.0025 | 194  | 36.1338         |
| 2.6806        | 0.0025 | 196  | 36.2942         |
| 2.2258        | 0.0025 | 198  | 36.4748         |
| 1.2856        | 0.0025 | 200  | 36.9566         |
| 2.1439        | 0.0026 | 202  | 37.1834         |
| 4.0704        | 0.0026 | 204  | 37.5976         |
| 2.5138        | 0.0026 | 206  | 38.2877         |
| 2.9025        | 0.0027 | 208  | 38.5739         |
| 1.8761        | 0.0027 | 210  | 38.3348         |
| 1.9228        | 0.0027 | 212  | 38.3183         |
| 1.7924        | 0.0027 | 214  | 38.2928         |
| 2.7619        | 0.0028 | 216  | 38.1185         |
| 2.1031        | 0.0028 | 218  | 37.7249         |
| 2.6893        | 0.0028 | 220  | 37.7826         |
| 2.255         | 0.0028 | 222  | 37.7949         |
| 2.754         | 0.0029 | 224  | 37.8576         |
| 1.6294        | 0.0029 | 226  | 38.2263         |
| 1.8586        | 0.0029 | 228  | 38.4837         |
| 2.4252        | 0.0029 | 230  | 38.7646         |
| 2.36          | 0.0030 | 232  | 38.9834         |
| 1.4407        | 0.0030 | 234  | 39.1561         |
| 1.6109        | 0.0030 | 236  | 39.3041         |
| 2.2582        | 0.0030 | 238  | 39.3389         |
| 2.8185        | 0.0031 | 240  | 39.5245         |
| 1.6233        | 0.0031 | 242  | 39.3154         |
| 2.4039        | 0.0031 | 244  | 39.0988         |
| 1.7734        | 0.0031 | 246  | 39.0567         |
| 1.4779        | 0.0032 | 248  | 39.0881         |
| 2.7848        | 0.0032 | 250  | 38.9895         |
| 2.2963        | 0.0032 | 252  | 39.2507         |
| 2.0605        | 0.0032 | 254  | 39.3339         |
| 3.3667        | 0.0033 | 256  | 39.5060         |
| 2.9702        | 0.0033 | 258  | 39.5491         |
| 2.6734        | 0.0033 | 260  | 39.7907         |
| 2.4727        | 0.0033 | 262  | 40.1472         |
| 2.7539        | 0.0034 | 264  | 40.4749         |
| 1.601         | 0.0034 | 266  | 40.3649         |
| 2.1531        | 0.0034 | 268  | 40.2932         |
| 1.8656        | 0.0034 | 270  | 40.2728         |
| 1.9617        | 0.0035 | 272  | 40.3498         |
| 1.8911        | 0.0035 | 274  | 40.3157         |
| 2.3878        | 0.0035 | 276  | 40.2882         |
| 2.677         | 0.0035 | 278  | 40.4437         |
| 2.8035        | 0.0036 | 280  | 40.2423         |
| 1.7537        | 0.0036 | 282  | 40.0182         |
| 1.5873        | 0.0036 | 284  | 39.8449         |
| 1.7802        | 0.0036 | 286  | 39.7251         |
| 2.1861        | 0.0037 | 288  | 39.3972         |
| 1.9197        | 0.0037 | 290  | 39.4064         |
| 2.6752        | 0.0037 | 292  | 39.4320         |
| 1.7225        | 0.0037 | 294  | 39.4498         |
| 1.7274        | 0.0038 | 296  | 39.4309         |
| 3.9891        | 0.0038 | 298  | 40.1752         |
| 2.5153        | 0.0038 | 300  | 40.9025         |
| 2.0587        | 0.0038 | 302  | 41.4380         |
| 2.3115        | 0.0039 | 304  | 41.9152         |
| 1.8684        | 0.0039 | 306  | 42.4118         |
| 2.0388        | 0.0039 | 308  | 42.8904         |
| 2.9396        | 0.0040 | 310  | 43.0102         |
| 1.5832        | 0.0040 | 312  | 43.0678         |
| 1.897         | 0.0040 | 314  | 43.0292         |
| 2.2008        | 0.0040 | 316  | 43.0302         |
| 2.4185        | 0.0041 | 318  | 42.8252         |
| 1.9265        | 0.0041 | 320  | 42.5088         |
| 2.5759        | 0.0041 | 322  | 42.2636         |
| 2.9898        | 0.0041 | 324  | 42.1571         |
| 1.7106        | 0.0042 | 326  | 41.7366         |
| 2.3907        | 0.0042 | 328  | 41.3667         |
| 2.4861        | 0.0042 | 330  | 41.3056         |
| 1.6998        | 0.0042 | 332  | 41.2167         |
| 2.6034        | 0.0043 | 334  | 41.2615         |
| 1.6455        | 0.0043 | 336  | 41.2327         |
| 1.8484        | 0.0043 | 338  | 41.2317         |
| 2.2123        | 0.0043 | 340  | 41.2374         |
| 1.8939        | 0.0044 | 342  | 41.1753         |
| 1.881         | 0.0044 | 344  | 41.1000         |
| 1.5313        | 0.0044 | 346  | 40.9959         |
| 2.3099        | 0.0044 | 348  | 40.9817         |
| 2.2593        | 0.0045 | 350  | 40.9572         |
| 2.2597        | 0.0045 | 352  | 40.9278         |
| 2.1038        | 0.0045 | 354  | 40.8672         |
| 1.6107        | 0.0045 | 356  | 40.6815         |
| 2.0831        | 0.0046 | 358  | 40.5641         |
| 2.2921        | 0.0046 | 360  | 40.5117         |
| 2.3178        | 0.0046 | 362  | 40.5802         |
| 1.6295        | 0.0046 | 364  | 40.4780         |
| 2.038         | 0.0047 | 366  | 40.5544         |
| 1.7012        | 0.0047 | 368  | 40.7328         |
| 2.5292        | 0.0047 | 370  | 40.8337         |
| 1.8677        | 0.0047 | 372  | 40.9356         |
| 1.5897        | 0.0048 | 374  | 41.0250         |
| 1.5096        | 0.0048 | 376  | 41.0558         |
| 1.6413        | 0.0048 | 378  | 41.2060         |
| 1.6334        | 0.0048 | 380  | 41.2175         |
| 2.0367        | 0.0049 | 382  | 41.3215         |
| 1.9155        | 0.0049 | 384  | 41.4322         |
| 1.9553        | 0.0049 | 386  | 41.4096         |
| 2.3982        | 0.0049 | 388  | 41.3870         |
| 2.1094        | 0.0050 | 390  | 41.2572         |
| 1.9943        | 0.0050 | 392  | 41.1927         |
| 2.1017        | 0.0050 | 394  | 41.1805         |
| 1.8297        | 0.0050 | 396  | 41.0817         |
| 2.2271        | 0.0051 | 398  | 41.0460         |
| 2.022         | 0.0051 | 400  | 41.0754         |
| 1.8099        | 0.0051 | 402  | 41.0777         |
| 2.0973        | 0.0051 | 404  | 41.1348         |
| 2.03          | 0.0052 | 406  | 41.1109         |
| 1.7342        | 0.0052 | 408  | 41.1719         |
| 2.0422        | 0.0052 | 410  | 41.1616         |
| 2.6192        | 0.0052 | 412  | 41.0411         |
| 1.7107        | 0.0053 | 414  | 41.0704         |
| 2.8018        | 0.0053 | 416  | 41.0641         |
| 1.3767        | 0.0053 | 418  | 41.0719         |
| 1.9952        | 0.0054 | 420  | 41.0151         |
| 1.7584        | 0.0054 | 422  | 40.9978         |
| 2.1318        | 0.0054 | 424  | 40.9933         |
| 2.3412        | 0.0054 | 426  | 40.9837         |
| 1.6604        | 0.0055 | 428  | 41.0310         |
| 1.6301        | 0.0055 | 430  | 40.9782         |
| 2.0232        | 0.0055 | 432  | 40.9377         |
| 1.7096        | 0.0055 | 434  | 40.9645         |
| 2.1696        | 0.0056 | 436  | 40.9631         |
| 1.5297        | 0.0056 | 438  | 40.9690         |
| 1.4017        | 0.0056 | 440  | 41.0132         |
| 1.7817        | 0.0056 | 442  | 40.9486         |
| 1.7264        | 0.0057 | 444  | 40.9499         |
| 1.8601        | 0.0057 | 446  | 41.0064         |
| 1.9614        | 0.0057 | 448  | 41.0266         |
| 2.3045        | 0.0057 | 450  | 41.0035         |
| 2.67          | 0.0058 | 452  | 41.0159         |
| 1.5752        | 0.0058 | 454  | 40.9748         |
| 1.7464        | 0.0058 | 456  | 40.9395         |
| 1.9167        | 0.0058 | 458  | 40.9119         |
| 1.8777        | 0.0059 | 460  | 40.9021         |
| 1.5879        | 0.0059 | 462  | 40.9164         |
| 1.942         | 0.0059 | 464  | 40.8847         |
| 1.6303        | 0.0059 | 466  | 40.9104         |
| 2.1252        | 0.0060 | 468  | 40.9000         |
| 2.2879        | 0.0060 | 470  | 40.9209         |
| 1.7646        | 0.0060 | 472  | 40.8601         |
| 2.3169        | 0.0060 | 474  | 40.8726         |
| 1.7797        | 0.0061 | 476  | 40.8563         |
| 2.0428        | 0.0061 | 478  | 40.8609         |
| 2.4124        | 0.0061 | 480  | 40.8663         |
| 2.2955        | 0.0061 | 482  | 40.8601         |
| 1.3035        | 0.0062 | 484  | 40.8517         |
| 2.611         | 0.0062 | 486  | 40.8781         |
| 2.0677        | 0.0062 | 488  | 40.8694         |
| 2.1645        | 0.0062 | 490  | 40.8864         |
| 2.0708        | 0.0063 | 492  | 40.8633         |
| 1.663         | 0.0063 | 494  | 40.8689         |
| 1.9784        | 0.0063 | 496  | 40.8672         |
| 1.7215        | 0.0063 | 498  | 40.8439         |
| 2.2366        | 0.0064 | 500  | 40.8546         |


### Framework versions

- PEFT 0.12.0
- Transformers 4.44.0
- Pytorch 2.2.2+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1