# Phi4-5.6B-transformers-ex1
This model is a fine-tuned version of [microsoft/Phi-4-multimodal-instruct](https://huggingface.co/microsoft/Phi-4-multimodal-instruct) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.4529
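
Below is a minimal loading sketch with `transformers`. It assumes the repo hosts full model weights (the card does not say whether it is an adapter), and the prompt follows the base model's `<|user|>...<|end|><|assistant|>` chat convention; the question itself is illustrative.

```python
# Minimal sketch, assuming the repo hosts full model weights (not stated on the card).
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "minhtien2405/Phi4-5.6B-transformers-ex1"

# Phi-4-multimodal ships custom modeling code, so trust_remote_code is required.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

# Text-only prompt in the base model's chat format; the question is illustrative.
prompt = "<|user|>Summarize the benefits of parameter-efficient fine-tuning.<|end|><|assistant|>"
inputs = processor(text=prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```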
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a matching `TrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: paged_adamw_8bit with betas=(0.9, 0.95) and epsilon=1e-07; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 10
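
For reproduction, this is a sketch of a `TrainingArguments` object mirroring the list above. The output directory and logging cadence are assumptions, and the 20-step eval interval is read off the results table below; the card does not include the actual training script.

```python
# Sketch of TrainingArguments mirroring the reported hyperparameters.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="phi4-finetune",      # hypothetical path, not the author's
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # total train batch size: 4
    num_train_epochs=10,
    lr_scheduler_type="linear",
    warmup_steps=50,
    optim="paged_adamw_8bit",        # requires bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-7,
    seed=42,
    eval_strategy="steps",
    eval_steps=20,                   # matches the 20-step eval interval in the table
    logging_steps=20,
)
```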
### Training results

Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.1653 | 0.0799 | 20 | 0.1542 |
0.1324 | 0.1598 | 40 | 0.1429 |
0.2598 | 0.2398 | 60 | 0.3326 |
0.1638 | 0.3197 | 80 | 0.1500 |
0.1499 | 0.3996 | 100 | 0.4031 |
0.15 | 0.4795 | 120 | 0.3213 |
0.1679 | 0.5594 | 140 | 0.1489 |
0.1431 | 0.6394 | 160 | 0.1531 |
0.1462 | 0.7193 | 180 | 0.1488 |
0.1464 | 0.7992 | 200 | 0.1485 |
0.1379 | 0.8791 | 220 | 0.1482 |
0.1414 | 0.9590 | 240 | 0.1567 |
0.1328 | 1.0360 | 260 | 0.1472 |
0.134 | 1.1159 | 280 | 0.1466 |
0.1415 | 1.1958 | 300 | 0.1447 |
0.141 | 1.2757 | 320 | 0.1470 |
0.1378 | 1.3556 | 340 | 0.1685 |
0.1425 | 1.4356 | 360 | 0.1560 |
0.1405 | 1.5155 | 380 | 0.1412 |
0.135 | 1.5954 | 400 | 0.1512 |
0.1359 | 1.6753 | 420 | 0.1410 |
0.1336 | 1.7552 | 440 | 0.1394 |
0.1317 | 1.8352 | 460 | 0.1408 |
0.1323 | 1.9151 | 480 | 0.1497 |
0.1349 | 1.9950 | 500 | 0.1387 |
0.1204 | 2.0719 | 520 | 0.1407 |
0.1286 | 2.1518 | 540 | 0.1399 |
0.1333 | 2.2318 | 560 | 0.1414 |
0.1315 | 2.3117 | 580 | 0.1398 |
0.1313 | 2.3916 | 600 | 0.1455 |
0.1308 | 2.4715 | 620 | 0.1377 |
0.1327 | 2.5514 | 640 | 0.1400 |
0.1324 | 2.6314 | 660 | 0.1370 |
0.1309 | 2.7113 | 680 | 0.1343 |
0.1274 | 2.7912 | 700 | 0.1384 |
0.1287 | 2.8711 | 720 | 0.1353 |
0.1285 | 2.9510 | 740 | 0.1341 |
0.1256 | 3.0280 | 760 | 0.1380 |
0.1256 | 3.1079 | 780 | 0.1340 |
0.1224 | 3.1878 | 800 | 0.1372 |
0.1244 | 3.2677 | 820 | 0.1358 |
0.1256 | 3.3477 | 840 | 0.1337 |
0.1229 | 3.4276 | 860 | 0.1336 |
0.1252 | 3.5075 | 880 | 0.1333 |
0.1234 | 3.5874 | 900 | 0.1360 |
0.1276 | 3.6673 | 920 | 0.1344 |
0.1258 | 3.7473 | 940 | 0.1327 |
0.1249 | 3.8272 | 960 | 0.1357 |
0.1273 | 3.9071 | 980 | 0.1346 |
0.1266 | 3.9870 | 1000 | 0.1356 |
0.1172 | 4.0639 | 1020 | 0.1413 |
0.1236 | 4.1439 | 1040 | 0.1396 |
0.1219 | 4.2238 | 1060 | 0.1368 |
0.1187 | 4.3037 | 1080 | 0.1399 |
0.1225 | 4.3836 | 1100 | 0.1387 |
0.1243 | 4.4635 | 1120 | 0.1370 |
0.1218 | 4.5435 | 1140 | 0.1360 |
0.1189 | 4.6234 | 1160 | 0.1325 |
0.1185 | 4.7033 | 1180 | 0.1373 |
0.1251 | 4.7832 | 1200 | 0.1352 |
0.1214 | 4.8631 | 1220 | 0.1333 |
0.1225 | 4.9431 | 1240 | 0.1339 |
0.1138 | 5.0200 | 1260 | 0.1348 |
0.1205 | 5.0999 | 1280 | 0.1415 |
0.1208 | 5.1798 | 1300 | 0.1434 |
0.1165 | 5.2597 | 1320 | 0.1415 |
0.1154 | 5.3397 | 1340 | 0.1392 |
0.1143 | 5.4196 | 1360 | 0.1442 |
0.1165 | 5.4995 | 1380 | 0.1397 |
0.1162 | 5.5794 | 1400 | 0.1414 |
0.1148 | 5.6593 | 1420 | 0.1389 |
0.1133 | 5.7393 | 1440 | 0.1391 |
0.1145 | 5.8192 | 1460 | 0.1393 |
0.1152 | 5.8991 | 1480 | 0.1397 |
0.113 | 5.9790 | 1500 | 0.1407 |
0.0993 | 6.0559 | 1520 | 0.1625 |
0.0962 | 6.1359 | 1540 | 0.1609 |
0.0995 | 6.2158 | 1560 | 0.1573 |
0.1028 | 6.2957 | 1580 | 0.1582 |
0.0983 | 6.3756 | 1600 | 0.1620 |
0.0989 | 6.4555 | 1620 | 0.1572 |
0.0987 | 6.5355 | 1640 | 0.1602 |
0.0992 | 6.6154 | 1660 | 0.1593 |
0.0997 | 6.6953 | 1680 | 0.1644 |
0.0967 | 6.7752 | 1700 | 0.1630 |
0.0988 | 6.8551 | 1720 | 0.1596 |
0.098 | 6.9351 | 1740 | 0.1605 |
0.0915 | 7.0120 | 1760 | 0.1662 |
0.0666 | 7.0919 | 1780 | 0.2258 |
0.0638 | 7.1718 | 1800 | 0.2135 |
0.0581 | 7.2517 | 1820 | 0.2290 |
0.065 | 7.3317 | 1840 | 0.2115 |
0.0611 | 7.4116 | 1860 | 0.2396 |
0.059 | 7.4915 | 1880 | 0.2205 |
0.0598 | 7.5714 | 1900 | 0.2314 |
0.0608 | 7.6513 | 1920 | 0.2309 |
0.063 | 7.7313 | 1940 | 0.2383 |
0.0621 | 7.8112 | 1960 | 0.2304 |
0.0586 | 7.8911 | 1980 | 0.2433 |
0.0622 | 7.9710 | 2000 | 0.2354 |
0.0369 | 8.0480 | 2020 | 0.3233 |
0.0246 | 8.1279 | 2040 | 0.3437 |
0.022 | 8.2078 | 2060 | 0.3361 |
0.0243 | 8.2877 | 2080 | 0.3413 |
0.0235 | 8.3676 | 2100 | 0.3458 |
0.0229 | 8.4476 | 2120 | 0.3473 |
0.0218 | 8.5275 | 2140 | 0.3523 |
0.0234 | 8.6074 | 2160 | 0.3610 |
0.0228 | 8.6873 | 2180 | 0.3496 |
0.0221 | 8.7672 | 2200 | 0.3519 |
0.0223 | 8.8472 | 2220 | 0.3515 |
0.0224 | 8.9271 | 2240 | 0.3514 |
0.0193 | 9.0040 | 2260 | 0.3542 |
0.0081 | 9.0839 | 2280 | 0.4155 |
0.0071 | 9.1638 | 2300 | 0.4363 |
0.0065 | 9.2438 | 2320 | 0.4446 |
0.0057 | 9.3237 | 2340 | 0.4485 |
0.0064 | 9.4036 | 2360 | 0.4495 |
0.0071 | 9.4835 | 2380 | 0.4502 |
0.0058 | 9.5634 | 2400 | 0.4518 |
0.0066 | 9.6434 | 2420 | 0.4530 |
0.0072 | 9.7233 | 2440 | 0.4535 |
0.0064 | 9.8032 | 2460 | 0.4532 |
0.0076 | 9.8831 | 2480 | 0.4533 |
0.0063 | 9.9630 | 2500 | 0.4529 |
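
Note that validation loss bottoms out at 0.1325 around step 1160 (epoch 4.6) and rises sharply over the last four epochs while training loss keeps falling, so the final checkpoint (0.4529) appears to overfit; an intermediate checkpoint is likely the stronger model. When re-running, one way to keep the best checkpoint automatically is sketched below (values other than the best-model flags are the same assumptions as above; `eval_loss` is the Trainer's default metric name, not confirmed by this card).

```python
# Sketch: retain the checkpoint with the lowest validation loss.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="phi4-finetune",       # hypothetical path
    eval_strategy="steps",
    eval_steps=20,
    save_strategy="steps",
    save_steps=20,                    # must align with eval_steps
    load_best_model_at_end=True,      # reload the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    save_total_limit=2,               # keep disk usage bounded
)
```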
### Framework versions

- Transformers 4.48.2
- PyTorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1
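
To reproduce this environment, pinning the listed versions (for example `pip install transformers==4.48.2 datasets==3.4.1 tokenizers==0.21.1`, plus a CUDA 12.4 build of PyTorch 2.6.0) is the safest starting point.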