retinol committed
Commit 671f390 · verified · 1 Parent(s): 7216ca0

retinol/eva_chatbot

README.md CHANGED
@@ -18,18 +18,18 @@ should probably proofread and complete it, then remove this comment. -->
18
 
19
  This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
20
  It achieves the following results on the evaluation set:
21
- - Loss: 2.0725
22
- - Rewards/chosen: -0.1902
23
- - Rewards/rejected: -0.5952
24
- - Rewards/accuracies: 1.0
25
- - Rewards/margins: 0.4050
26
- - Logps/rejected: -5.9518
27
- - Logps/chosen: -1.9017
28
- - Logits/rejected: -1.4323
29
- - Logits/chosen: -1.4883
30
- - Nll Loss: 2.0422
31
- - Log Odds Ratio: -0.0333
32
- - Log Odds Chosen: 4.2383
33
 
34
  ## Model description
35
 
@@ -48,53 +48,53 @@ More information needed
48
  ### Training hyperparameters
49
 
50
  The following hyperparameters were used during training:
51
- - learning_rate: 1e-05
52
  - train_batch_size: 4
53
  - eval_batch_size: 4
54
  - seed: 42
55
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
56
  - lr_scheduler_type: linear
57
- - lr_scheduler_warmup_steps: 10
58
  - num_epochs: 10
59
 
60
  ### Training results
61
 
62
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
63
  |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
64
- | 2.3124 | 0.2907 | 50 | 2.2663 | -0.1983 | -0.2703 | 0.8500 | 0.0720 | -2.7034 | -1.9832 | -1.6636 | -1.8317 | 2.1942 | -0.4323 | 0.7993 |
65
- | 1.9504 | 0.5814 | 100 | 1.9383 | -0.1708 | -0.2955 | 0.9750 | 0.1247 | -2.9545 | -1.7079 | -1.5422 | -1.7126 | 1.8856 | -0.2704 | 1.3977 |
66
- | 1.8833 | 0.8721 | 150 | 1.8364 | -0.1607 | -0.3173 | 1.0 | 0.1567 | -3.1734 | -1.6065 | -1.5225 | -1.7038 | 1.7929 | -0.2020 | 1.7543 |
67
- | 1.8304 | 1.1628 | 200 | 1.7915 | -0.1564 | -0.3460 | 1.0 | 0.1897 | -3.4604 | -1.5637 | -1.4669 | -1.6282 | 1.7542 | -0.1538 | 2.1088 |
68
- | 1.743 | 1.4535 | 250 | 1.7650 | -0.1538 | -0.3608 | 1.0 | 0.2070 | -3.6080 | -1.5377 | -1.4113 | -1.5466 | 1.7299 | -0.1313 | 2.2959 |
69
- | 1.6422 | 1.7442 | 300 | 1.7436 | -0.1520 | -0.3730 | 1.0 | 0.2210 | -3.7301 | -1.5203 | -1.4048 | -1.5303 | 1.7095 | -0.1166 | 2.4441 |
70
- | 1.5855 | 2.0349 | 350 | 1.7229 | -0.1506 | -0.3913 | 1.0 | 0.2407 | -3.9129 | -1.5063 | -1.4023 | -1.5265 | 1.6905 | -0.0985 | 2.6505 |
71
- | 1.5748 | 2.3256 | 400 | 1.7099 | -0.1493 | -0.4025 | 1.0 | 0.2532 | -4.0251 | -1.4933 | -1.3636 | -1.4729 | 1.6792 | -0.0884 | 2.7823 |
72
- | 1.5417 | 2.6163 | 450 | 1.7075 | -0.1501 | -0.4202 | 1.0 | 0.2701 | -4.2022 | -1.5009 | -1.3669 | -1.4717 | 1.6771 | -0.0775 | 2.9538 |
73
- | 1.5158 | 2.9070 | 500 | 1.6933 | -0.1479 | -0.4170 | 1.0 | 0.2691 | -4.1697 | -1.4789 | -1.3712 | -1.4670 | 1.6635 | -0.0792 | 2.9477 |
74
- | 1.5251 | 3.1977 | 550 | 1.6969 | -0.1490 | -0.4329 | 1.0 | 0.2840 | -4.3294 | -1.4899 | -1.3649 | -1.4573 | 1.6683 | -0.0709 | 3.0973 |
75
- | 1.4585 | 3.4884 | 600 | 1.6985 | -0.1499 | -0.4445 | 1.0 | 0.2946 | -4.4454 | -1.4993 | -1.3604 | -1.4385 | 1.6704 | -0.0644 | 3.2043 |
76
- | 1.4254 | 3.7791 | 650 | 1.7069 | -0.1502 | -0.4481 | 1.0 | 0.2979 | -4.4811 | -1.5024 | -1.3604 | -1.4329 | 1.6786 | -0.0643 | 3.2363 |
77
- | 1.2609 | 4.0698 | 700 | 1.7151 | -0.1512 | -0.4698 | 1.0 | 0.3186 | -4.6985 | -1.5125 | -1.3449 | -1.3977 | 1.6867 | -0.0545 | 3.4459 |
78
- | 1.3129 | 4.3605 | 750 | 1.7098 | -0.1505 | -0.4701 | 1.0 | 0.3196 | -4.7009 | -1.5053 | -1.3549 | -1.4164 | 1.6819 | -0.0545 | 3.4573 |
79
- | 1.2809 | 4.6512 | 800 | 1.7384 | -0.1541 | -0.4886 | 1.0 | 0.3345 | -4.8862 | -1.5408 | -1.3756 | -1.4381 | 1.7109 | -0.0491 | 3.5996 |
80
- | 1.314 | 4.9419 | 850 | 1.7123 | -0.1520 | -0.4872 | 1.0 | 0.3352 | -4.8724 | -1.5202 | -1.3758 | -1.4322 | 1.6863 | -0.0479 | 3.6111 |
81
- | 1.1524 | 5.2326 | 900 | 1.7714 | -0.1582 | -0.5048 | 1.0 | 0.3466 | -5.0482 | -1.5822 | -1.3822 | -1.4473 | 1.7448 | -0.0459 | 3.7110 |
82
- | 1.1389 | 5.5233 | 950 | 1.7798 | -0.1591 | -0.5181 | 1.0 | 0.3590 | -5.1811 | -1.5909 | -1.3829 | -1.4497 | 1.7535 | -0.0416 | 3.8351 |
83
- | 1.2328 | 5.8140 | 1000 | 1.7625 | -0.1569 | -0.5122 | 1.0 | 0.3553 | -5.1217 | -1.5688 | -1.3805 | -1.4356 | 1.7363 | -0.0421 | 3.8023 |
84
- | 1.0486 | 6.1047 | 1050 | 1.8676 | -0.1681 | -0.5426 | 1.0 | 0.3745 | -5.4261 | -1.6814 | -1.3933 | -1.4544 | 1.8395 | -0.0389 | 3.9702 |
85
- | 1.0584 | 6.3953 | 1100 | 1.8792 | -0.1696 | -0.5453 | 1.0 | 0.3757 | -5.4527 | -1.6956 | -1.3983 | -1.4583 | 1.8516 | -0.0385 | 3.9793 |
86
- | 1.0821 | 6.6860 | 1150 | 1.8713 | -0.1695 | -0.5438 | 1.0 | 0.3743 | -5.4377 | -1.6947 | -1.3967 | -1.4577 | 1.8430 | -0.0395 | 3.9668 |
87
- | 1.0626 | 6.9767 | 1200 | 1.8489 | -0.1664 | -0.5382 | 1.0 | 0.3718 | -5.3818 | -1.6641 | -1.4108 | -1.4717 | 1.8216 | -0.0392 | 3.9481 |
88
- | 0.99 | 7.2674 | 1250 | 1.9337 | -0.1750 | -0.5580 | 1.0 | 0.3830 | -5.5805 | -1.7503 | -1.4138 | -1.4770 | 1.9057 | -0.0374 | 4.0435 |
89
- | 0.9057 | 7.5581 | 1300 | 1.9254 | -0.1744 | -0.5642 | 1.0 | 0.3898 | -5.6424 | -1.7443 | -1.4118 | -1.4734 | 1.8966 | -0.0352 | 4.1143 |
90
- | 0.9604 | 7.8488 | 1350 | 1.9253 | -0.1747 | -0.5618 | 1.0 | 0.3871 | -5.6180 | -1.7475 | -1.4168 | -1.4764 | 1.8969 | -0.0358 | 4.0853 |
91
- | 0.9344 | 8.1395 | 1400 | 1.9872 | -0.1820 | -0.5761 | 1.0 | 0.3941 | -5.7611 | -1.8198 | -1.4189 | -1.4773 | 1.9583 | -0.0353 | 4.1425 |
92
- | 0.9001 | 8.4302 | 1450 | 2.0156 | -0.1851 | -0.5843 | 1.0 | 0.3992 | -5.8432 | -1.8513 | -1.4197 | -1.4804 | 1.9864 | -0.0340 | 4.1883 |
93
- | 0.8909 | 8.7209 | 1500 | 1.9973 | -0.1820 | -0.5802 | 1.0 | 0.3982 | -5.8024 | -1.8201 | -1.4219 | -1.4762 | 1.9683 | -0.0340 | 4.1844 |
94
- | 0.8509 | 9.0116 | 1550 | 2.0277 | -0.1851 | -0.5851 | 1.0 | 0.4000 | -5.8514 | -1.8514 | -1.4280 | -1.4844 | 1.9981 | -0.0341 | 4.1966 |
95
- | 0.7776 | 9.3023 | 1600 | 2.0734 | -0.1902 | -0.5948 | 1.0 | 0.4046 | -5.9477 | -1.9019 | -1.4303 | -1.4866 | 2.0434 | -0.0335 | 4.2332 |
96
- | 0.8068 | 9.5930 | 1650 | 2.0621 | -0.1891 | -0.5935 | 1.0 | 0.4044 | -5.9349 | -1.8911 | -1.4306 | -1.4860 | 2.0323 | -0.0332 | 4.2336 |
97
- | 0.847 | 9.8837 | 1700 | 2.0725 | -0.1902 | -0.5952 | 1.0 | 0.4050 | -5.9518 | -1.9017 | -1.4323 | -1.4883 | 2.0422 | -0.0333 | 4.2383 |
98
 
99
 
100
  ### Framework versions
 
18
 
19
  This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
20
  It achieves the following results on the evaluation set:
21
+ - Loss: 2.2478
22
+ - Rewards/chosen: -0.2025
23
+ - Rewards/rejected: -0.2832
24
+ - Rewards/accuracies: 0.8875
25
+ - Rewards/margins: 0.0807
26
+ - Logps/rejected: -2.8322
27
+ - Logps/chosen: -2.0250
28
+ - Logits/rejected: -2.1129
29
+ - Logits/chosen: -1.7345
30
+ - Nll Loss: 2.2268
31
+ - Log Odds Ratio: -0.3839
32
+ - Log Odds Chosen: 0.8882
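The metric names above match what TRL's `ORPOTrainer` logs; the card does not actually name the trainer, so the following is a hedged reconstruction of how the reported quantities relate, with `beta = 0.1` inferred from each `Rewards/*` value being exactly 0.1 × the corresponding `Logps/*` value:

```python
import math

# Hedged reconstruction of the evaluation metrics above. The metric names
# match TRL's ORPOTrainer; the trainer itself is not named in the card, and
# beta = 0.1 is inferred from Rewards/* being exactly 0.1 x Logps/*.
beta = 0.1
logps_chosen, logps_rejected = -2.0250, -2.8322   # final eval means from the table

rewards_chosen = beta * logps_chosen                   # -0.2025
rewards_rejected = beta * logps_rejected               # -0.2832
rewards_margins = rewards_chosen - rewards_rejected    #  0.0807

def log_odds(logp: float) -> float:
    """log-odds of the average per-token probability exp(logp)."""
    return logp - math.log1p(-math.exp(logp))

# "Log Odds Chosen": log-odds difference between chosen and rejected (~0.888).
log_odds_chosen = log_odds(logps_chosen) - log_odds(logps_rejected)

# "Log Odds Ratio": mean of log(sigmoid(log-odds difference)) over examples.
# Computed from the table's means it comes out ~ -0.34 rather than -0.3839,
# because the reported value averages per example before the log-sigmoid.
log_odds_ratio = -math.log1p(math.exp(-log_odds_chosen))

print(rewards_margins, log_odds_chosen, log_odds_ratio)
```

Running this reproduces the margin and the Log Odds Chosen value to within rounding of the reported figures.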
33
 
34
  ## Model description
35
 
 
48
  ### Training hyperparameters
49
 
50
  The following hyperparameters were used during training:
51
+ - learning_rate: 1e-06
52
  - train_batch_size: 4
53
  - eval_batch_size: 4
54
  - seed: 42
55
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
56
  - lr_scheduler_type: linear
57
+ - lr_scheduler_warmup_steps: 50
58
  - num_epochs: 10
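Assuming the metrics above come from TRL's ORPO implementation (an assumption; the card only lists generic trainer hyperparameters), these settings map onto a config roughly as follows. `ORPOConfig`, the `output_dir` name, and `beta` are not stated in the card:

```python
# Minimal sketch, assuming TRL's ORPOTrainer/ORPOConfig was used; the card
# itself does not name the trainer, dataset, or output path.
from trl import ORPOConfig

args = ORPOConfig(
    output_dir="eva_chatbot",      # assumed name, not stated in the card
    learning_rate=1e-6,            # updated value in this commit (was 1e-5)
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=50,               # raised from 10 in the previous revision
    num_train_epochs=10,
    beta=0.1,                      # inferred from Rewards = 0.1 * Logps; not stated
)
```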
59
 
60
  ### Training results
61
 
62
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
63
  |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
64
+ | 5.5008 | 0.2907 | 50 | 5.6262 | -0.5231 | -0.6023 | 0.8250 | 0.0792 | -6.0233 | -5.2314 | -2.0311 | -1.8904 | 5.5816 | -0.4363 | 0.7951 |
65
+ | 4.92 | 0.5814 | 100 | 5.1023 | -0.4828 | -0.5584 | 0.8250 | 0.0756 | -5.5836 | -4.8278 | -2.1181 | -2.0055 | 5.0596 | -0.4441 | 0.7604 |
66
+ | 4.6969 | 0.8721 | 150 | 4.6774 | -0.4489 | -0.5171 | 0.8500 | 0.0682 | -5.1705 | -4.4885 | -2.1660 | -2.0410 | 4.6355 | -0.4630 | 0.6879 |
67
+ | 3.9492 | 1.1628 | 200 | 3.8213 | -0.3674 | -0.4438 | 0.875 | 0.0765 | -4.4384 | -3.6736 | -2.2855 | -1.9961 | 3.8167 | -0.4302 | 0.7799 |
68
+ | 3.45 | 1.4535 | 250 | 3.4864 | -0.3342 | -0.4227 | 0.9125 | 0.0885 | -4.2266 | -3.3420 | -2.2557 | -1.8804 | 3.4837 | -0.3910 | 0.9067 |
69
+ | 3.2561 | 1.7442 | 300 | 3.2679 | -0.3119 | -0.3956 | 0.9000 | 0.0837 | -3.9559 | -3.1191 | -2.2849 | -1.9045 | 3.2595 | -0.4022 | 0.8630 |
70
+ | 3.0471 | 2.0349 | 350 | 3.1300 | -0.3005 | -0.3768 | 0.9000 | 0.0763 | -3.7679 | -3.0046 | -2.2584 | -1.8626 | 3.1220 | -0.4214 | 0.7911 |
71
+ | 2.9312 | 2.3256 | 400 | 2.9729 | -0.2816 | -0.3469 | 0.875 | 0.0653 | -3.4686 | -2.8161 | -2.2750 | -1.8891 | 2.9539 | -0.4551 | 0.6823 |
72
+ | 2.6856 | 2.6163 | 450 | 2.8281 | -0.2630 | -0.3133 | 0.8375 | 0.0503 | -3.1333 | -2.6298 | -2.2692 | -1.8896 | 2.8010 | -0.5058 | 0.5330 |
73
+ | 2.7304 | 2.9070 | 500 | 2.7191 | -0.2493 | -0.2893 | 0.7875 | 0.0400 | -2.8928 | -2.4927 | -2.2573 | -1.8775 | 2.6907 | -0.5448 | 0.4286 |
74
+ | 2.6224 | 3.1977 | 550 | 2.6362 | -0.2406 | -0.2809 | 0.7750 | 0.0403 | -2.8089 | -2.4062 | -2.2342 | -1.8500 | 2.6066 | -0.5412 | 0.4341 |
75
+ | 2.5026 | 3.4884 | 600 | 2.5858 | -0.2354 | -0.2761 | 0.7750 | 0.0407 | -2.7606 | -2.3537 | -2.2217 | -1.8389 | 2.5555 | -0.5383 | 0.4406 |
76
+ | 2.6062 | 3.7791 | 650 | 2.5413 | -0.2315 | -0.2783 | 0.7875 | 0.0468 | -2.7833 | -2.3151 | -2.2000 | -1.8150 | 2.5111 | -0.5115 | 0.5079 |
77
+ | 2.3809 | 4.0698 | 700 | 2.4987 | -0.2264 | -0.2712 | 0.8000 | 0.0448 | -2.7123 | -2.2642 | -2.1931 | -1.8048 | 2.4689 | -0.5187 | 0.4884 |
78
+ | 2.4307 | 4.3605 | 750 | 2.4637 | -0.2232 | -0.2721 | 0.8000 | 0.0489 | -2.7213 | -2.2323 | -2.1814 | -1.7947 | 2.4350 | -0.5014 | 0.5339 |
79
+ | 2.4116 | 4.6512 | 800 | 2.4364 | -0.2203 | -0.2709 | 0.8000 | 0.0506 | -2.7095 | -2.2034 | -2.1728 | -1.7871 | 2.4081 | -0.4942 | 0.5536 |
80
+ | 2.3713 | 4.9419 | 850 | 2.4145 | -0.2180 | -0.2716 | 0.8125 | 0.0535 | -2.7157 | -2.1803 | -2.1681 | -1.7788 | 2.3873 | -0.4823 | 0.5863 |
81
+ | 2.3885 | 5.2326 | 900 | 2.3904 | -0.2160 | -0.2735 | 0.8250 | 0.0575 | -2.7352 | -2.1603 | -2.1621 | -1.7749 | 2.3630 | -0.4664 | 0.6301 |
82
+ | 2.3782 | 5.5233 | 950 | 2.3710 | -0.2141 | -0.2735 | 0.8250 | 0.0595 | -2.7355 | -2.1408 | -2.1522 | -1.7627 | 2.3448 | -0.4588 | 0.6524 |
83
+ | 2.2396 | 5.8140 | 1000 | 2.3565 | -0.2130 | -0.2767 | 0.8500 | 0.0637 | -2.7666 | -2.1295 | -2.1432 | -1.7523 | 2.3312 | -0.4429 | 0.6988 |
84
+ | 2.2947 | 6.1047 | 1050 | 2.3363 | -0.2109 | -0.2761 | 0.8625 | 0.0652 | -2.7607 | -2.1086 | -2.1430 | -1.7592 | 2.3118 | -0.4374 | 0.7162 |
85
+ | 2.2506 | 6.3953 | 1100 | 2.3212 | -0.2094 | -0.2765 | 0.8625 | 0.0671 | -2.7653 | -2.0941 | -2.1394 | -1.7585 | 2.2969 | -0.4304 | 0.7376 |
86
+ | 2.2421 | 6.6860 | 1150 | 2.3090 | -0.2084 | -0.2781 | 0.8625 | 0.0697 | -2.7808 | -2.0840 | -2.1324 | -1.7495 | 2.2853 | -0.4213 | 0.7657 |
87
+ | 2.2733 | 6.9767 | 1200 | 2.2972 | -0.2072 | -0.2788 | 0.875 | 0.0715 | -2.7878 | -2.0724 | -2.1276 | -1.7452 | 2.2739 | -0.4147 | 0.7865 |
88
+ | 2.269 | 7.2674 | 1250 | 2.2879 | -0.2064 | -0.2803 | 0.875 | 0.0738 | -2.8025 | -2.0641 | -2.1251 | -1.7449 | 2.2651 | -0.4067 | 0.8118 |
89
+ | 2.1922 | 7.5581 | 1300 | 2.2843 | -0.2056 | -0.2779 | 0.875 | 0.0723 | -2.7791 | -2.0565 | -2.1274 | -1.7480 | 2.2614 | -0.4121 | 0.7953 |
90
+ | 2.1969 | 7.8488 | 1350 | 2.2745 | -0.2050 | -0.2797 | 0.875 | 0.0748 | -2.7975 | -2.0497 | -2.1249 | -1.7453 | 2.2520 | -0.4034 | 0.8228 |
91
+ | 2.1968 | 8.1395 | 1400 | 2.2674 | -0.2043 | -0.2805 | 0.875 | 0.0762 | -2.8054 | -2.0433 | -2.1219 | -1.7424 | 2.2452 | -0.3987 | 0.8385 |
92
+ | 2.2984 | 8.4302 | 1450 | 2.2618 | -0.2038 | -0.2810 | 0.8875 | 0.0772 | -2.8104 | -2.0379 | -2.1210 | -1.7416 | 2.2398 | -0.3952 | 0.8501 |
93
+ | 2.2809 | 8.7209 | 1500 | 2.2636 | -0.2041 | -0.2852 | 0.9125 | 0.0811 | -2.8523 | -2.0408 | -2.1185 | -1.7341 | 2.2419 | -0.3823 | 0.8918 |
94
+ | 2.2605 | 9.0116 | 1550 | 2.2537 | -0.2032 | -0.2833 | 0.9000 | 0.0801 | -2.8331 | -2.0316 | -2.1153 | -1.7363 | 2.2324 | -0.3857 | 0.8816 |
95
+ | 2.1305 | 9.3023 | 1600 | 2.2505 | -0.2028 | -0.2832 | 0.9000 | 0.0804 | -2.8322 | -2.0279 | -2.1129 | -1.7336 | 2.2294 | -0.3849 | 0.8848 |
96
+ | 2.1614 | 9.5930 | 1650 | 2.2487 | -0.2026 | -0.2833 | 0.9000 | 0.0807 | -2.8330 | -2.0261 | -2.1129 | -1.7343 | 2.2276 | -0.3841 | 0.8878 |
97
+ | 2.1278 | 9.8837 | 1700 | 2.2478 | -0.2025 | -0.2832 | 0.8875 | 0.0807 | -2.8322 | -2.0250 | -2.1129 | -1.7345 | 2.2268 | -0.3839 | 0.8882 |
98
 
99
 
100
  ### Framework versions
adapter_config.json CHANGED
@@ -20,13 +20,13 @@
20
  "rank_pattern": {},
21
  "revision": null,
22
  "target_modules": [
23
- "down_proj",
24
  "o_proj",
25
  "gate_proj",
26
- "k_proj",
27
- "q_proj",
28
  "v_proj",
29
- "up_proj"
30
  ],
31
  "task_type": "CAUSAL_LM",
32
  "use_dora": false,
 
20
  "rank_pattern": {},
21
  "revision": null,
22
  "target_modules": [
23
  "o_proj",
24
  "gate_proj",
25
+ "down_proj",
26
  "v_proj",
27
+ "up_proj",
28
+ "k_proj",
29
+ "q_proj"
30
  ],
31
  "task_type": "CAUSAL_LM",
32
  "use_dora": false,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8443b3676dce0ed368a1bc690e1cdece111e6fbb203367507a6c39f640165e4e
3
  size 4370592096
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dabc021e1c33b21bc7a733c4df4db7018143994da14c7b282604432fc6d0de30
3
  size 4370592096
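The updated `adapter_model.safetensors` appears to be a PEFT adapter (it sits alongside `adapter_config.json`), so inference presumably means applying it on top of the base model named in the card. A minimal sketch, with dtype, device placement, and generation settings as illustrative assumptions:

```python
# Minimal loading sketch: apply the adapter in retinol/eva_chatbot on top of
# the base model named in the card. dtype/device/generation settings are
# illustrative assumptions, not taken from the repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = PeftModel.from_pretrained(base, "retinol/eva_chatbot")

inputs = tokenizer("Hello!", return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```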
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:602224a6e20580387fa40b8cc32c267efcde16427ba617474a0052e0b25aae7a
3
  size 5368
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:78e01d17c3fcac1d938e47d41fc9e34c6e736e6f6ec429450337fc6d9add6f12
3
  size 5368
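`training_args.bin` is the pickled training-arguments object that the `transformers` `Trainer` saves alongside a run. One way to inspect the updated hyperparameters directly (only unpickle files you trust):

```python
# Inspect the serialized training arguments shipped as training_args.bin.
# weights_only=False is needed on recent PyTorch because the file is a
# pickled TrainingArguments object, not a plain tensor checkpoint.
import torch
from huggingface_hub import hf_hub_download

path = hf_hub_download("retinol/eva_chatbot", "training_args.bin")
args = torch.load(path, weights_only=False)
print(args.learning_rate, args.warmup_steps, args.num_train_epochs)
```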