unsloth-test
This model is a fine-tuned version of ybelkada/falcon-7b-sharded-bf16 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.1783
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 3407
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- training_steps: 1000
- mixed_precision_training: Native AMP
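
The listed values can be cross-checked with a little arithmetic. The sketch below is a minimal illustration, not the training script used for this model: it assumes a single device (the card does not list a device count) and assumes the `linear` scheduler ramps from 0 to the peak rate over the warmup steps, then decays linearly to 0 at the final step. The helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
PEAK_LR = 5e-05
WARMUP_STEPS = 10
TRAINING_STEPS = 1000
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay phase: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TRAINING_STEPS - step)
    return PEAK_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("total_train_batch_size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 5, 10, 505, 1000):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions, 2 examples per step times 4 accumulation steps gives the reported total_train_batch_size of 8, and the learning rate peaks at step 10 before decaying for the remaining 990 steps.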
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.9064 | 0.03 | 10 | 2.3285 |
1.7147 | 0.05 | 20 | 2.1141 |
1.9815 | 0.08 | 30 | 1.9912 |
1.6473 | 0.11 | 40 | 1.9045 |
1.8572 | 0.14 | 50 | 1.8468 |
1.4015 | 0.16 | 60 | 1.7899 |
1.3057 | 0.19 | 70 | 1.7523 |
1.4202 | 0.22 | 80 | 1.7084 |
1.2665 | 0.24 | 90 | 1.6817 |
1.1922 | 0.27 | 100 | 1.6452 |
1.0946 | 0.3 | 110 | 1.6227 |
1.1268 | 0.33 | 120 | 1.6033 |
1.4784 | 0.35 | 130 | 1.5873 |
1.2049 | 0.38 | 140 | 1.5733 |
1.1141 | 0.41 | 150 | 1.5400 |
1.2119 | 0.43 | 160 | 1.5317 |
1.5383 | 0.46 | 170 | 1.5195 |
1.308 | 0.49 | 180 | 1.5100 |
1.6075 | 0.52 | 190 | 1.4901 |
1.4443 | 0.54 | 200 | 1.4942 |
1.4493 | 0.57 | 210 | 1.4685 |
0.8844 | 0.6 | 220 | 1.4677 |
1.2082 | 0.62 | 230 | 1.4518 |
1.1994 | 0.65 | 240 | 1.4380 |
0.7472 | 0.68 | 250 | 1.4300 |
1.0516 | 0.71 | 260 | 1.4183 |
0.9373 | 0.73 | 270 | 1.4178 |
1.1221 | 0.76 | 280 | 1.4010 |
1.0978 | 0.79 | 290 | 1.3931 |
1.1696 | 0.82 | 300 | 1.3853 |
0.8722 | 0.84 | 310 | 1.3673 |
0.8707 | 0.87 | 320 | 1.3583 |
0.9933 | 0.9 | 330 | 1.3480 |
1.0988 | 0.92 | 340 | 1.3450 |
0.9995 | 0.95 | 350 | 1.3399 |
0.9277 | 0.98 | 360 | 1.3414 |
0.7538 | 1.01 | 370 | 1.3260 |
0.7933 | 1.03 | 380 | 1.3152 |
0.9362 | 1.06 | 390 | 1.3101 |
0.8345 | 1.09 | 400 | 1.3023 |
0.9271 | 1.11 | 410 | 1.3044 |
1.0953 | 1.14 | 420 | 1.3081 |
1.0863 | 1.17 | 430 | 1.2927 |
0.8999 | 1.2 | 440 | 1.2951 |
1.0146 | 1.22 | 450 | 1.2842 |
0.6808 | 1.25 | 460 | 1.2784 |
0.8005 | 1.28 | 470 | 1.2713 |
0.746 | 1.3 | 480 | 1.2642 |
0.9976 | 1.33 | 490 | 1.2661 |
0.7508 | 1.36 | 500 | 1.2539 |
0.8439 | 1.39 | 510 | 1.2462 |
0.7671 | 1.41 | 520 | 1.2465 |
0.7547 | 1.44 | 530 | 1.2421 |
0.6882 | 1.47 | 540 | 1.2426 |
1.0217 | 1.49 | 550 | 1.2381 |
0.6518 | 1.52 | 560 | 1.2324 |
0.9974 | 1.55 | 570 | 1.2324 |
0.7781 | 1.58 | 580 | 1.2292 |
0.7264 | 1.6 | 590 | 1.2296 |
0.9346 | 1.63 | 600 | 1.2255 |
0.7346 | 1.66 | 610 | 1.2223 |
0.9378 | 1.68 | 620 | 1.2164 |
0.8924 | 1.71 | 630 | 1.2176 |
1.2544 | 1.74 | 640 | 1.2157 |
0.7287 | 1.77 | 650 | 1.2155 |
0.7783 | 1.79 | 660 | 1.2069 |
0.8022 | 1.82 | 670 | 1.2049 |
0.9351 | 1.85 | 680 | 1.1969 |
0.891 | 1.88 | 690 | 1.2017 |
0.9973 | 1.9 | 700 | 1.2013 |
0.7192 | 1.93 | 710 | 1.2034 |
0.6488 | 1.96 | 720 | 1.1970 |
0.6643 | 1.98 | 730 | 1.1986 |
0.9502 | 2.01 | 740 | 1.1922 |
0.7626 | 2.04 | 750 | 1.1951 |
0.7211 | 2.07 | 760 | 1.1904 |
0.6385 | 2.09 | 770 | 1.1911 |
0.7917 | 2.12 | 780 | 1.1898 |
0.7153 | 2.15 | 790 | 1.1914 |
0.808 | 2.17 | 800 | 1.1869 |
0.6567 | 2.2 | 810 | 1.1870 |
0.5624 | 2.23 | 820 | 1.1909 |
0.7756 | 2.26 | 830 | 1.1869 |
0.8657 | 2.28 | 840 | 1.1864 |
1.0908 | 2.31 | 850 | 1.1866 |
0.9891 | 2.34 | 860 | 1.1835 |
0.7495 | 2.36 | 870 | 1.1784 |
0.7918 | 2.39 | 880 | 1.1809 |
0.8408 | 2.42 | 890 | 1.1808 |
0.9415 | 2.45 | 900 | 1.1807 |
0.8744 | 2.47 | 910 | 1.1793 |
0.6232 | 2.5 | 920 | 1.1784 |
0.7813 | 2.53 | 930 | 1.1797 |
0.959 | 2.55 | 940 | 1.1795 |
0.8303 | 2.58 | 950 | 1.1796 |
0.7675 | 2.61 | 960 | 1.1790 |
0.9452 | 2.64 | 970 | 1.1784 |
0.7509 | 2.66 | 980 | 1.1783 |
0.5698 | 2.69 | 990 | 1.1784 |
0.8727 | 2.72 | 1000 | 1.1783 |
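
The Epoch and Step columns also give a rough idea of the training set size, which the card otherwise leaves unstated. The sketch below is an estimate under stated assumptions, not a reported figure: it takes the last row (step 1000 at epoch 2.72), treats the epoch value as rounded to two decimals, and multiplies the resulting steps-per-epoch range by the total_train_batch_size of 8. The function name `steps_per_epoch_bounds` is hypothetical, and a partial final batch per epoch could shift the result slightly.

```python
# Values copied from the last row of the results table and the hyperparameter list.
FINAL_STEP = 1000
FINAL_EPOCH = 2.72  # shown rounded to two decimals
TOTAL_TRAIN_BATCH_SIZE = 8


def steps_per_epoch_bounds(step: int, epoch: float, half_width: float = 0.005):
    """Range of steps per epoch consistent with an epoch value rounded to two decimals."""
    low = step / (epoch + half_width)
    high = step / (epoch - half_width)
    return low, high


if __name__ == "__main__":
    low, high = steps_per_epoch_bounds(FINAL_STEP, FINAL_EPOCH)
    print(f"steps per epoch: between {low:.1f} and {high:.1f}")
    print(
        "approximate training examples: "
        f"between {low * TOTAL_TRAIN_BATCH_SIZE:.0f} and {high * TOTAL_TRAIN_BATCH_SIZE:.0f}"
    )
```

The earlier rows are consistent with this range: for example, step 370 divided by roughly 368 steps per epoch is about 1.01, matching the Epoch column.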
Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
Model tree for Treza12/unsloth-test
Base model
ybelkada/falcon-7b-sharded-bf16