Update README.md
README.md
CHANGED
@@ -59,4 +59,15 @@ The following hyperparameters were used during training:
 - Transformers 4.36.2
 - Pytorch 2.1.2
 - Datasets 2.15.0
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
+
+### BibTex Citation
+If you would like to cite our paper when using the model, please use
+```
+@article{sun2024supervised,
+title={Supervised Fine-Tuning as Inverse Reinforcement Learning},
+author={Sun, Hao},
+journal={arXiv preprint arXiv:2403.12017},
+year={2024}
+}
+```