Update README.md
README.md
@@ -12,7 +12,7 @@ pipeline_tag: text-generation
 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) using [trl](https://github.com/huggingface/trl) on [ultrafeedback dataset](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
 
 # What's new
-A test for
+A test for [ORPO: Monolithic Preference Optimization without Reference Model](https://arxiv.org/pdf/2403.07691.pdf) method using trl library.
 
 ## How to reproduce
 ```bash