Using preference ratings, instead of critique scores, led to a new dataset where the chosen response is different in ~50% of the cases. Using this new dataset with DPO we fine-tuned Notus, a 7B model, that **surpasses Zephyr-7B-beta and Claude 2 on AlpacaEval**.

> **Important note**: While we opted for the average of ratings while we fix the original dataset, a very interesting open question remains: once the critique data is fixed, what works better, the critique scores or the preference ratings? We're very excited to run this comparison in the coming weeks, stay tuned!
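
To make that concrete, here is a minimal sketch of rating-based binarization, i.e. deriving `chosen`/`rejected` pairs from the per-aspect preference ratings instead of the critique's `overall_score`. The field names (`completions`, `annotations`, `Rating`, `response`) are assumptions based on the `openbmb/UltraFeedback` schema, and the snippet is an illustration rather than the exact curation script:

```python
# Illustrative sketch: binarize one UltraFeedback example by *average
# preference rating* instead of the critique's `overall_score`.
# Field names are assumed from the openbmb/UltraFeedback schema.

def avg_rating(completion: dict) -> float:
    """Average the per-aspect preference ratings for one completion."""
    ratings = [
        float(aspect["Rating"])
        for aspect in completion["annotations"].values()
        if str(aspect["Rating"]) != "N/A"  # some aspects may be unrated
    ]
    return sum(ratings) / len(ratings) if ratings else 0.0

def binarize_by_ratings(example: dict) -> dict:
    """Build a DPO-style pair: best response by average rating vs. the worst."""
    ranked = sorted(example["completions"], key=avg_rating, reverse=True)
    return {
        "prompt": example["instruction"],
        "chosen": ranked[0]["response"],
        "rejected": ranked[-1]["response"],  # one of several reasonable choices
    }
```

Ranking by average rating rather than `overall_score` is what changes the chosen response in ~50% of the cases, as noted above.
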

This model **wouldn't have been possible without the amazing [Alignment Handbook](https://github.com/huggingface/alignment-handbook) and [OpenBMB](https://www.openbmb.cn/home)'s release of the UltraFeedback dataset**, and it's based on fruitful discussions with the HuggingFace H4 team. In particular, we used `zephyr-7b-beta`'s recipe, which worked out of the box and enabled us to focus on what we do best: **high-quality data**.
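
For orientation, the DPO stage of that recipe reduces to something like the following TRL sketch. Treat the model id, hyperparameters, and column names as assumptions on our part; the authoritative configuration lives in the Alignment Handbook recipes:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Sketch only: ids, hyperparameters, and column names are assumptions.
model_id = "alignment-handbook/zephyr-7b-sft-full"  # SFT starting point
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Columns may need renaming to "prompt"/"chosen"/"rejected" for DPOTrainer.
train_dataset = load_dataset(
    "argilla/ultrafeedback-binarized-preferences", split="train"
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # TRL instantiates a frozen reference model when None
    args=TrainingArguments(
        output_dir="notus-7b-dpo",
        per_device_train_batch_size=2,  # assumed
        learning_rate=5e-7,             # assumed
        num_train_epochs=1,             # assumed
        bf16=True,
    ),
    beta=0.1,  # illustrative; see the recipe for the exact value
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```
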

Notus models are intended to be used as assistants via chat-like applications, and are evaluated with Chat (MT-Bench, AlpacaEval) and Academic (Open LLM Leaderboard) benchmarks for a direct comparison with the original Zephyr dDPO model and other 7B models.

### Training Data

We used a new curated version of [`openbmb/UltraFeedback`](https://huggingface.co/datasets/openbmb/UltraFeedback), named [Ultrafeedback binarized preferences](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences).
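
The curated dataset lives on the Hugging Face Hub, so it can be pulled directly with `datasets` (the `train` split name is an assumption; check the dataset card):

```python
from datasets import load_dataset

# Load the curated binarized-preferences dataset from the Hub.
ds = load_dataset("argilla/ultrafeedback-binarized-preferences", split="train")
print(ds)               # row count and features
print(ds.column_names)  # inspect the schema before mapping to DPO columns
```
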

TL;DR

After visually browsing through some examples using the sort and filter features of Argilla (sorting by highest rating for chosen responses), we noticed a strong mismatch between the `overall_score` in the original UF dataset (and the Zephyr train_prefs dataset) and the quality of the chosen response.

By adding the critique rationale to our Argilla dataset, **we confirmed that the critique rationale was highly negative whereas the rating was very high** (in most cases the highest: `10`).
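
This mismatch is easy to surface programmatically. The hypothetical check below flags completions whose critique `overall_score` is maximal while their average preference rating is low; the thresholds and field names are illustrative assumptions, not the criteria behind our ~2K estimate:

```python
from datasets import load_dataset

uf = load_dataset("openbmb/UltraFeedback", split="train")

def avg_rating(completion: dict) -> float:
    """Average per-aspect preference rating (same helper as above)."""
    ratings = [
        float(a["Rating"])
        for a in completion["annotations"].values()
        if str(a["Rating"]) != "N/A"
    ]
    return sum(ratings) / len(ratings) if ratings else 0.0

# Flag completions with a maximal critique score but low preference
# ratings -- the overall_score vs. rating mismatch described above.
suspicious = [
    (example["instruction"], completion["critique"])
    for example in uf
    for completion in example["completions"]
    if float(completion["overall_score"]) >= 10 and avg_rating(completion) <= 2
]
print(f"{len(suspicious)} potentially mislabeled completions")
```
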

See the screenshot below for one example of this issue.

After some quick investigation, we:

* identified hundreds of examples having the same issue,
* reported a bug on the [UltraFeedback repo](https://github.com/OpenBMB/UltraFeedback/issues/8),
* and informed the H4 team, which was incredibly responsive and ran an additional experiment to validate the new rating binarization approach.

While we work on fixing the original dataset (we've already narrowed down ~2K problematic examples), we decided to leverage the multi-preference ratings, leading to Notus!

![image/png](https://cdn-uploads.huggingface.co/production/uploads/60420dccc15e823a685f2b03/M9qCKyAB_G1MbVBAPeitd.png)

> **Important note**: While we opted for the average of ratings while we fix the dataset, there's still a very interesting open question: once the data is fixed, what works better, the critique scores or the preference ratings? We're very excited to run this comparison in the coming weeks, stay tuned!

You can find more details about the dataset analysis and curation on the [ultrafeedback-binarized-preferences dataset card](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences).

## Prompt template