Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ pipeline_tag: text-generation
Introducing Llama-3-TenyxChat-70B, part of our TenyxChat series of models trained to function as useful assistants through preference tuning, using Tenyx's advanced fine-tuning technology ([VentureBeat article](https://venturebeat.com/ai/tenyx-aims-to-fix-llms-catastrophic-forgetting-problem/)). Our model is trained using the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) framework on the open-source AI feedback dataset [UltraFeedback](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized).
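
Under DPO, preference tuning reduces to a simple pairwise loss over the (chosen, rejected) completion pairs in UltraFeedback, anchored by a frozen reference model. A minimal PyTorch sketch of that objective, for illustration only (the function name and signature are ours, not Tenyx's training code):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Pairwise DPO objective (Rafailov et al., 2023).

    Each tensor holds the summed per-token log-probabilities of the chosen
    or rejected completion, under the trained policy or the frozen reference.
    """
    # Implicit rewards: how much more the policy prefers each completion
    # than the reference model does.
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    # Push the chosen completion's implicit reward above the rejected one's.
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```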
We fine-tune [Llama3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) with our proprietary approach, which yields an increase in [MT-Bench](https://arxiv.org/abs/2306.05685)* score without a drop in the model's performance on other benchmarks. Our approach aims to mitigate forgetting in LLMs in a computationally efficient manner, thereby enabling continual fine-tuning without altering the pre-trained output distribution.
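
A minimal inference sketch with 🤗 Transformers, assuming the model is published under the repo id `tenyx/Llama3-TenyxChat-70B` and ships with the standard Llama-3 chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tenyx/Llama3-TenyxChat-70B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize Direct Preference Optimization in two sentences."},
]
# Format the conversation with the model's built-in chat template.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that a 70B-parameter model in bfloat16 needs roughly 140 GB of accelerator memory; `device_map="auto"` shards it across the available GPUs.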