Commit d35f522 (parent: 687e701): Update README.md

README.md CHANGED
@@ -109,7 +109,7 @@ This fine-tuning approach allowed us to significantly reduce memory usage and co
 109  ## Evaluation results
 110
 111  To evaluate the performance of our model, we translated [70 questions](https://github.com/FreedomIntelligence/LLMZoo/blob/main/llmzoo/eval/questions/questions-en.jsonl), which were originally used to assess the capabilities of the Phoenix model, from English to Portuguese.
-112  We then conducted their [automatic evaluation](https://github.com/FreedomIntelligence/LLMZoo) using GPT-3.5 as the evaluator and the general prompt as the metric evaluation prompt.
+112  We then conducted their [automatic evaluation](https://github.com/FreedomIntelligence/LLMZoo/tree/main/llmzoo/eval) using GPT-3.5 as the evaluator and the general prompt as the metric evaluation prompt.
 113  This prompt was designed to elicit assessments of answers in terms of helpfulness, relevance, accuracy, and level of detail.
 114  [Additional prompts](https://github.com/FreedomIntelligence/LLMZoo/blob/main/llmzoo/eval/prompts/order/prompt_all.json) are provided for assessing overall performance on different perspectives.
 115
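The evaluation flow the updated section describes (load the question set, then ask a judge model such as GPT-3.5 to compare answers on helpfulness, relevance, accuracy, and level of detail) can be sketched as below. This is a minimal illustration, not LLMZoo's actual code: the JSONL loader is generic, and the prompt wording in `build_eval_prompt` is a hypothetical stand-in for the repository's "general prompt".

```python
import json

def load_questions(path):
    # Load evaluation questions from a JSONL file (one JSON object per line),
    # skipping blank lines. questions-en.jsonl in the linked repo uses this layout.
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def build_eval_prompt(question, answer_a, answer_b):
    # Build a pairwise-comparison prompt for the judge model in the spirit of
    # LLMZoo's automatic evaluation. The exact wording here is illustrative,
    # not the repository's actual prompt text.
    return (
        "You are a careful assistant checking the quality of two answers.\n"
        f"[Question]\n{question}\n"
        f"[Answer 1]\n{answer_a}\n"
        f"[Answer 2]\n{answer_b}\n"
        "Rate each answer for helpfulness, relevance, accuracy, and level of "
        "detail, then state which answer is better overall."
    )
```

Each built prompt would then be sent to the judge model (GPT-3.5 in the section above) and its verdicts aggregated over the 70 translated questions.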