avemio-digital committed on
Commit d8edc46 (verified), 1 Parent(s): d390516

Update README.md

Files changed (1)
  1. README.md +10 -9
README.md CHANGED
@@ -133,16 +133,17 @@ Four evaluation metrics were employed across all subsets: language quality, over
  - **Overall score:** This metric combined the results from the previous three metrics, offering a comprehensive evaluation of the model's capabilities across all subsets.


- | Metric | [Vanila-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | [GRAG-NEMO-SFT](https://huggingface.co/avemio/GRAG-NEMO-12B-SFT-HESSIAN-AI) | [GRAG-NEMO-ORPO](https://huggingface.co/avemio/GRAG-NEMO-12B-ORPO-HESSIAN-AI) | [GRAG-NEMO-MERGED]() | GPT-3.5-TURBO |
+ | Metric | [Vanila-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | **[GRAG-NEMO-SFT](https://huggingface.co/avemio/GRAG-NEMO-12B-SFT-HESSIAN-AI)** | [GRAG-NEMO-ORPO](https://huggingface.co/avemio/GRAG-NEMO-12B-ORPO-HESSIAN-AI) | [GRAG-NEMO-MERGED]() | GPT-3.5-TURBO |
  |------------------------------------------|---------------------------------------------------------------------------------|--------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|-----------------------------|----------------|
- | **Average_language_quality** | 85.88 | 89.61 | 89.1 | | |
- | **extraction_recall_weighted_overall_score** | 35.2 | 52.3 | 48.8 | | |
- | **qa_multiple_references_weighted_overall_score** | 65.3 | 71.0 | 74.0 | | |
- | **qa_without_time_difference_weighted_overall_score** | 71.5 | 85.6 | 85.6 | | |
- | **qa_with_time_difference_weighted_overall_score** | 65.3 | 87.9 | 85.4 | | |
- | **reasoning_weighted_overall_score** | 69.4 | 71.5 | 73.4 | | |
- | **relevant_context_weighted_overall_score** | 71.3 | 69.1 | 65.5 | | |
- | **summarizations_weighted_overall_score** | 73.8 | 81.6 | 80.3 | | |
+ | Average Language Quality | 85.88 | **89.61** | 89.1 | | |
+ | **OVERALL SCORES (weighted):** | | | | | |
+ | extraction_recall | 35.2 | **52.3** | 48.8 | | |
+ | qa_multiple_references | 65.3 | **71.0** | 74.0 | | |
+ | qa_without_time_difference | 71.5 | **85.6** | 85.6 | | |
+ | qa_with_time_difference | 65.3 | **87.9** | 85.4 | | |
+ | reasoning | 69.4 | **71.5** | 73.4 | | |
+ | relevant_context | 71.3 | **69.1** | 65.5 | | |
+ | summarizations | 73.8 | **81.6** | 80.3 | | |

  ## Model Details
