Update README.md
README.md
CHANGED
@@ -11,7 +11,9 @@ short_description: Reasoning + Deep Research + API(NVIDIA H100 GPU)
 models:
 - VIDraft/Gemma-3-R1984-27B
 ---
+
 FACTS Grounding Leaderboard - Medical AI Evaluation
+
 🏥 Overview
 FACTS Grounding is an AI reliability evaluation system developed by Google DeepMind that verifies whether AI responses are grounded solely in provided documents. This evaluation is particularly crucial in healthcare, where inaccurate information can be life-threatening.
 🎯 Key Features
@@ -25,14 +27,13 @@ Dual-Criteria Assessment
 ✅ Grounding Check: Are all responses based on the provided documents?
 
 
-
 Medical-Focused Version
 
 236 medical cases selected from 860 total problems
 Strict evaluation criteria reflecting healthcare field requirements
 
 🏆 Current Leaderboard Rankings (June 5, 2025)
-Overall Score TOP 5
+Overall Score TOP 5(https://huggingface.co/spaces/MaziyarPanahi/FACTS-Leaderboard)
 
 1. deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
 2. VIDraft/Gemma-3-R1984-27B