- Bounding Boxes (if applicable): Coordinates indicating the location of anatomical structures or abnormalities.

## Model Architecture

- Backbone: [LLaVA-OneVision-7B](https://huggingface.co/llava-hf/llava-onevision-qwen2-7b-si-hf), a vision-language model adapted for medical tasks.
- Vision Encoder: SigLIP, used for image feature extraction.
- Instruction Tuning: Fine-tuned with multi-task objectives, covering report generation, abnormality detection, and multi-turn Q&A.