Specialized vision-language models for clinical ophthalmology (specifically age-related macular degeneration, AMD, in retinal OCT)
See the paper: [https://arxiv.org/abs/2407.08410](https://arxiv.org/abs/2407.08410)
These versions of the model are not intended for clinical use; they were developed for research purposes only
These models use the 8-bit version of meta-llama/Meta-Llama-3-8B-Instruct
They were designed to accept a fovea-centered retinal OCT image of size 192×192 pixels, with physical pixel dimensions of 7.0 × 23.4 μm², acquired with a Topcon scanner
These models also accept an associated textual instruction and output a textual response
The results in the paper relate to these specifications, and performance cannot be guaranteed for other image types, sizes or anatomical locations in the retina
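
The input specification above can be checked before an image is passed to the model. The following is a minimal sketch, not part of the RetinaVLM code: the helper names `center_crop_box` and `physical_extent_mm` are hypothetical, and the crop assumes the fovea sits at the center of the source image. It does not resample, so the source scan must already have the stated pixel dimensions.

```python
# Values taken from the model card; the helper names are hypothetical.
EXPECTED_SIZE = 192               # pixels per side
PIXEL_SPACING_UM = (7.0, 23.4)    # physical pixel dimensions in micrometres


def center_crop_box(width, height, size=EXPECTED_SIZE):
    """Return (left, top, right, bottom) for a centered size x size crop.

    Assumes the fovea is at the image center (an assumption of this sketch).
    """
    if width < size or height < size:
        raise ValueError(f"image {width}x{height} is smaller than {size}x{size}")
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def physical_extent_mm(size=EXPECTED_SIZE, spacing_um=PIXEL_SPACING_UM):
    """Physical extent covered along each pixel spacing, in millimetres."""
    return tuple(size * s / 1000.0 for s in spacing_um)
```

The box returned by `center_crop_box` uses the same (left, top, right, bottom) ordering that common imaging libraries expect for a crop. `physical_extent_mm()` gives roughly 1.34 mm and 4.49 mm for a 192-pixel side; which image axis each spacing belongs to is not stated here, so the sketch leaves them unlabeled.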
To use RetinaVLM, first clone: [https://github.com/RobbieHolland/SpecialistVLMs](https://github.com/RobbieHolland/SpecialistVLMs)
Then run: `models/retinavlm_wrapper.py model=minigpt4 dataset/task=all pretrained_models=specialist_v5_192px`
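
For scripted use, the command above can be assembled as an argument list. This is only a sketch under two assumptions: that the wrapper is launched with the Python interpreter from the root of the cloned repository, and that the `key=value` arguments are passed through unchanged. `build_command` is a hypothetical helper, not something the repository provides.

```python
import sys

# Script path and key=value arguments copied from the command above.
SCRIPT = "models/retinavlm_wrapper.py"
OVERRIDES = {
    "model": "minigpt4",
    "dataset/task": "all",
    "pretrained_models": "specialist_v5_192px",
}


def build_command(overrides=OVERRIDES, python=sys.executable):
    """Assemble the argument list; assumes the script is run via Python."""
    return [python, SCRIPT] + [f"{key}={value}" for key, value in overrides.items()]
```

The resulting list could then be handed to `subprocess.run(cmd, cwd=<path to the cloned repository>, check=True)`, keeping the three settings in one place if they need to be changed.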