Image Feature Extraction
Transformers
Safetensors
dinov2
Inference Endpoints
fepegar committed (verified)
Commit 7d8a1e4
Parent: 72881b8

Update references to the paper

Files changed (1)
  1. README.md +13 -31
README.md CHANGED
@@ -15,7 +15,7 @@ library_name: transformers
 
  RAD-DINO is a vision transformer model trained to encode chest X-rays using the self-supervised learning method [DINOv2](https://openreview.net/forum?id=a68SUt6zFt).
 
- RAD-DINO is described in detail in [RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision (F. Pérez-García, H. Sharma, S. Bond-Taylor, et al., 2024)](https://arxiv.org/abs/2401.10815).
+ RAD-DINO is described in detail in [Exploring Scalable Medical Image Encoders Beyond Text Supervision (F. Pérez-García, H. Sharma, S. Bond-Taylor, et al., 2024)](https://www.nature.com/articles/s42256-024-00965-w).
 
  - **Developed by:** Microsoft Health Futures
  - **Model type:** Vision transformer
@@ -151,7 +151,7 @@ We used 16 nodes with 4 A100 GPUs each, and a batch size of 40 images per GPU.
 
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
- We refer to the [manuscript](https://arxiv.org/abs/2401.10815) for a detailed description of the training procedure.
+ We refer to the [manuscript](https://www.nature.com/articles/s42256-024-00965-w) for a detailed description of the training procedure.
 
  #### Preprocessing
 
@@ -167,27 +167,7 @@ All DICOM files were resized using B-spline interpolation so that their shorter
 
  <!-- This section describes the evaluation protocols and provides the results. -->
 
- Our evaluation is best described in the [manuscript](https://arxiv.org/abs/2401.10815).
-
- <!-- ### Testing data, factors & metrics
-
- #### Testing Data
-
- [More Information Needed]
-
- #### Factors
-
- [More Information Needed]
-
- #### Metrics
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary -->
+ Our evaluation is best described in the [manuscript](https://www.nature.com/articles/s42256-024-00965-w).
 
  ## Environmental impact
 
@@ -226,19 +206,21 @@ We used [SimpleITK](https://simpleitk.org/) and [Pydicom](https://pydicom.github
  **BibTeX:**
 
  ```bibtex
- @misc{perezgarcia2024raddino,
- title={{RAD-DINO}: Exploring Scalable Medical Image Encoders Beyond Text Supervision},
- author={Fernando Pérez-García and Harshita Sharma and Sam Bond-Taylor and Kenza Bouzid and Valentina Salvatelli and Maximilian Ilse and Shruthi Bannur and Daniel C. Castro and Anton Schwaighofer and Matthew P. Lungren and Maria Wetscherek and Noel Codella and Stephanie L. Hyland and Javier Alvarez-Valle and Ozan Oktay},
- year={2024},
- eprint={2401.10815},
- archivePrefix={arXiv},
- primaryClass={cs.CV}
+ @article{perez-garcia_exploring_2025,
+ title = {Exploring scalable medical image encoders beyond text supervision},
+ issn = {2522-5839},
+ url = {https://doi.org/10.1038/s42256-024-00965-w},
+ doi = {10.1038/s42256-024-00965-w},
+ journal = {Nature Machine Intelligence},
+ author = {P{\'e}rez-Garc{\'i}a, Fernando and Sharma, Harshita and Bond-Taylor, Sam and Bouzid, Kenza and Salvatelli, Valentina and Ilse, Maximilian and Bannur, Shruthi and Castro, Daniel C. and Schwaighofer, Anton and Lungren, Matthew P. and Wetscherek, Maria Teodora and Codella, Noel and Hyland, Stephanie L. and Alvarez-Valle, Javier and Oktay, Ozan},
+ month = jan,
+ year = {2025},
  }
  ```
 
  **APA:**
 
- > Pérez-García, F., Sharma, H., Bond-Taylor, S., Bouzid, K., Salvatelli, V., Ilse, M., Bannur, S., Castro, D.C., Schwaighofer, A., Lungren, M.P., Wetscherek, M.T., Codella, N., Hyland, S.L., Alvarez-Valle, J., & Oktay, O. (2024). *RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision*. ArXiv, abs/2401.10815.
+ > Pérez-García, F., Sharma, H., Bond-Taylor, S., Bouzid, K., Salvatelli, V., Ilse, M., Bannur, S., Castro, D. C., Schwaighofer, A., Lungren, M. P., Wetscherek, M. T., Codella, N., Hyland, S. L., Alvarez-Valle, J., & Oktay, O. (2025). *Exploring scalable medical image encoders beyond text supervision*. In Nature Machine Intelligence. Springer Science and Business Media LLC. https://doi.org/10.1038/s42256-024-00965-w
 
  ## Model card contact
 
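The card touched by this commit describes RAD-DINO as an image encoder for chest X-rays. As a point of reference (not part of the diff), here is a minimal sketch of extracting image features from such a checkpoint with the standard Transformers API; the `microsoft/rad-dino` repository id and the dummy input image are assumptions for illustration only.

```python
# Minimal sketch, not part of this commit. Assumptions: the checkpoint is
# published as "microsoft/rad-dino" and exposes the standard DINOv2-style
# AutoModel / AutoImageProcessor interface.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

repo = "microsoft/rad-dino"  # assumed repository id
processor = AutoImageProcessor.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

# Dummy grayscale image standing in for a preprocessed chest X-ray.
image = Image.new("L", (518, 518)).convert("RGB")

inputs = processor(images=image, return_tensors="pt")
with torch.inference_mode():
    outputs = model(**inputs)

cls_embedding = outputs.pooler_output            # one global embedding per image
patch_tokens = outputs.last_hidden_state[:, 1:]  # per-patch features (CLS token dropped)
print(cls_embedding.shape, patch_tokens.shape)
```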