Commit 2f8ec38 (verified) by lukeingawesome · 1 Parent(s): 1880e8e

Update README.md

Files changed (1)
  1. README.md +61 -19
README.md CHANGED
@@ -17,11 +17,15 @@ library_name: transformers
17
 
18
  # LLM2Vec4CXR - Fine-tuned Model for Chest X-ray Report Analysis
19
 
20
- This model is a fine-tuned version of [microsoft/LLM2CLIP-Llama-3.2-1B-Instruct-CC-Finetuned](https://huggingface.co/microsoft/LLM2CLIP-Llama-3.2-1B-Instruct-CC-Finetuned) specifically optimized for chest X-ray report analysis and medical text understanding.
21
 
22
  ## Model Description
23
 
24
- LLM2Vec4CXR is a bidirectional language model that converts the base decoder-only LLM into a text encoder optimized for medical text embeddings. The model has been fully fine-tuned with modified pooling strategy (`latent_attention`) to better capture semantic relationships in chest X-ray reports.
 
25
 
26
  ### Key Features
27
 
@@ -161,6 +165,47 @@ best_match = options[torch.argmax(scores)]
161
  print(f"Best match: {best_match} (score: {torch.max(scores):.4f})")
162
  ```
163
 
164
  ## API Reference
165
 
166
  The model provides several convenient methods:
@@ -172,19 +217,12 @@ The model provides several convenient methods:
172
  - **`compute_similarities(query_text, candidate_texts)`**: One-line similarity computation
173
  - **`from_pretrained(..., pooling_mode="latent_attention")`**: Automatic latent attention weight loading
174
 
175
- ### Migration from Manual Usage
176
 
177
- If you were previously using manual tokenization, you can now simply use:
178
 
179
- ```python
180
- # Old way (still works)
181
- tokenized = model.tokenizer(text, return_tensors="pt", ...)
182
- tokenized["embed_mask"] = tokenized["attention_mask"].clone()
183
- embeddings = model(tokenized)
184
-
185
- # New way (recommended)
186
- embeddings = model.encode_text([text])
187
- ```
188
 
189
  ## Evaluation
190
 
@@ -195,7 +233,11 @@ The model has been evaluated on chest X-ray report analysis tasks, particularly
195
 
196
  ### Sample Performance
197
 
198
- The model shows improved performance compared to the base model on medical text understanding tasks, particularly in distinguishing between different pleural effusion states and medical abbreviations.
199
 
200
  ## Intended Use
201
 
@@ -224,11 +266,11 @@ The model shows improved performance compared to the base model on medical text
224
  If you use this model in your research, please cite:
225
 
226
  ```bibtex
227
- @misc{llm2vec4cxr,
228
- title={LLM2Vec4CXR: Fine-tuned LLM for Chest X-ray Report Analysis},
229
- author={Hanbin Ko},
230
- year={2025},
231
- howpublished={\\url{https://huggingface.co/lukeingawesome/llm2vec4cxr}},
232
  }
233
  ```
234
 
 
17
 
18
  # LLM2Vec4CXR - Fine-tuned Model for Chest X-ray Report Analysis
19
 
20
+
21
+ LLM2Vec4CXR is optimized for chest X-ray report analysis and medical text understanding.
22
+ It is introduced in our paper [Exploring the Capabilities of LLM Encoders for Image–Text Retrieval in Chest X-rays](https://arxiv.org/pdf/2509.15234).
23
+
24
 
25
  ## Model Description
26
 
27
+ LLM2Vec4CXR is a **bidirectional text encoder** fine-tuned with a `latent_attention` pooling strategy.
28
+ This design enhances semantic representation of chest X-ray reports, improving performance on clinical text similarity, retrieval, and interpretation tasks.
29
 
30
  ### Key Features
31
 
 
165
  print(f"Best match: {best_match} (score: {torch.max(scores):.4f})")
166
  ```
167
 
168
+ To retrieve clinically similar reports instead:
169
+ ```python
170
+ import torch
171
+ from llm2vec_wrapper import LLM2VecWrapper as LLM2Vec
172
+
173
+ # Load model
174
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
175
+ model = LLM2Vec.from_pretrained(
176
+     base_model_name_or_path='lukeingawesome/llm2vec4cxr',
177
+     pooling_mode="latent_attention",
178
+     max_length=512,
179
+     enable_bidirectional=True,
180
+     torch_dtype=torch.bfloat16,
181
+     use_safetensors=True,
182
+ ).to(device).eval()
183
+
184
+ # Configure tokenizer
185
+ model.tokenizer.padding_side = 'left'
186
+
187
+ # Instruction for retrieval
188
+ instruction = 'Retrieve semantically similar sentences'
189
+ query_report = "There is a small LLLF PE with basal atelectasis."
190
+ query_text = instruction + '!@#$%^&*()' + query_report
191
+
192
+ # Candidate reports
193
+ candidate_reports = [
194
+     "No acute cardiopulmonary abnormality.",
195
+     "Small left pleural effusion is present.",
196
+     "Large right pleural effusion causing compressive atelectasis.",
197
+     "Heart size is normal with no evidence of pleural effusion.",
198
+     "There is left pleural effusion."
199
+ ]
200
+
201
+ # Compute similarity scores
202
+ scores = model.compute_similarities(query_text, candidate_reports)
203
+
204
+ # Retrieve the most similar report
205
+ best_match = candidate_reports[torch.argmax(scores)]
206
+ print(f"Most similar report: {best_match} (score: {torch.max(scores):.4f})")
207
+ ```
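
The retrieval example above joins an instruction and a report with the `'!@#$%^&*()'` separator. The sketch below shows one way such a query string could be split back into its two parts. It is an illustration only: `split_instruction` is a hypothetical helper, not part of `llm2vec_wrapper`, and this README does not say how the wrapper handles the separator internally.

```python
from __future__ import annotations

# Separator string used in the README example to join instruction and text.
SEPARATOR = "!@#$%^&*()"


def split_instruction(query: str, separator: str = SEPARATOR) -> tuple[str, str]:
    """Split 'instruction + separator + text' into (instruction, text).

    Hypothetical helper for illustration. If the separator is absent,
    the whole string is treated as text with an empty instruction.
    """
    instruction, found, text = query.partition(separator)
    if not found:
        return "", query.strip()
    return instruction.strip(), text.strip()


if __name__ == "__main__":
    query_text = (
        "Retrieve semantically similar sentences"
        + SEPARATOR
        + "There is a small LLLF PE with basal atelectasis."
    )
    instruction, report = split_instruction(query_text)
    print("instruction:", instruction)
    print("report:", report)
```

Keeping the separator in one named constant avoids typos when building many queries by hand.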
208
+
209
  ## API Reference
210
 
211
  The model provides several convenient methods:
 
217
  - **`compute_similarities(query_text, candidate_texts)`**: One-line similarity computation
218
  - **`from_pretrained(..., pooling_mode="latent_attention")`**: Automatic latent attention weight loading
219
 
 
220
 
221
+ 📄 **Related Papers**:
222
+ - [Exploring the Capabilities of LLM Encoders for Image–Text Retrieval in Chest X-rays](https://arxiv.org/pdf/2509.15234)
223
+ *Ko, Hanbin, et al. "Exploring the capabilities of LLM encoders for image–text retrieval in chest X-rays." arXiv preprint arXiv:2509.15234 (2025).*
224
+ - [LLM2CLIP4CXR](https://github.com/lukeingawesome/llm2clip4cxr): A CLIP-based model that leverages the LLM2Vec encoder to align visual and textual representations of chest X-rays.
225
 
226
 
227
  ## Evaluation
228
 
 
233
 
234
  ### Sample Performance
235
 
236
+ The model demonstrates consistent improvements over the base LLM2CLIP architecture on medical text understanding benchmarks.
237
+ In particular, **LLM2Vec4CXR** shows stronger performance in:
238
+ - Handling medical abbreviations and radiological terminology
239
+ - Capturing fine-grained semantic differences in chest X-ray reports
240
+
241
 
242
  ## Intended Use
243
 
 
266
  If you use this model in your research, please cite:
267
 
268
  ```bibtex
269
+ @article{ko2025exploring,
270
+ title={Exploring the Capabilities of LLM Encoders for Image--Text Retrieval in Chest X-rays},
271
+ author={Ko, Hanbin and Cho, Gihun and Baek, Inhyeok and Kim, Donguk and Koo, Joonbeom and Kim, Changi and Lee, Dongheon and Park, Chang Min},
272
+ journal={arXiv preprint arXiv:2509.15234},
273
+ year={2025}
274
  }
275
  ```
276