Model Details
Model Description
This model is a fine-tuned version of cambridgeltl/SapBERT-from-PubMedBERT-fulltext on the DDXPlus dataset (50,000 samples) for medical diagnosis tasks.
- Developed by: Aashish Acharya
- Model type: SapBERT (PubMedBERT-based encoder with a classification head)
- Language(s): English
- License: MIT
- Finetuned from model: cambridgeltl/SapBERT-from-PubMedBERT-fulltext
Model Sources
- Repository: acharya-jyu/sapbert-pubmedbert-ddxplus-10k
- Dataset: aai530-group6/ddxplus
Training Dataset
The model was trained on the DDXPlus dataset (50,000 samples) containing:
- Patient cases with comprehensive medical information
- Differential diagnosis annotations
- 49 distinct medical conditions
- Evidence-based symptom-condition relationships
Performance
Final Metrics
- Test Precision: 0.8159
- Test Recall: 0.7948
- Test F1 Score: 0.7420
Training Evolution
- Best Validation Loss: 1.4600
- Stopped at Epoch 4 (Early stopping triggered)
Intended Use
This model is designed for:
- Medical diagnosis support
- Symptom analysis
- Disease classification
- Differential diagnosis generation
Out-of-Scope Use
The model should NOT be used for:
- Direct medical diagnosis without professional oversight
- Critical healthcare decisions without human validation
- Clinical applications without proper testing and validation
Training Details
Training Procedure
- Optimizer: AdamW with weight decay (0.01)
- Learning Rate: 1e-5
- Loss Function: Combined loss (0.8 × Focal Loss + 0.2 × KL Divergence; see the sketch after this list)
- Batch Size: 32
- Gradient Clipping: 1.0
- Early Stopping: Patience of 3 epochs
- Training Strategy: Cross-validation with 5 folds
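For concreteness, here is a minimal sketch of how the combined objective could be implemented in PyTorch. Only the 0.8/0.2 weighting comes from this card; the focal-loss focusing parameter (`gamma=2.0`) and the use of a soft differential-diagnosis distribution as the KL target are assumptions.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    # Multi-class focal loss: down-weights easy examples by
    # (1 - p_t)^gamma before the cross-entropy term.
    # gamma=2.0 is a common default, assumed here.
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)  # probability of the true class
    return ((1 - p_t) ** gamma * ce).mean()

def combined_loss(logits, hard_targets, soft_targets, alpha=0.8):
    # 0.8 * Focal Loss on the hard labels + 0.2 * KL divergence between
    # the predicted distribution and a soft target distribution
    # (assumed: the differential-diagnosis probabilities).
    kl = F.kl_div(F.log_softmax(logits, dim=-1), soft_targets,
                  reduction="batchmean")
    return alpha * focal_loss(logits, hard_targets) + (1 - alpha) * kl
```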
Model Architecture
- Base Model: cambridgeltl/SapBERT-from-PubMedBERT-fulltext
- Hidden Size: 768
- Attention Heads: 12
- Dropout Rate: 0.5
- Added classification layers for diagnostic tasks
- Layer normalization and dropout for regularization (a sketch of this head follows the list)
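The exact head is not published with this card; a plausible sketch consistent with the listed components (768-d hidden states, layer normalization, 0.5 dropout, and a linear classifier over the 49 DDXPlus conditions) is:

```python
import torch.nn as nn
from transformers import AutoModel

class DiagnosisClassifier(nn.Module):
    # Hypothetical head matching the card's description: the SapBERT
    # encoder followed by layer norm, dropout, and a linear classifier
    # over the 49 DDXPlus conditions.
    def __init__(self,
                 base="cambridgeltl/SapBERT-from-PubMedBERT-fulltext",
                 num_labels=49):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(base)
        self.norm = nn.LayerNorm(768)    # hidden size of the base model
        self.dropout = nn.Dropout(0.5)   # dropout rate from the card
        self.classifier = nn.Linear(768, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask)
        cls = hidden.last_hidden_state[:, 0]  # [CLS] representation
        return self.classifier(self.dropout(self.norm(cls)))
```

Pooling via the [CLS] token is one common choice; mean pooling over token states would be an equally plausible alternative.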
Example Usage
```python
from transformers import AutoTokenizer, AutoModel

# Load model and tokenizer
model_name = "acharya-jyu/sapbert-pubmedbert-ddxplus-10k"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Example input structure
input_data = {
    'age': 45,                   # Patient age
    'sex': 'M',                  # Patient sex: 'M' or 'F'
    'initial_evidence': 'E_91',  # Initial evidence code (e.g., E_91 for fever)
    'evidences': [
        'E_91',  # Fever
        'E_77',  # Cough
        'E_89',  # Fatigue
    ],
}

# The encoder cannot consume this dict directly: the demographic fields
# and evidence codes must be serialized into a token sequence first.
# (The exact serialization used during fine-tuning is not documented;
# this flat string format is an assumption.)
text = (f"age: {input_data['age']} sex: {input_data['sex']} "
        f"evidences: {' '.join(input_data['evidences'])}")
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# `outputs` holds the encoder representations; the diagnostic head that
# produces the main diagnosis, differential-diagnosis probabilities, and
# confidence scores is applied on top of them.
```
Note: Evidence codes (E_XX) correspond to specific symptoms and conditions defined in the release_evidences.json file. The model expects these standardized codes rather than raw text input.
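Accordingly, a practical first step is to resolve evidence codes to their human-readable questions. The sketch below assumes release_evidences.json (shipped with the DDXPlus release) maps each evidence entry to an English question field (question_en); adjust the keys if the actual file layout differs.

```python
import json

# Hypothetical helper: resolve DDXPlus evidence codes to readable text.
with open("release_evidences.json") as f:
    evidences = json.load(f)

def describe(code):
    # Fall back to the raw code if no entry or question is found.
    return evidences.get(code, {}).get("question_en", code)

for code in ["E_91", "E_77", "E_89"]:
    print(code, "->", describe(code))
```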
Citation
```bibtex
@misc{acharya2024sapbert,
  title={SapBERT-PubMedBERT Fine-tuned on DDXPlus Dataset},
  author={Acharya, Aashish},
  year={2024},
  publisher={Hugging Face Model Hub}
}
```