---
language:
- en
license: mit
tags:
- sentence-similarity
base_model: BAAI/bge-small-en-v1.5
widget:
- source_sentence: >-
    Infections affecting the acromioclavicular joint can present with symptoms
    such as shoulder pain, swelling, and potential fever. Patients may
    experience systemic signs of infection, including chills and malaise,
    alongside localized symptoms related to joint movement.
  sentences:
  - >-
    Gastric stromal tumors with uncertain behavior can present with symptoms
    such as abdominal pain, nausea, and changes in appetite. Patients may
    experience psychological distress due to concerns about malignancy and the
    need for further evaluation. The presence of these tumors can lead to
    complications depending on their size and location.
  - 'Disease Name : Complication, umbrella device, vascular, embolism'
  - 'Disease Name : Infection, infected, infective, acromioclavicular'
- source_sentence: >-
    cyanosis: a bluish discoloration of the skin; fatigue: extreme tiredness;
    shortness of breath: difficulty in breathing; headache: pain in the head;
    confusion: difficulty in thinking clearly.
  sentences:
  - 'Disease Name : Anomaly, anomalous, spine, spinal NEC, column NEC, kyphosis'
  - 'Disease Name : Methemoglobinemia, Hb M disease'
  - >-
    Superficial insect bites on the finger can lead to localized swelling,
    redness, and itching. Patients may experience discomfort and potential
    allergic reactions, requiring symptomatic treatment.
- source_sentence: 'Disease Name : Excess, excessive, excessively, crying, in infant'
  sentences:
  - >-
    Excessive crying in infants can manifest as prolonged periods of distress,
    often accompanied by signs of discomfort such as clenching fists, arching of
    the back, and difficulty in soothing. The infant may exhibit a high-pitched
    cry and may seem inconsolable, leading to parental concern and anxiety about
    the child's well-being.
  - 'Disease Name : Syphilis, syphilitic, meningitis'
  - 'Disease Name : Prolapse, prolapsed, ileostomy bud'
- source_sentence: >-
    Cysts affecting the lacrimal passages or sac can lead to symptoms such as
    tearing, swelling, and discomfort in the inner corner of the eye. Patients
    may experience recurrent infections, redness, and a sensation of fullness,
    which can significantly impact tear drainage and overall eye health.
  sentences:
  - >-
    A condition characterized by the fusion of a single suture can lead to
    various physical deformities and developmental challenges. Symptoms may
    include cognitive impairment and various physical challenges that can impact
    daily life.
  - 'Disease Name : Cyst, Lacrimal, Passages or Sac'
  - >-
    This condition presents with swelling and pain in the area of the placenta,
    often accompanied by tenderness and discomfort. Patients may experience
    complications during pregnancy, including bleeding and changes in fetal
    movement. In some cases, there may be associated risks to both the mother
    and the fetus.
- source_sentence: 'Disease Name : Tuberculosis, tubercular, tuberculous, brain'
  sentences:
  - >-
    hoarseness: abnormal voice quality due to vocal cord dysfunction; difficulty
    breathing: shortness of breath or stridor from airway compression; loss of
    voice: inability to speak due to vocal cord paralysis; throat discomfort:
    pain or irritation in the throat area.
  - >-
    Symptoms include cough, wheezing, and difficulty breathing, often
    exacerbated by exposure to specific environmental triggers such as air
    conditioning systems. Patients may also experience fever and malaise, with a
    history of allergic reactions or asthma. The condition can lead to
    significant respiratory distress if not managed appropriately.
  - >-
    Symptoms often include headaches, confusion, and potential neurological
    deficits, along with systemic signs such as fever and fatigue. Patients may
    also experience seizures or altered consciousness, indicating a serious
    underlying condition.
pipeline_tag: sentence-similarity
datasets:
- SalmanFaroz/DisEmbed-Symptom-Disease-v1
---

# DisEmbed (Disease Embedding)
DisEmbed-v1 is a disease-focused embedding model for the medical domain, trained on a synthetic dataset of disease descriptions, symptoms, and Q&A pairs. It outperforms general medical models on disease-specific tasks, particularly in distinguishing similar diseases, and it excels at retrieval tasks and disease-context identification.


## Model Details

### Model Description
- **Dataset:** [DisEmbed-Symptom-Disease-v1](https://huggingface.co/datasets/SalmanFaroz/DisEmbed-Symptom-Disease-v1)
- **Paper:** [DisEmbed: Transforming Disease Understanding through Embeddings](https://arxiv.org/abs/2412.15258)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** en
- **License:** mit
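
For readers unfamiliar with the similarity function listed above, the following is a minimal sketch of cosine similarity in plain Python. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions; real DisEmbed embeddings have 384 dimensions, and this is not the library's internal implementation.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for 384-dimensional embeddings
# (hypothetical values, not produced by the model).
query = [1.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0]
orthogonal = [0.0, 3.0, 0.0]

print(cosine_similarity(query, same_direction))  # 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

Because the score depends only on direction and not on vector length, two embeddings that point the same way score 1.0 regardless of their magnitudes.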

![image/png](https://cdn-uploads.huggingface.co/production/uploads/631772607690c5b55e5b5edd/xqnT8KqYPmDIPTn08bksU.png)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("SalmanFaroz/DisEmbed-v1")
# Encode one symptom description and two candidate disease names
sentences = [
    'Chronic cough with blood-streaked sputum, severe night sweats, and unintentional weight loss. Painful breathing or chest pain, often worsened by coughing. Swelling in the neck or lymph nodes, and frequent fatigue.',
    'Asthma',
    'Tuberculosis'
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the pairwise similarity scores for the embeddings (a 3x3 matrix)
similarities = model.similarity(embeddings, embeddings)
print(similarities)
```
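
For retrieval, the similarity scores are typically turned into a ranking of candidates. The helper below is a minimal sketch of that step; `rank_by_similarity` is a hypothetical name, and the scores in the demo are placeholder values chosen for illustration, not output from the model.

```python
def rank_by_similarity(scores, labels, top_k=None):
    """Pair each label with its score and sort from most to least similar."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Placeholder scores for illustration only; they are NOT real model output.
# With the model loaded as above, one could instead use:
#   scores = similarities[0, 1:].tolist()
# which takes the symptom description (row 0) against the two disease names.
candidate_labels = ["Asthma", "Tuberculosis"]
placeholder_scores = [0.25, 0.75]

ranked = rank_by_similarity(placeholder_scores, candidate_labels)
for label, score in ranked:
    print(f"{label}: {score:.2f}")
```

Passing `top_k=1` returns only the best-matching candidate, which is the usual shape for a symptom-to-disease lookup.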


## Citation

```
@article{faroz2024disembed,
  title={DisEmbed: Transforming Disease Understanding through Embeddings},
  author={Faroz, Salman},
  journal={arXiv preprint arXiv:2412.15258},
  year={2024},
  doi={10.48550/arXiv.2412.15258},
  url={https://arxiv.org/abs/2412.15258}
}
```