Upload 11 files
Uncased model trained for 4 epochs on PA_ARXIV, PA_BOOKS, and PA_JACOW, with all equations, tables, special symbols, and numbers removed from the training text.
The preprocessing considerably improves results: the model is now roughly on par with general-purpose sentence-transformers models overall, with minor improvements on tokens specific to the particle accelerator (PA) community, such as BPM (beam position monitor).
- README.md +5 -15
- model.safetensors +1 -1
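The removed README section in the diff below credits this cleanup to a `prepare_mmd_eqations_and_tables_for_simcse` function in PA_LOGBOOKS/code/mmd.py, which is not part of this upload. A minimal sketch of that kind of preprocessing, assuming MMD input with LaTeX-style equations and Markdown tables (the regexes are illustrative, not the actual implementation):

```python
import re

def strip_equations_tables_and_numbers(mmd: str) -> str:
    """Hypothetical sketch of the cleanup described in the commit message:
    drop equations, tables, MMD headings, numbers and special symbols.
    The actual prepare_mmd_eqations_and_tables_for_simcse function in
    PA_LOGBOOKS/code/mmd.py is not public, so details will differ."""
    # Remove display and inline LaTeX-style equations.
    mmd = re.sub(r"\\\[.*?\\\]|\\\(.*?\\\)|\$\$.*?\$\$|\$[^$]*\$", " ", mmd, flags=re.S)
    # Remove Markdown table rows (lines made of | -delimited cells).
    mmd = re.sub(r"^\s*\|.*\|\s*$", " ", mmd, flags=re.M)
    # Remove MMD headings (lines starting with #).
    mmd = re.sub(r"^#+.*$", " ", mmd, flags=re.M)
    # Remove numbers, then any remaining special symbols.
    mmd = re.sub(r"[0-9]+", " ", mmd)
    mmd = re.sub(r"[^A-Za-z\s.,;:()'-]", " ", mmd)
    # Collapse whitespace left behind by the deletions.
    return re.sub(r"\s+", " ", mmd).strip()
```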
README.md
CHANGED
@@ -9,16 +9,11 @@ tags:

 ---

-#
+# {MODEL_NAME}

 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.

-
-It was trained on a large corpus of scientific literature and papers related to particle accelerators.
-
-This fine-tuned embedding can be used as input to downstream natural language processing tasks relevant to particle accelerator research and operations,
-such as information retrieval from logbooks.
-
+<!--- Describe your model here -->

 ## Usage (Sentence-Transformers)

@@ -89,12 +84,9 @@ For an automated evaluation of this model, see the *Sentence Embeddings Benchmar
 ## Training
 The model was trained with the parameters:

-### Dataset
-The dataset used are PA_JACOW+PA_BOOKS+PA_ARXIV. Equations, tables, MMD headings (\#), numbers and any special symbols are removed from training input data (see prepare_mmd_eqations_and_tables_for_simcse function in PA_LOGBOOKS/code/mmd.py)
-
 **DataLoader**:

-`torch.utils.data.dataloader.DataLoader` of length
+`torch.utils.data.dataloader.DataLoader` of length 25444 with parameters:
 ```
 {'batch_size': 16, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
 ```
@@ -106,8 +98,6 @@ The dataset used are PA_JACOW+PA_BOOKS+PA_ARXIV. Equations, tables, MMD headings
 {'scale': 20.0, 'similarity_fct': 'cos_sim'}
 ```

-The scaling parameter is based on the paper's suggestion (cos_sim(a,b) / 0.05).
-
 Parameters of the fit()-Method:
 ```
 {
@@ -121,8 +111,8 @@ Parameters of the fit()-Method:
 },
 "scheduler": "WarmupLinear",
 "steps_per_epoch": null,
-"warmup_steps":
-"weight_decay": 0.
+"warmup_steps": 0.0,
+"weight_decay": 0.01
 }
 ```
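Read together, the parameters recorded in the README diff above describe a standard sentence-transformers fine-tuning run: a DataLoader with batch size 16 and a random sampler, MultipleNegativesRankingLoss with cosine similarity at scale 20.0 (the inverse of the SimCSE paper's 0.05 temperature, since cos_sim(a, b) / 0.05 = 20 * cos_sim(a, b)), and fit() with a WarmupLinear scheduler, zero warmup steps, and weight decay 0.01. A minimal sketch of how those values plug together; the base checkpoint, the example sentences, and the dropout-based SimCSE pairing are assumptions, not taken from the diff:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util

# Placeholder base checkpoint and training pairs; the diff records only
# the hyperparameters, not the data pipeline. Passing the same text twice
# is the unsupervised SimCSE setup, where dropout provides the two views.
model = SentenceTransformer("bert-base-uncased")
sentences = ["the beam position monitor drifted", "quadrupole magnet alignment"]
train_examples = [InputExample(texts=[s, s]) for s in sentences]

# batch_size 16 with a RandomSampler, matching the recorded DataLoader config.
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# scale=20.0 with cosine similarity, i.e. the 0.05 temperature from the paper.
train_loss = losses.MultipleNegativesRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)

# Scheduler, warmup and weight decay match the fit() parameters in the README;
# epochs=4 comes from the commit message.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    scheduler="WarmupLinear",
    steps_per_epoch=None,
    warmup_steps=0,
    weight_decay=0.01,
)

# The fine-tuned model maps text to 768-dimensional vectors, e.g. for retrieval.
embeddings = model.encode(["calibrate the BPM readings"])
```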
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ff2d9a7b55a2a8465c79a0f39a49e753080836960026685a3abdd8fbeb16fa25
 size 439776096
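model.safetensors is tracked with Git LFS, so the diff above covers only the three-line pointer file: the spec version, the SHA-256 oid of the stored object, and its size in bytes. The weights themselves changed while the size (439776096 bytes) stayed the same. If you want to check a downloaded weight file against the pointer, the oid is just the SHA-256 of the file contents; a small sketch, with the local path as a placeholder:

```python
import hashlib

def lfs_oid(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of the file contents, as recorded in the LFS pointer's oid field."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder path; compare against the oid from the pointer above.
oid = lfs_oid("model.safetensors")
assert oid == "ff2d9a7b55a2a8465c79a0f39a49e753080836960026685a3abdd8fbeb16fa25"
```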