Marco127 committed on
Commit 644a36f · verified · 1 Parent(s): 24734c1

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
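As a reading aid (not part of the commit): this pooling config selects CLS pooling, i.e. the sentence embedding is simply the first token's vector. A minimal plain-Python sketch of that reduction, assuming a list-of-vectors input:

```python
# Illustrative sketch only (not the library's implementation): with
# "pooling_mode_cls_token": true, the Pooling module reduces a sequence of
# per-token vectors to the vector of token 0, the [CLS] token.
def cls_pooling(token_embeddings):
    """token_embeddings: list of per-token vectors (len = sequence length)."""
    return token_embeddings[0]

tokens = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
print(cls_pooling(tokens))  # -> [0.1, 0.2, 0.3]
```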
README.md ADDED
@@ -0,0 +1,596 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:684
+ - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
+ widget:
+ - source_sentence: '
+
+     We request that guests report any complaints and defects to the hotel reception or hotel
+
+     management in person. Your complaints shall be attended to immediately.'
+   sentences:
+   - '
+
+     Animals may not be allowed onto beds or other furniture, which serves for
+
+     guests. It is not permitted to use baths, showers or washbasins for bathing or
+
+     washing animals.'
+   - '
+
+     We request that guests report any complaints and defects to the hotel reception or hotel
+
+     management in person. Your complaints shall be attended to immediately.'
+   - '
+
+     Guests who take accommodation after midnight, shall still pay the price for
+
+     accommodation for the whole of the preceding night. The hotel’s official Check-in time is
+
+     from 02:00 pm. For a possible early check-in, please consult with the reservation team, or
+
+     the reception in advance.'
+ - source_sentence: '
+
+     Hotel guests may receive visits in their hotel rooms from guests not staying in the hotel.
+
+     Visitors must present a personal document at the hotel reception and register in the visitors''
+
+     book. These visits can last for only a maximum of 2 hours and must finish until 10:00 pm.'
+   sentences:
+   - '
+
+     Hotel guests may receive visits in their hotel rooms from guests not staying in the hotel.
+
+     Visitors must present a personal document at the hotel reception and register in the visitors''
+
+     book. These visits can last for only a maximum of 2 hours and must finish until 10:00 pm.'
+   - ' If you do not want someone to enter
+
+     your room, please hang the "do not disturb” card on your room’s outside door handle. It can
+
+     be found in the entrance area of your room.'
+   - '
+
+     Hotel guests may receive visits in their hotel rooms from guests not staying in the hotel.
+
+     Visitors must present a personal document at the hotel reception and register in the visitors''
+
+     book. These visits can last for only a maximum of 2 hours and must finish until 10:00 pm.'
+ - source_sentence: '
+
+     Guests may not use their own electrical appliances in the hotel building except for those
+
+     serving for personal hygiene (electrical shavers or massaging machines, hairdryers etc.), or
+
+     personal computers and telephone chargers. The rooms own electrical devices shall only be
+
+     used according to their main purpose.'
+   sentences:
+   - '
+
+     Pets are allowed in the hotel restaurant only from 12:00, provided the
+
+     animal''s behavior and cleanliness are adequate and they do not disturb other
+
+     guests. '
+   - '
+
+     Guests may not use their own electrical appliances in the hotel building except for those
+
+     serving for personal hygiene (electrical shavers or massaging machines, hairdryers etc.), or
+
+     personal computers and telephone chargers. The rooms own electrical devices shall only be
+
+     used according to their main purpose.'
+   - ' For a possible late check-out please consult with the reception
+
+     in time, and upon availability we may grant a later check-out for a supplemental fee.'
+ - source_sentence: '
+
+     The hotel may provide accommodation only for guests who register in the regular
+
+     manner. For this purpose, the guest must present a personal document (citizen''s
+
+     identification card), or a valid passport to the receptionist. Accepting these Rules of the
+
+     House is also obligatory for the registration.'
+   sentences:
+   - '
+
+     Hotel guests are obliged to abide by the provisions of these hotel regulations. In the case of
+
+     serious violation, the reception or hotel management may withdraw from the contract on
+
+     accommodation services before the elapse of the agreed period.'
+   - '
+
+     Hotel guests are responsible for given room keys during their whole stay. In case of loss, the
+
+     guests are asked to inform reception staff immediately in order to prevent abusing the key.
+
+     Losing the room key will result in a penalty of 20 Eur, which is to be paid on the spot, at the
+
+     reception.'
+   - '
+
+     The hotel may provide accommodation only for guests who register in the regular
+
+     manner. For this purpose, the guest must present a personal document (citizen''s
+
+     identification card), or a valid passport to the receptionist. Accepting these Rules of the
+
+     House is also obligatory for the registration.'
+ - source_sentence: '
+
+     Guests are responsible for damages caused to hotel property according to the valid legal
+
+     prescriptions of Hungary.'
+   sentences:
+   - '
+
+     We shall be happy to listen to any suggestions for improvement of the accommodation
+
+     and catering services in the hotel. In case of any complaints we shall purposefully arrange
+
+     the rectification of any insufficiencies.'
+   - '
+
+     Guests are responsible for damages caused to hotel property according to the valid legal
+
+     prescriptions of Hungary.'
+   - '
+
+     Guests are responsible for damages caused to hotel property according to the valid legal
+
+     prescriptions of Hungary.'
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - dot_accuracy
+ - dot_accuracy_threshold
+ - dot_f1
+ - dot_f1_threshold
+ - dot_precision
+ - dot_recall
+ - dot_ap
+ - dot_mcc
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
+   results:
+   - task:
+       type: binary-classification
+       name: Binary Classification
+     dataset:
+       name: Unknown
+       type: unknown
+     metrics:
+     - type: dot_accuracy
+       value: 0.6549707602339181
+       name: Dot Accuracy
+     - type: dot_accuracy_threshold
+       value: 48.36168670654297
+       name: Dot Accuracy Threshold
+     - type: dot_f1
+       value: 0.5142857142857143
+       name: Dot F1
+     - type: dot_f1_threshold
+       value: 40.011634826660156
+       name: Dot F1 Threshold
+     - type: dot_precision
+       value: 0.36
+       name: Dot Precision
+     - type: dot_recall
+       value: 0.9
+       name: Dot Recall
+     - type: dot_ap
+       value: 0.3570718807651215
+       name: Dot Ap
+     - type: dot_mcc
+       value: 0.03879793956580217
+       name: Dot Mcc
+ ---
+
+ # SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) <!-- at revision 4633e80e17ea975bc090c97b049da26062b054d3 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Dot Product
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Marco127/Base_T")
+ # Run inference
+ sentences = [
+     '\nGuests are responsible for damages caused to hotel property according to the valid legal\nprescriptions of Hungary.',
+     '\nGuests are responsible for damages caused to hotel property according to the valid legal\nprescriptions of Hungary.',
+     '\nWe shall be happy to listen to any suggestions for improvement of the accommodation\nand catering services in the hotel. In case of any complaints we shall purposefully arrange\nthe rectification of any insufficiencies.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
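Because `similarity_fn_name` is `"dot"` for this model (see `config_sentence_transformers.json` in this commit), `model.similarity` returns pairwise dot products. A dependency-free sketch of that computation (the real method operates on tensors and returns one; `dot_similarity` is only an illustrative name):

```python
# Plain-Python equivalent of what model.similarity computes for this model:
# a matrix of pairwise dot products between the two embedding lists.
def dot_similarity(a, b):
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]

emb = [[1.0, 2.0], [3.0, 4.0]]
print(dot_similarity(emb, emb))  # -> [[5.0, 11.0], [11.0, 25.0]]
```

Note that, unlike cosine similarity, dot scores are unbounded, which is why the evaluation thresholds below sit around 40-48 rather than in [-1, 1].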
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Binary Classification
+
+ * Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+
+ | Metric                 | Value      |
+ |:-----------------------|:-----------|
+ | dot_accuracy           | 0.655      |
+ | dot_accuracy_threshold | 48.3617    |
+ | dot_f1                 | 0.5143     |
+ | dot_f1_threshold       | 40.0116    |
+ | dot_precision          | 0.36       |
+ | dot_recall             | 0.9        |
+ | **dot_ap**             | **0.3571** |
+ | dot_mcc                | 0.0388     |
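The two thresholds in the table are the dot-score cutoffs that maximized accuracy and F1, respectively, on the evaluation pairs. A sketch of how such a cutoff turns a similarity score into a binary prediction (`predict_pair` is an illustrative name, not a library API):

```python
# How the reported thresholds are used (sketch): a pair is labeled positive
# when the dot score of its two embeddings meets the chosen cutoff.
ACCURACY_THRESHOLD = 48.3617  # dot_accuracy_threshold from the table above

def predict_pair(score, threshold=ACCURACY_THRESHOLD):
    return int(score >= threshold)

print(predict_pair(50.2))  # -> 1
print(predict_pair(40.0))  # -> 0
```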
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 684 training samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+ * Approximate statistics based on the first 684 samples:
+   |         | sentence1 | sentence2 | label |
+   |:--------|:----------|:----------|:------|
+   | type    | string    | string    | int   |
+   | details | <ul><li>min: 17 tokens</li><li>mean: 42.77 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 42.77 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>0: ~67.11%</li><li>1: ~32.89%</li></ul> |
+ * Samples:
+   | sentence1 | sentence2 | label |
+   |:----------|:----------|:------|
+   | <code> If a guest fails to vacate<br>the room within the designated time, reception shall charge this guest for the following<br>night's accommodation fee.</code> | <code> If a guest fails to vacate<br>the room within the designated time, reception shall charge this guest for the following<br>night's accommodation fee.</code> | <code>0</code> |
+   | <code> If you do not want someone to enter<br>your room, please hang the "do not disturb” card on your room’s outside door handle. It can<br>be found in the entrance area of your room.</code> | <code> If you do not want someone to enter<br>your room, please hang the "do not disturb” card on your room’s outside door handle. It can<br>be found in the entrance area of your room.</code> | <code>0</code> |
+   | <code><br>Owners are responsible for ensuring that animals are kept quiet between the<br>hours of 10:00 pm and 06:00 am. In the case of failure to abide by this<br>regulation the guest may be asked to leave the hotel without a refund of the<br>price of the night's accommodation.</code> | <code><br>Owners are responsible for ensuring that animals are kept quiet between the<br>hours of 10:00 pm and 06:00 am. In the case of failure to abide by this<br>regulation the guest may be asked to leave the hotel without a refund of the<br>price of the night's accommodation.</code> | <code>0</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+
+ ### Evaluation Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 171 evaluation samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+ * Approximate statistics based on the first 171 samples:
+   |         | sentence1 | sentence2 | label |
+   |:--------|:----------|:----------|:------|
+   | type    | string    | string    | int   |
+   | details | <ul><li>min: 17 tokens</li><li>mean: 42.01 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 42.01 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>0: ~64.91%</li><li>1: ~35.09%</li></ul> |
+ * Samples:
+   | sentence1 | sentence2 | label |
+   |:----------|:----------|:------|
+   | <code><br>We shall be happy to listen to any suggestions for improvement of the accommodation<br>and catering services in the hotel. In case of any complaints we shall purposefully arrange<br>the rectification of any insufficiencies.</code> | <code><br>We shall be happy to listen to any suggestions for improvement of the accommodation<br>and catering services in the hotel. In case of any complaints we shall purposefully arrange<br>the rectification of any insufficiencies.</code> | <code>0</code> |
+   | <code><br>Between the hours of 10:00 pm and 06:00 am guests are obliged to maintain low noise<br>levels.</code> | <code><br>Between the hours of 10:00 pm and 06:00 am guests are obliged to maintain low noise<br>levels.</code> | <code>0</code> |
+   | <code><br>The hotel’s inner courtyard parking facility may be used only upon availability of parking<br>slots. Slots marked as ’Private’ are to be left free for their owners. For parking fees please<br>consult the reception or see the website of the hotel.</code> | <code><br>The hotel’s inner courtyard parking facility may be used only upon availability of parking<br>slots. Slots marked as ’Private’ are to be left free for their owners. For parking fees please<br>consult the reception or see the website of the hotel.</code> | <code>1</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
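For intuition, `MultipleNegativesRankingLoss` treats every other pair in the batch as a negative: row `i` of the in-batch similarity matrix is scaled by `scale` and scored with cross-entropy against column `i`. A simplified plain-Python sketch (the library implementation works on tensors and uses the configured `similarity_fct`):

```python
import math

# Simplified sketch of MultipleNegativesRankingLoss: for row i of the in-batch
# similarity matrix, the "correct" column is i; the loss is the mean
# cross-entropy of softmax(scale * similarities) against that index.
def mnr_loss(sim_matrix, scale=20.0):
    total = 0.0
    for i, row in enumerate(sim_matrix):
        logits = [scale * s for s in row]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(sim_matrix)

# Well-separated batch: positives score 1.0, negatives 0.0 -> loss near zero.
print(round(mnr_loss([[1.0, 0.0], [0.0, 1.0]]), 6))  # -> 0.0
```

This matches the near-zero training losses in the logs below once the (largely duplicated) pairs are separated.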
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 5
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 5
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch  | Step | Training Loss | Validation Loss | dot_ap |
+ |:------:|:----:|:-------------:|:---------------:|:------:|
+ | -1     | -1   | -             | -               | 0.3571 |
+ | 2.2791 | 100  | 0.0011        | 0.0000          | -      |
+ | 4.5581 | 200  | 0.0           | 0.0000          | -      |
+
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.48.3
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.3.0
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.3",
+   "vocab_size": 30527
+ }
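As a sanity check, the sizes in this config roughly account for the ~438 MB `model.safetensors` file added later in this commit. A back-of-the-envelope count (an approximation: biases, LayerNorm, and relative-attention buckets are ignored, so it slightly undercounts):

```python
# Rough parameter count from the config values above (standard transformer
# sizing; this is an estimate, not an exact tally of the checkpoint).
hidden, intermediate, layers = 768, 3072, 12
vocab, positions = 30527, 514

embedding = (vocab + positions) * hidden
attention = 4 * hidden * hidden      # Q, K, V, and output projections
ffn = 2 * hidden * intermediate      # up- and down-projection
total = embedding + layers * (attention + ffn)

print(f"~{total / 1e6:.0f}M parameters")        # ~109M parameters
print(f"~{total * 4 / 1e6:.0f} MB as float32")  # close to the 437,967,672-byte file
```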
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.48.3",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "dot"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dad03cbd630093647a3070ddf920f98114bc76b0b3b454142b8dcac4822490ae
+ size 437967672
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,73 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "max_length": 250,
+   "model_max_length": 512,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff