tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:672
  - loss:ContrastiveLoss
base_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
widget:
  - source_sentence: >-

      Animals may not be allowed onto beds or other furniture, which serves for

      guests. It is not permitted to use baths, showers or washbasins for
      bathing or

      washing animals.
    sentences:
      - >-

        Please advise of any special needs such as high-chairs and sleeping
        cots.
      - >-

        Animals may not be allowed onto beds or other furniture, which serves
        for

        guests. It is not permitted to use baths, showers or washbasins for
        bathing or

        washing animals.
      - >-

        It is strongly advised that you arrange adequate insurance cover such as
        cancellation due to illness,

        accident or injury, personal accident and personal liability, loss of or
        damage to baggage and sport

        equipment (Note that is not an exhaustive list). We will not be
        responsible or liable if you fail to take

        adequate insurance cover or none at all.
  - source_sentence: >-
      Owners are responsible for ensuring that animals are kept quiet between
      the

      hours of 10:00 pm and 06:00 am. In the case of failure to abide by this

      regulation the guest may be asked to leave the hotel without a refund of
      the

      price of the night's accommodation.
    sentences:
      - >-

        Visitors are not allowed in the rooms and must be entertained in the
        lounges and/or other public areas

        provided.
      - >-
        To ensure the safety and comfort of everyone in the hotel, the
        Management

        reserves the right to terminate the accommodation of guests who fail to
        comply

        with the following rules and regulations.
      - >-
        Owners are responsible for ensuring that animals are kept quiet between
        the

        hours of 10:00 pm and 06:00 am. In the case of failure to abide by this

        regulation the guest may be asked to leave the hotel without a refund of
        the

        price of the night's accommodation.
  - source_sentence: >-

      We ask all guests to behave in such a way that they do not disturb other
      guests and the neighborhood.

      The hotel staff is authorized to refuse services to a person who violates
      this rule.
    sentences:
      - >-

        Please take note of the limitation specified for the room you have
        booked.

        If such number is exceeded, whether temporarily or over-night, we
        reserve the right to do one or more of

        the following: cancel your booking; retain all the monies you've paid;
        request you to vacate your room(s)

        forthwith, charge a higher rate for the room or recover all monies due.
      - >-

        We ask all guests to behave in such a way that they do not disturb other
        guests and the neighborhood.

        The hotel staff is authorized to refuse services to a person who
        violates this rule.
      - >-
        We will only deal with your information as indicated in the
        booking/reservation and we will only process your

        personal information (both terms as defined in the Protection of
        Personal Information Act, act 4 of 2013 ['the

        POPIA'] and the European Union General Data Protection Regulation 
        ('GDPR') and any Special Personal

        Information (as defined in the GDPR & POPIA), which processing includes
        amongst others the 'collecting,

        storing and dissemination' of your personal information (as defined in
        GDPR & POPIA).
  - source_sentence: >-

      All articles stored in the luggage storage room are received at the
      owner’s own risk.
    sentences:
      - >2-

         Unregistered visitors are not permitted to enter guest rooms or other areas of
        the hotel. An additional fee for unregistered guests will be charged to
        the

        account of the guest(s) registered to the room.
      - >-
        Please advise us if you anticipate arriving late as bookings will be
        cancelled by 17:00 on the day of arrival,

        unless we have been so notified.
      - >-

        All articles stored in the luggage storage room are received at the
        owner’s own risk.
  - source_sentence: >2-
       In the event of a disturbance, one polite request (warning) will
      be given to reduce the noise. If our request is not followed, the guest
      will be asked to leave

      the hotel without refund and may be charged Guest Compensation Disturbance
      Fee.
    sentences:
      - >-

        Without limiting the generality of the aforementioned, it applies to
        pay-to-view TV programmes or videos, as

        well as telephone calls or any other expenses of a similar nature that
        is made from your room, you will be

        deemed to be the contracting party.
      - >-
        Pets are not allowed in the restaurant during breakfast time

        (7:00  10:30) for hygienic reasons due to the breakfast’s buffet style.
        An

        exception is the case when the hotel terrace is open, as pets can be
        taken to

        the terrace through the hotel's main entrance and they can stay there
        during

        breakfast.
      - >2-
         In the event of a disturbance, one polite request (warning) will
        be given to reduce the noise. If our request is not followed, the guest
        will be asked to leave

        the hotel without refund and may be charged Guest Compensation
        Disturbance Fee.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - dot_accuracy
  - dot_accuracy_threshold
  - dot_f1
  - dot_f1_threshold
  - dot_precision
  - dot_recall
  - dot_ap
  - dot_mcc
model-index:
  - name: >-
      SentenceTransformer based on
      sentence-transformers/multi-qa-mpnet-base-dot-v1
    results:
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: dot_accuracy
            value: 0.6745562130177515
            name: Dot Accuracy
          - type: dot_accuracy_threshold
            value: 49.0201301574707
            name: Dot Accuracy Threshold
          - type: dot_f1
            value: 0.4932735426008969
            name: Dot F1
          - type: dot_f1_threshold
            value: 35.02415466308594
            name: Dot F1 Threshold
          - type: dot_precision
            value: 0.32934131736526945
            name: Dot Precision
          - type: dot_recall
            value: 0.9821428571428571
            name: Dot Recall
          - type: dot_ap
            value: 0.3294144882113245
            name: Dot Ap
          - type: dot_mcc
            value: -0.03920743101752848
            name: Dot Mcc

SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/multi-qa-mpnet-base-dot-v1. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
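
The Pooling module above has only pooling_mode_cls_token enabled, so the sentence embedding is taken from the first token's output rather than an average over tokens. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function names cls_pooling and mean_pooling are hypothetical, and the real module works on batched tensors.

```python
from typing import List

Vector = List[float]


def cls_pooling(token_embeddings: List[Vector]) -> Vector:
    """Use the first token's embedding as the sentence embedding."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pooling(token_embeddings: List[Vector]) -> Vector:
    """Average all token embeddings (shown only for contrast)."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    count = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / count for i in range(dim)]


# Toy input: two tokens with 2-dimensional embeddings (the model uses 768).
tokens = [[1.0, 2.0], [3.0, 4.0]]
print(cls_pooling(tokens))
print(mean_pooling(tokens))
```

With CLS pooling, only the first row of the token matrix reaches the output, so the remaining tokens influence the embedding only through the transformer's attention layers.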

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Marco127/Argu_T3")
# Run inference
sentences = [
    ' In the event of a disturbance, one polite request (warning) will\nbe given to reduce the noise. If our request is not followed, the guest will be asked to leave\nthe hotel without refund and may be charged Guest Compensation Disturbance Fee.',
    ' In the event of a disturbance, one polite request (warning) will\nbe given to reduce the noise. If our request is not followed, the guest will be asked to leave\nthe hotel without refund and may be charged Guest Compensation Disturbance Fee.',
    '\nWithout limiting the generality of the aforementioned, it applies to pay-to-view TV programmes or videos, as\nwell as telephone calls or any other expenses of a similar nature that is made from your room, you will be\ndeemed to be the contracting party.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
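
The metrics reported in this card are dot-product based and come with decision thresholds. A minimal sketch of how a pair score could be turned into a binary decision is given below. The helper names dot_score and is_positive_pair are hypothetical, and treating scores strictly above the threshold as positive is an assumption, not a statement about how the evaluator is implemented.

```python
from typing import Sequence

# Thresholds as reported in this card's Binary Classification metrics.
DOT_ACCURACY_THRESHOLD = 49.0201
DOT_F1_THRESHOLD = 35.0242


def dot_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def is_positive_pair(score: float, threshold: float) -> bool:
    """Assumption: a score strictly above the threshold counts as a match."""
    return score > threshold


# Toy vectors; real embeddings from this model have 768 dimensions.
score = dot_score([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(score, is_positive_pair(score, DOT_F1_THRESHOLD))
```

The two thresholds differ because each one is tuned for a different metric, so the same score can be labelled differently depending on which threshold is applied.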

Evaluation

Metrics

Binary Classification

Metric Value
dot_accuracy 0.6746
dot_accuracy_threshold 49.0201
dot_f1 0.4933
dot_f1_threshold 35.0242
dot_precision 0.3293
dot_recall 0.9821
dot_ap 0.3294
dot_mcc -0.0392
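
As a consistency check on the table above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the unrounded precision and recall values listed in the card metadata; the function name f1_from_precision_recall is hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded values from the model-index metadata of this card.
dot_precision = 0.32934131736526945
dot_recall = 0.9821428571428571

f1 = f1_from_precision_recall(dot_precision, dot_recall)
print(round(f1, 4))  # 0.4933
```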

Training Details

Training Dataset

Unnamed Dataset

  • Size: 672 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 672 samples:
    • sentence1 (string): min 11 tokens, mean 48.63 tokens, max 156 tokens
    • sentence2 (string): min 11 tokens, mean 48.63 tokens, max 156 tokens
    • label (int): 0: ~66.67%, 1: ~33.33%
  • Samples:
    • Sample 1
      sentence1: The pets can not be left without supervision if there is a risk of causing any damage or might disturb other guests.
      sentence2: The pets can not be left without supervision if there is a risk of causing any damage or might disturb other guests.
      label: 0
    • Sample 2
      sentence1: Any guest in violation of these rules may be asked to leave the hotel with no refund. Extra copies of these rules are available at the Front Desk upon request.
      sentence2: Any guest in violation of these rules may be asked to leave the hotel with no refund. Extra copies of these rules are available at the Front Desk upon request.
      label: 0
    • Sample 3
      sentence1: Consuming the products from the minibar involves additional costs. You can find the prices in the kitchen area.
      sentence2: Consuming the products from the minibar involves additional costs. You can find the prices in the kitchen area.
      label: 0
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
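
The parameters above can be read against the usual contrastive loss formulation: with distance d = 1 - cosine similarity and label y, each pair contributes 0.5 * (y * d^2 + (1 - y) * max(0, margin - d)^2), averaged over the batch when size_average is true. The following is a minimal pure-Python sketch under that assumption, not the library's code; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def contrastive_loss(
    pairs: List[Tuple[Sequence[float], Sequence[float]]],
    labels: List[int],
    margin: float = 0.5,
    size_average: bool = True,
) -> float:
    """Contrastive loss over embedding pairs (label 1 = similar, 0 = dissimilar)."""
    losses = []
    for (a, b), y in zip(pairs, labels):
        d = cosine_distance(a, b)
        positive_term = y * d ** 2                           # pull similar pairs together
        negative_term = (1 - y) * max(0.0, margin - d) ** 2  # push dissimilar pairs apart
        losses.append(0.5 * (positive_term + negative_term))
    total = sum(losses)
    return total / len(losses) if size_average else total


same = ([1.0, 0.0], [1.0, 0.0])        # distance 0
orthogonal = ([1.0, 0.0], [0.0, 1.0])  # distance 1
print(contrastive_loss([same, orthogonal], [1, 0]))
```

Under this formula, a pair with identical embeddings and label 0 has distance 0 and contributes 0.5 * 0.5^2 = 0.125, while a dissimilar pair whose distance already exceeds the margin contributes nothing.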
    

Evaluation Dataset

Unnamed Dataset

  • Size: 169 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 169 samples:
    • sentence1 (string): min 13 tokens, mean 46.01 tokens, max 156 tokens
    • sentence2 (string): min 13 tokens, mean 46.01 tokens, max 156 tokens
    • label (int): 0: ~66.86%, 1: ~33.14%
  • Samples:
    • Sample 1
      sentence1: I understand and accept that the BON Hotels Group collects the personal information ("personal information") of all persons in my party for purposes of loyalty programmes and special offers. I, on behalf of all in my party, expressly consent and grant permission to the BON Hotels Group to: - collect, collate, process, study and use the personal information; and communicate directly with me/us from time to time, unless I have stated to the contrary below.
      sentence2: I understand and accept that the BON Hotels Group collects the personal information ("personal information") of all persons in my party for purposes of loyalty programmes and special offers. I, on behalf of all in my party, expressly consent and grant permission to the BON Hotels Group to: - collect, collate, process, study and use the personal information; and communicate directly with me/us from time to time, unless I have stated to the contrary below.
      label: 0
    • Sample 2
      sentence1: However, in lieu of the above, any such goods will only be kept by us for 6 (six) months. At the end of which period, we reserve the right in our sole discretion to dispose thereof and you will have no right of recourse against us.
      sentence2: However, in lieu of the above, any such goods will only be kept by us for 6 (six) months. At the end of which period, we reserve the right in our sole discretion to dispose thereof and you will have no right of recourse against us.
      label: 0
    • Sample 3
      sentence1: In cases where the hotel suffers damage (either physical, or moral) due to the guests’ violation of the above rules, it may charge a compensation fee in proportion to the damage. Moral damage may be for example disturbing other guests, thus ruining the reputation of the hotel.
      sentence2: In cases where the hotel suffers damage (either physical, or moral) due to the guests’ violation of the above rules, it may charge a compensation fee in proportion to the damage. Moral damage may be for example disturbing other guests, thus ruining the reputation of the hotel.
      label: 0
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 1e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
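
To make the interaction of learning_rate, num_train_epochs, and warmup_ratio concrete, the sketch below estimates the step counts and the linear warmup-then-decay learning rate they imply. It assumes a single device, no gradient accumulation, and that every sample appears once per epoch; the no_duplicates batch sampler may produce a different number of batches in practice. The function name linear_schedule_lr is hypothetical.

```python
import math


def linear_schedule_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from this card.
train_samples = 672
batch_size = 16
epochs = 2
peak_lr = 1e-05
warmup_ratio = 0.1

steps_per_epoch = math.ceil(train_samples / batch_size)  # 42 under the assumptions above
total_steps = steps_per_epoch * epochs                   # 84
warmup_steps = math.ceil(total_steps * warmup_ratio)     # 9

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, peak_lr))
```

Under these assumptions the run is short: roughly 84 optimizer steps in total, of which about 9 are warmup.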

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step dot_ap
-1 -1 0.3294

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}