metadata
base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:48
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      The Supplier shall deliver the Batteries to the Manufacturer within 5 days
      of receipt of each monthly purchase order.
    sentences:
      - >-
        What rights does the Manufacturer have regarding the inspection and
        rejection of non-conforming Batteries?
      - What is the Delivery Schedule for the Batteries?
      - What constitutes a force majeure event under the Agreement?
  - source_sentence: >-
      The Client will pay a flat fee of Rs. 52,000/-, with 50% (Rs. 26,000/-)
      due upon signing the agreement and the remaining 50% due one week after
      completion of pre-production. Payment delays will result in proportional
      delays in data delivery and editing.
    sentences:
      - >-
        What are the specified payment terms for the photography services under
        this contract?
      - >-
        What actions can a user take if the platform is unable to fulfill a
        successfully placed order?
      - >-
        What is the delivery schedule for the Batteries once the purchase order
        is received?
  - source_sentence: >-
      Users can contact Customer Care before confirmation to request a refund
      for offline services or reschedule for online services, subject to the
      platform's discretion.
    sentences:
      - >-
        How does Paratalks handle refund requests made before a service
        professional confirms a booking?
      - >-
        What is the total quantity of electric vehicle batteries that the
        Supplier agrees to supply to the Manufacturer?
      - >-
        What are the conditions under which a user is not entitled to a refund
        according to Paratalks' refund policy?
  - source_sentence: >-
      In the event of a material breach of this Agreement by either Party, the
      non-breaching Party shall be entitled to pursue all available remedies at
      law or in equity, including injunctive relief and damages.
    sentences:
      - Under what conditions may this agreement be terminated?
      - What events constitute Force Majeure under this Agreement?
      - >-
        What remedies are available to a non-breaching Party in the event of a
        material breach of the Agreement?
  - source_sentence: >-
      No refund shall be issued in case wrong contact details are provided by
      the User or the User's device being unreachable, or any other technical
      glitch attributable to the User. Additionally, no refund shall be issued
      for any live-session or call, whether audio or video, once the call or
      live-session is connected.
    sentences:
      - >-
        What deductions may be applied when processing refunds according to
        Paratalks' refund policy?
      - >-
        What are the initial job title and duties of the Employee as stated in
        the employment agreement?
      - >-
        What circumstances lead to no refund being issued to a User according to
        the Refund Policy?
model-index:
  - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.3333333333333333
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3333333333333333
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.3333333333333333
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7321315434523954
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.638888888888889
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.638888888888889
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.3333333333333333
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.3333333333333333
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3333333333333333
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.3333333333333333
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7321315434523954
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.638888888888889
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.638888888888889
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.5
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8333333333333334
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.27777777777777773
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.5
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8333333333333334
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7747853857295762
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7000000000000001
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.7000000000000001
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.5
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8333333333333334
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.8333333333333334
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.27777777777777773
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.16666666666666666
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.5
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8333333333333334
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.8333333333333334
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7604815838011495
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6851851851851851
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.6851851851851851
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.5
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.6666666666666666
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.6666666666666666
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.8333333333333334
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.2222222222222222
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.13333333333333333
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.08333333333333333
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.5
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.6666666666666666
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.6666666666666666
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.8333333333333334
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.66452282344658
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.611111111111111
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.6262626262626262
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
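The pooling and normalization stages listed above can be sketched in a few lines of NumPy. This is only an illustration of CLS-token pooling followed by L2 normalization, not the library's implementation; the random array stands in for the BertModel output, and the function name `cls_pool_and_normalize` is made up for this example.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token embeddings to (batch, hidden) unit vectors."""
    # Pooling with pooling_mode_cls_token=True: keep the first token of each sequence.
    cls = token_embeddings[:, 0, :]
    # Normalize(): scale every sentence vector to unit L2 norm.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)

# Stand-in for the transformer output: 3 sentences, 12 tokens, 768 hidden units.
rng = np.random.default_rng(0)
fake_hidden_states = rng.normal(size=(3, 12, 768))

sentence_embeddings = cls_pool_and_normalize(fake_hidden_states)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, the dot product of two embeddings equals their cosine similarity, which matches the similarity function stated in the model description.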

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("vineet10/fm2")
# Run inference
sentences = [
    "No refund shall be issued in case wrong contact details are provided by the User or the User's device being unreachable, or any other technical glitch attributable to the User. Additionally, no refund shall be issued for any live-session or call, whether audio or video, once the call or live-session is connected.",
    'What circumstances lead to no refund being issued to a User according to the Refund Policy?',
    'What are the initial job title and duties of the Employee as stated in the employment agreement?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])

Evaluation

Metrics

Information Retrieval (dataset: dim_768)

Metric Value
cosine_accuracy@1 0.3333
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.3333
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.3333
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.7321
cosine_mrr@10 0.6389
cosine_map@100 0.6389
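To make the metric definitions concrete, here is a small sketch that computes accuracy@k, precision@k, recall@k, MRR@k and NDCG@k from the rank of the relevant context for each query. It assumes exactly one relevant context per query, which is consistent with precision@k equalling accuracy@k divided by k in the table. The rank list is hypothetical: the card does not report per-query ranks, and this one was simply chosen because it reproduces the 768-dimension figures.

```python
import math

def ir_metrics(ranks, k):
    """ranks: 1-based rank of the single relevant document for each query."""
    n = len(ranks)
    hits = sum(1 for r in ranks if r <= k)
    return {
        "accuracy": hits / n,          # share of queries with a hit in the top k
        "precision": hits / (n * k),   # one relevant item out of k retrieved
        "recall": hits / n,            # equals accuracy with one relevant item
        "mrr": sum(1.0 / r for r in ranks if r <= k) / n,
        # With one relevant item the ideal DCG is 1, so NDCG is the mean DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / n,
    }

# Hypothetical ranks for six queries (an assumption, not taken from the card).
hypothetical_ranks = [1, 1, 2, 2, 2, 3]

for k in (1, 3, 5, 10):
    print(k, {name: round(value, 4) for name, value in ir_metrics(hypothetical_ranks, k).items()})
```

With these ranks, two of six queries have the relevant context at rank 1 (accuracy@1 of 0.3333) and every query has it within the top 3 (accuracy@3 of 1.0).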

Information Retrieval (dataset: dim_512)

Metric Value
cosine_accuracy@1 0.3333
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.3333
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.3333
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.7321
cosine_mrr@10 0.6389
cosine_map@100 0.6389

Information Retrieval (dataset: dim_256)

Metric Value
cosine_accuracy@1 0.5
cosine_accuracy@3 0.8333
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.5
cosine_precision@3 0.2778
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.5
cosine_recall@3 0.8333
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.7748
cosine_mrr@10 0.7
cosine_map@100 0.7

Information Retrieval (dataset: dim_128)

Metric Value
cosine_accuracy@1 0.5
cosine_accuracy@3 0.8333
cosine_accuracy@5 0.8333
cosine_accuracy@10 1.0
cosine_precision@1 0.5
cosine_precision@3 0.2778
cosine_precision@5 0.1667
cosine_precision@10 0.1
cosine_recall@1 0.5
cosine_recall@3 0.8333
cosine_recall@5 0.8333
cosine_recall@10 1.0
cosine_ndcg@10 0.7605
cosine_mrr@10 0.6852
cosine_map@100 0.6852

Information Retrieval (dataset: dim_64)

Metric Value
cosine_accuracy@1 0.5
cosine_accuracy@3 0.6667
cosine_accuracy@5 0.6667
cosine_accuracy@10 0.8333
cosine_precision@1 0.5
cosine_precision@3 0.2222
cosine_precision@5 0.1333
cosine_precision@10 0.0833
cosine_recall@1 0.5
cosine_recall@3 0.6667
cosine_recall@5 0.6667
cosine_recall@10 0.8333
cosine_ndcg@10 0.6645
cosine_mrr@10 0.6111
cosine_map@100 0.6263
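The evaluation dataset names in the card's metadata (dim_768 down to dim_64) suggest that the same embeddings were scored at several sizes. The card does not say how the smaller sizes were produced, so the sketch below rests on an assumption: it keeps the leading components of each vector and re-normalizes them, which is the usual way to evaluate truncated embeddings. Random vectors stand in for real model output, and the helper name is invented for this example.

```python
import numpy as np

def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row and rescale to unit length."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Stand-in for four 768-dimensional sentence embeddings.
rng = np.random.default_rng(0)
full_embeddings = rng.normal(size=(4, 768))

for dim in (768, 512, 256, 128, 64):
    reduced = truncate_and_normalize(full_embeddings, dim)
    # After re-normalization the dot product is again the cosine similarity.
    cosine_scores = reduced @ reduced.T
    print(dim, reduced.shape, cosine_scores.shape)
```

Smaller vectors reduce storage and search cost, and the tables above show how retrieval quality changed at each size on this evaluation set.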

Training Details

Training Dataset

Unnamed Dataset

  • Size: 48 training samples
  • Columns: context and question
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean          max
    context   string  18 tokens  41.0 tokens   85 tokens
    question  string  8 tokens   17.88 tokens  32 tokens
  • Samples:
    context: The contract is governed by the laws of India, and any disputes shall be resolved exclusively by the courts in Kota.
    question: What is the jurisdiction and governing law applicable to this contract?
    context: The Parties shall maintain the confidentiality of all proprietary and confidential information disclosed by one Party to the other Party in connection with this Agreement.
    question: How should proprietary and confidential information disclosed under the Agreement be treated by the Parties?
    context: No refund shall be provided for any products or merchandise that is purchased by the User from or through the Platform.
    question: What is the refund policy for products or merchandise purchased by the User through the Platform?
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
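
A rough NumPy sketch of what this loss computes, using the scale of 20.0 and cosine similarity shown above: each context is scored against every question in the batch, and the matching question is treated as the correct class in a softmax cross-entropy. This is an illustration of the in-batch-negatives idea, not the library's code; the random vectors and the function name `mnr_loss` are assumptions made for the example.

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities; row i should match column i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                      # (batch, batch) similarity matrix
    scores -= scores.max(axis=1, keepdims=True)     # shift for numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))      # diagonal holds the true pairs

rng = np.random.default_rng(0)
contexts = rng.normal(size=(16, 768))                      # stand-in context embeddings
questions = contexts + 0.1 * rng.normal(size=(16, 768))    # well-aligned "questions"
mismatched = questions[::-1]                               # pairs deliberately broken

print("aligned pairs:   ", mnr_loss(contexts, questions))
print("mismatched pairs:", mnr_loss(contexts, mismatched))
```

The loss is small when each context is closest to its own question and large when the pairing is broken. Every other question in the batch acts as a negative, which is why duplicate-free batches matter for this loss.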
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
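
For a sense of scale, the arithmetic below works out how many optimizer steps these settings imply for the 48-sample dataset, together with a linear warmup and decay schedule. It assumes a single device, gradient_accumulation_steps of 1, no samples skipped by the batch sampler, and the learning_rate of 5e-05 and linear scheduler from the full hyperparameter list; rounding the warmup length up is also an assumption about how the ratio is converted to steps.

```python
import math

num_samples = 48        # training set size from the card
batch_size = 16         # per_device_train_batch_size
num_epochs = 5          # num_train_epochs
warmup_ratio = 0.1
peak_lr = 5e-05         # learning_rate from the full hyperparameter list

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)   # assumed rounding rule

def linear_lr(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

print("steps per epoch:", steps_per_epoch)
print("total steps:    ", total_steps)
print("warmup steps:   ", warmup_steps)
for step in range(0, total_steps + 1, 3):
    print(step, linear_lr(step))
```

Under these assumptions the whole run is only 15 optimizer steps, so the reported metrics should be read with the very small dataset in mind.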

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0 0 0.6852 0.7000 0.6389 0.6263 0.6389

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}