Vignesh finetuned bge2

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 109M parameters (F32, Safetensors)
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
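
The same stack can be assembled by hand from the sentence_transformers building blocks. A minimal sketch (the module classes are from the library; only the wiring below is illustrative):

from sentence_transformers import SentenceTransformer, models

# BERT encoder, CLS-token pooling, and L2 normalization,
# mirroring the three modules listed above.
transformer = models.Transformer(
    "BAAI/bge-base-en-v1.5", max_seq_length=512, do_lower_case=True
)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 768
    pooling_mode="cls",
)
model = SentenceTransformer(modules=[transformer, pooling, models.Normalize()])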

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("viggypoker1/Vignesh-finetuned-bge2")
# Run inference
sentences = [
    "What does the term 'Acquired brands' refer to and how does it affect the reported volumes?",
    "'Acquired brands' refers to brands acquired during the past 12 months. Typically, the Company has not reported unit case volume or recognized concentrate sales volume related to acquired brands in periods prior to the closing of a transaction. Therefore, the unit case volume and concentrate sales volume related to an acquired brand are incremental to prior year volume.",
    'The Company made matching contributions to employee accounts in connection with the 401(k) plan of $37.3 million in fiscal 2023, $37.9 million in fiscal 2022 and $34.1 million in fiscal 2021.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
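
Since the model was trained on financial question-answer pairs, a natural use is semantic search: embed a query and a corpus, then rank passages by cosine similarity. A sketch using the library's semantic_search utility (the corpus below is illustrative):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("viggypoker1/Vignesh-finetuned-bge2")

# Any list of passages works; these are illustrative.
corpus = [
    "'Acquired brands' refers to brands acquired during the past 12 months.",
    "The operating margin for UnitedHealthcare in 2023 was reported as 5.8%.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(
    "What does the term 'Acquired brands' refer to?", convert_to_tensor=True
)

# Top-k corpus entries ranked by cosine similarity.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))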

Evaluation

Metrics

The five tables below report retrieval metrics with the embeddings truncated to each Matryoshka dimensionality, from 768 down to 64.

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.7
cosine_accuracy@3 0.8414
cosine_accuracy@5 0.8786
cosine_accuracy@10 0.92
cosine_precision@1 0.7
cosine_precision@3 0.2805
cosine_precision@5 0.1757
cosine_precision@10 0.092
cosine_recall@1 0.7
cosine_recall@3 0.8414
cosine_recall@5 0.8786
cosine_recall@10 0.92
cosine_ndcg@10 0.813
cosine_mrr@10 0.7784
cosine_map@100 0.7817

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.6914
cosine_accuracy@3 0.84
cosine_accuracy@5 0.8857
cosine_accuracy@10 0.9243
cosine_precision@1 0.6914
cosine_precision@3 0.28
cosine_precision@5 0.1771
cosine_precision@10 0.0924
cosine_recall@1 0.6914
cosine_recall@3 0.84
cosine_recall@5 0.8857
cosine_recall@10 0.9243
cosine_ndcg@10 0.8121
cosine_mrr@10 0.7758
cosine_map@100 0.7787

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.69
cosine_accuracy@3 0.8286
cosine_accuracy@5 0.8729
cosine_accuracy@10 0.9143
cosine_precision@1 0.69
cosine_precision@3 0.2762
cosine_precision@5 0.1746
cosine_precision@10 0.0914
cosine_recall@1 0.69
cosine_recall@3 0.8286
cosine_recall@5 0.8729
cosine_recall@10 0.9143
cosine_ndcg@10 0.8041
cosine_mrr@10 0.7685
cosine_map@100 0.772

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.67
cosine_accuracy@3 0.8171
cosine_accuracy@5 0.8657
cosine_accuracy@10 0.9071
cosine_precision@1 0.67
cosine_precision@3 0.2724
cosine_precision@5 0.1731
cosine_precision@10 0.0907
cosine_recall@1 0.67
cosine_recall@3 0.8171
cosine_recall@5 0.8657
cosine_recall@10 0.9071
cosine_ndcg@10 0.7905
cosine_mrr@10 0.7529
cosine_map@100 0.7567

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.6314
cosine_accuracy@3 0.7943
cosine_accuracy@5 0.8386
cosine_accuracy@10 0.8829
cosine_precision@1 0.6314
cosine_precision@3 0.2648
cosine_precision@5 0.1677
cosine_precision@10 0.0883
cosine_recall@1 0.6314
cosine_recall@3 0.7943
cosine_recall@5 0.8386
cosine_recall@10 0.8829
cosine_ndcg@10 0.7591
cosine_mrr@10 0.7192
cosine_map@100 0.7236
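
Because the model was trained with MatryoshkaLoss, embeddings can be truncated to any of the dimensions above at the modest quality cost the tables show. A sketch using the truncate_dim option of recent sentence-transformers releases:

from sentence_transformers import SentenceTransformer

# encode() now returns 256-dimensional embeddings; any trained
# Matryoshka dimension (768, 512, 256, 128, 64) can be used.
model = SentenceTransformer("viggypoker1/Vignesh-finetuned-bge2", truncate_dim=256)
embeddings = model.encode(["What was the operating margin for UnitedHealthcare in 2023?"])
print(embeddings.shape)
# (1, 256)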

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 311,351 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string, min 9 / mean 20.47 / max 41 tokens
    • positive: string, min 7 / mean 46.65 / max 512 tokens
  • Samples:
    • anchor: What section from item 8 addresses financial information?
      positive: Item 8 covers 'Financial Statements and Supplementary Data' relating to financial information.
    • anchor: What was the percentage increase in interest income from 2022 to 2023?
      positive: Interest income increased $769 million, or 259%, in the year ended December 31, 2023 as compared to the year ended December 31, 2022. This increase was primarily due to higher interest earned on our cash and cash equivalents and short-term investments in the year ended December 31, 2023 as compared to the prior year due to rising interest rates and our increasing portfolio balance.
    • anchor: What was the operating margin for UnitedHealthcare in 2023?
      positive: The operating margin for UnitedHealthcare in 2023 was reported as 5.8%.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
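
This configuration maps directly onto the library API. A sketch of the loss construction (the actual training script is not part of this card):

from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# MultipleNegativesRankingLoss uses the other in-batch positives as negatives;
# MatryoshkaLoss applies it at each truncated dimensionality.
loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all dimensions at every step
)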
    

Evaluation Dataset

json

  • Dataset: json
  • Size: 700 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 700 samples:
    • anchor: string, min 7 / mean 20.59 / max 40 tokens
    • positive: string, min 6 / mean 47.59 / max 326 tokens
  • Samples:
    • anchor: What was the maximum borrowing capacity available from the Federal Home Loan Bank of Boston as of December 31, 2023?
      positive: The maximum borrowing capacity available from the FHLBB as of December 31, 2023 was approximately $1.0 billion.
    • anchor: What new compliance requirement was established by the CFPB's final rule issued on March 30, 2023, regarding small business credit applications?
      positive: On March 30, 2023, the CFPB adopted a final rule requiring covered financial institutions, such as us, to collect and report data to the CFPB regarding certain small business credit applications.
    • anchor: What potential impact could continued geopolitical tensions have on the business?
      positive: While the ongoing Russia-Ukraine and Israel conflicts are still evolving and outcomes remain uncertain, the business does not expect the resulting challenging macroeconomic conditions to have a material impact currently. However, if conflicts continue or worsen, it could lead to greater disruptions and uncertainty, negatively impacting the business.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
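
A sketch of how these values map onto SentenceTransformerTrainingArguments in sentence-transformers v3 (output_dir and save_strategy are assumptions; the original training script is not part of this card):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="finetuned-bge2",   # assumed; not stated in this card
    eval_strategy="epoch",
    save_strategy="epoch",         # assumed, so load_best_model_at_end is valid
    per_device_train_batch_size=128,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    tf32=False,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)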

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.0658 10 12.7958 - - - - - -
0.1315 20 16.8225 - - - - - -
0.1973 30 20.1236 - - - - - -
0.2630 40 22.0845 - - - - - -
0.3288 50 19.7865 - - - - - -
0.3946 60 6.0102 - - - - - -
0.4603 70 3.7813 - - - - - -
0.5261 80 2.8675 - - - - - -
0.5919 90 2.2002 - - - - - -
0.6576 100 1.8334 - - - - - -
0.7234 110 1.5052 - - - - - -
0.7891 120 1.3454 - - - - - -
0.8549 130 1.2089 - - - - - -
0.9207 140 1.0615 - - - - - -
0.9864 150 1.011 - - - - - -
0.9996 152 - 0.2963 0.7043 0.7228 0.7462 0.6496 0.7566
1.0522 160 7.9844 - - - - - -
1.1180 170 12.726 - - - - - -
1.1837 180 17.3762 - - - - - -
1.2495 190 19.358 - - - - - -
1.3152 200 19.4805 - - - - - -
1.3810 210 5.7452 - - - - - -
1.4468 220 1.3857 - - - - - -
1.5125 230 0.9792 - - - - - -
1.5783 240 0.8632 - - - - - -
1.6441 250 0.8256 - - - - - -
1.7098 260 0.742 - - - - - -
1.7756 270 0.7307 - - - - - -
1.8413 280 0.7064 - - - - - -
1.9071 290 0.6492 - - - - - -
1.9729 300 0.6265 - - - - - -
1.9992 304 - 0.2345 0.7145 0.7317 0.7548 0.6706 0.7609
2.0386 310 4.0854 - - - - - -
2.1044 320 11.4485 - - - - - -
2.1702 330 14.1851 - - - - - -
2.2359 340 17.7422 - - - - - -
2.3017 350 19.2742 - - - - - -
2.3674 360 7.3918 - - - - - -
2.4332 370 1.0444 - - - - - -
2.4990 380 0.6947 - - - - - -
2.5647 390 0.6 - - - - - -
2.6305 400 0.6005 - - - - - -
2.6963 410 0.5314 - - - - - -
2.7620 420 0.5238 - - - - - -
2.8278 430 0.5207 - - - - - -
2.8935 440 0.5075 - - - - - -
2.9593 450 0.4673 - - - - - -
2.9988 456 - 0.2111 0.7252 0.7333 0.7530 0.6821 0.7617
3.0251 460 1.5162 - - - - - -
3.0908 470 10.5824 - - - - - -
3.1566 480 11.8184 - - - - - -
3.2224 490 16.3944 - - - - - -
3.2881 500 18.1591 - - - - - -
3.3539 510 10.8653 - - - - - -
3.4196 520 0.8936 - - - - - -
3.4854 530 0.5606 - - - - - -
3.5512 540 0.4724 - - - - - -
3.6169 550 0.4681 - - - - - -
3.6827 560 0.4334 - - - - - -
3.7485 570 0.4005 - - - - - -
3.8142 580 0.4224 - - - - - -
3.8800 590 0.4296 - - - - - -
3.9457 600 0.3788 - - - - - -
3.9984 608 - 0.1889 0.7345 0.7469 0.7647 0.6906 0.7633
4.0115 610 0.5548 - - - - - -
4.0773 620 8.6803 - - - - - -
4.1430 630 10.6235 - - - - - -
4.2088 640 14.5689 - - - - - -
4.2746 650 17.649 - - - - - -
4.3403 660 13.9682 - - - - - -
4.4061 670 0.7801 - - - - - -
4.4718 680 0.4848 - - - - - -
4.5376 690 0.4082 - - - - - -
4.6034 700 0.3883 - - - - - -
4.6691 710 0.3737 - - - - - -
4.7349 720 0.3485 - - - - - -
4.8007 730 0.3547 - - - - - -
4.8664 740 0.357 - - - - - -
4.9322 750 0.3223 - - - - - -
4.9979 760 0.3322 0.1843 0.7364 0.7482 0.7645 0.6911 0.7652
5.0637 770 6.5343 - - - - - -
5.1295 780 10.1093 - - - - - -
5.1952 790 13.3253 - - - - - -
5.2610 800 16.6724 - - - - - -
5.3268 810 15.6655 - - - - - -
5.3925 820 2.0319 - - - - - -
5.4583 830 0.4315 - - - - - -
5.5240 840 0.3544 - - - - - -
5.5898 850 0.3488 - - - - - -
5.6556 860 0.3301 - - - - - -
5.7213 870 0.3035 - - - - - -
5.7871 880 0.3123 - - - - - -
5.8529 890 0.3149 - - - - - -
5.9186 900 0.2857 - - - - - -
5.9844 910 0.3021 - - - - - -
5.9975 912 - 0.1704 0.7442 0.7527 0.7643 0.7031 0.7700
6.0501 920 4.5418 - - - - - -
6.1159 930 8.909 - - - - - -
6.1817 940 12.7023 - - - - - -
6.2474 950 15.6328 - - - - - -
6.3132 960 17.1026 - - - - - -
6.3790 970 3.8174 - - - - - -
6.4447 980 0.4035 - - - - - -
6.5105 990 0.3281 - - - - - -
6.5762 1000 0.3126 - - - - - -
6.6420 1010 0.304 - - - - - -
6.7078 1020 0.2692 - - - - - -
6.7735 1030 0.2807 - - - - - -
6.8393 1040 0.2993 - - - - - -
6.9051 1050 0.2721 - - - - - -
6.9708 1060 0.2674 - - - - - -
6.9971 1064 - 0.1596 0.7481 0.7607 0.7723 0.7074 0.7735
7.0366 1070 2.5499 - - - - - -
7.1023 1080 8.8274 - - - - - -
7.1681 1090 11.3224 - - - - - -
7.2339 1100 15.0825 - - - - - -
7.2996 1110 17.6647 - - - - - -
7.3654 1120 6.0271 - - - - - -
7.4312 1130 0.3838 - - - - - -
7.4969 1140 0.3137 - - - - - -
7.5627 1150 0.285 - - - - - -
7.6284 1160 0.2913 - - - - - -
7.6942 1170 0.268 - - - - - -
7.7600 1180 0.2643 - - - - - -
7.8257 1190 0.2702 - - - - - -
7.8915 1200 0.2775 - - - - - -
7.9573 1210 0.2563 - - - - - -
7.9967 1216 - 0.1543 0.7495 0.7645 0.7715 0.7124 0.7802
8.0230 1220 0.7657 - - - - - -
8.0888 1230 8.542 - - - - - -
8.1545 1240 9.9807 - - - - - -
8.2203 1250 14.3646 - - - - - -
8.2861 1260 16.877 - - - - - -
8.3518 1270 10.2992 - - - - - -
8.4176 1280 0.363 - - - - - -
8.4834 1290 0.304 - - - - - -
8.5491 1300 0.2851 - - - - - -
8.6149 1310 0.2853 - - - - - -
8.6806 1320 0.2676 - - - - - -
8.7464 1330 0.2522 - - - - - -
8.8122 1340 0.2619 - - - - - -
8.8779 1350 0.2757 - - - - - -
8.9437 1360 0.2528 - - - - - -
8.9963 1368 - 0.1483 0.7529 0.7680 0.7759 0.7172 0.7807
9.0095 1370 0.3564 - - - - - -
9.0752 1380 7.1402 - - - - - -
9.1410 1390 9.4364 - - - - - -
9.2067 1400 13.1391 - - - - - -
9.2725 1410 16.7827 - - - - - -
9.3383 1420 13.456 - - - - - -
9.4040 1430 0.5238 - - - - - -
9.4698 1440 0.3073 - - - - - -
9.5356 1450 0.2773 - - - - - -
9.6013 1460 0.2783 - - - - - -
9.6671 1470 0.2645 - - - - - -
9.7328 1480 0.2495 - - - - - -
9.7986 1490 0.2649 - - - - - -
9.8644 1500 0.2655 - - - - - -
9.9301 1510 0.2395 - - - - - -
9.9959 1520 0.2569 0.1453 0.7567 0.772 0.7787 0.7236 0.7817
  • The saved checkpoint is the final row (epoch 9.9959, step 1520), whose per-dimension cosine_map@100 values match the evaluation metrics reported above.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.3.0
  • Datasets: 2.19.1
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}