BGE base Financial Matryoshka

This is a sentence-transformers model fine-tuned from BAAI/bge-base-en-v1.5 on a financial question-answering retrieval dataset (listed below as "json"). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
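
The same three-module stack can be assembled by hand from sentence_transformers.models components. A minimal sketch, assuming only the checkpoint and settings listed above (this is illustrative, not the exact script used to build this model):

from sentence_transformers import SentenceTransformer, models

# Token encoder: BERT backbone with the 512-token limit listed above
word_embedding = models.Transformer("BAAI/bge-base-en-v1.5", max_seq_length=512)

# CLS-token pooling, matching pooling_mode_cls_token=True in the architecture dump
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),  # 768
    pooling_mode="cls",
)

# L2-normalization so that dot product equals cosine similarity
normalize = models.Normalize()

model = SentenceTransformer(modules=[word_embedding, pooling, normalize])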

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("shivamsharma1967/bge-base-financial-matryoshka")
# Run inference
sentences = [
    'Table of Contents\nAMAZON.COM, INC.\nCONSOLIDATED STATEMENTS OF OPERATIONS\n(in millions, except per share data)\n \n \nYear Ended December 31,\n \n2015\n \n2016\n \n2017\nNet product sales\n$\n79,268 $\n94,665 $\n118,573\nNet service sales\n27,738 \n41,322 \n59,293\nTotal net sales\n107,006 \n135,987 \n177,866\nOperating expenses:\n \n \n \nCost of sales\n71,651 \n88,265 \n111,934\nFulfillment\n13,410 \n17,619 \n25,249\nMarketing\n5,254 \n7,233 \n10,069\nTechnology and content\n12,540 \n16,085 \n22,620\nGeneral and administrative\n1,747 \n2,432 \n3,674\nOther operating expense, net\n171 \n167 \n214\nTotal operating expenses\n104,773 \n131,801 \n173,760\nOperating income\n2,233 \n4,186 \n4,106\nInterest income\n50 \n100 \n202\nInterest expense\n(459) \n(484) \n(848)\nOther income (expense), net\n(256) \n90 \n346\nTotal non-operating income (expense)\n(665) \n(294) \n(300)\nIncome before income taxes\n1,568 \n3,892 \n3,806\nProvision for income taxes\n(950) \n(1,425) \n(769)\nEquity-method investment activity, net of tax\n(22) \n(96) \n(4)\nNet income\n$\n596 $\n2,371 $\n3,033\nBasic earnings per share\n$\n1.28 $\n5.01 $\n6.32\nDiluted earnings per share\n$\n1.25 $\n4.90 $\n6.15\nWeighted-average shares used in computation of earnings per share:\n \n \n \nBasic\n467 \n474 \n480\nDiluted\n477 \n484 \n493\nSee accompanying notes to consolidated financial statements.\n38',
    "What is Amazon's year-over-year change in revenue from FY2016 to FY2017 (in units of percents and round to one decimal place)? Calculate what was asked by utilizing the line items clearly shown in the statement of income.",
    'What is the FY2018 - FY2020 3 year average of capex as a % of revenue for MGM Resorts? Answer in units of percents and round to one decimal place. Please utilize information provided primarily within the statement of cash flows and the statement of income.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
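
Because the model was trained with MatryoshkaLoss (see Training Details), embeddings can also be truncated to the smaller dimensions evaluated below (512, 256, 128, or 64); the per-dimension metrics in the Evaluation section show the effect on retrieval quality. A minimal sketch using the truncate_dim argument available in recent sentence-transformers releases:

# Load the model with truncated (Matryoshka) embeddings, e.g. 256 dimensions
model_256 = SentenceTransformer(
    "shivamsharma1967/bge-base-financial-matryoshka",
    truncate_dim=256,
)
embeddings_256 = model_256.encode(sentences)
print(embeddings_256.shape)
# [3, 256]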

Evaluation

Metrics

Information Retrieval

Metric dim_768 dim_512 dim_256 dim_128 dim_64
cosine_accuracy@1 0.4 0.2667 0.2 0.2 0.2667
cosine_accuracy@3 0.4667 0.4667 0.4 0.3333 0.2667
cosine_accuracy@5 0.5333 0.5333 0.4 0.4 0.3333
cosine_accuracy@10 0.6667 0.6667 0.6 0.5333 0.4667
cosine_precision@1 0.4 0.2667 0.2 0.2 0.2667
cosine_precision@3 0.1556 0.1556 0.1333 0.1111 0.0889
cosine_precision@5 0.1067 0.1067 0.08 0.08 0.0667
cosine_precision@10 0.0667 0.0667 0.06 0.0533 0.0467
cosine_recall@1 0.4 0.2667 0.2 0.2 0.2667
cosine_recall@3 0.4667 0.4667 0.4 0.3333 0.2667
cosine_recall@5 0.5333 0.5333 0.4 0.4 0.3333
cosine_recall@10 0.6667 0.6667 0.6 0.5333 0.4667
cosine_ndcg@10 0.5029 0.4537 0.374 0.346 0.3413
cosine_mrr@10 0.4541 0.3874 0.3051 0.2883 0.304
cosine_map@100 0.467 0.4024 0.3253 0.306 0.322
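
These figures are standard information-retrieval metrics reported once per Matryoshka dimension. As a rough illustration of how such numbers can be produced with the library's InformationRetrievalEvaluator (the queries, corpus, and relevance judgements below are placeholders, not the actual evaluation split):

from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Hypothetical evaluation data: ids mapped to texts, plus relevance judgements
queries = {"q1": "What was Amazon's FY2017 total net sales?"}
corpus = {"d1": "Total net sales 107,006 135,987 177,866 ..."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_768",
)
# In sentence-transformers 3.x this returns a dict of accuracy@k, precision@k,
# recall@k, NDCG@10, MRR@10 and MAP@100 scores; the smaller-dimension columns
# come from re-running the evaluation on truncated embeddings.
results = evaluator(model)
print(results)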

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 135 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 135 samples:
    • positive: string; min 359, mean 508.73, max 512 tokens
    • anchor: string; min 11, mean 39.7, max 175 tokens
  • Samples:
    • positive: Amcor reconciliation of net income to adjusted EBITDA, EBIT, net income, and diluted EPS for the twelve months ended June 30, 2022 and June 30, 2023 (table truncated)
      anchor: What Was AMCOR's Adjusted Non GAAP EBITDA for FY 2023
    • positive: SQUARE, INC. consolidated balance sheets (in thousands, except share and per share data) as of December 31, 2016 and 2015 (table truncated)
      anchor: Considering the data in the balance sheet, what is Block's (formerly known as Square) FY2016 working capital ratio? Define working capital ratio as total current assets divided by total current liabilities. Round your answer to two decimal places.
    • positive: Verizon Communications Inc. and Subsidiaries consolidated balance sheets (dollars in millions, except per share amounts) at December 31, 2022 and 2021 (table truncated)
      anchor: Does Verizon have a reasonably healthy liquidity profile based on its quick ratio for FY 2022? If the quick ratio is not relevant to measure liquidity, please state that and explain why.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
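
In code, this corresponds to wrapping MultipleNegativesRankingLoss in MatryoshkaLoss so that the same in-batch-negatives objective is applied at every listed dimension. A minimal sketch, not the exact training script:

from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

inner_loss = MultipleNegativesRankingLoss(model)
train_loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],  # weights default to 1 per dimension
)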
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
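
Expressed as trainer arguments, the non-default values above map onto the sentence-transformers 3.x training API roughly as follows (the output directory and the explicit save_strategy required by load_best_model_at_end are assumptions, not values taken from this card):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-financial-matryoshka",  # illustrative output path
    num_train_epochs=4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=False,
    optim="adamw_torch_fused",
    eval_strategy="epoch",
    save_strategy="epoch",  # must match eval_strategy when load_best_model_at_end=True
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)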

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0 0 0.5029 0.4537 0.374 0.346 0.3413
  • The row above corresponds to the saved checkpoint.

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0
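
To approximate this environment, the listed library versions can be pinned at install time (PyTorch is omitted because its install command depends on the CUDA build, cu124 here):

pip install sentence-transformers==3.4.1 transformers==4.48.3 accelerate==1.3.0 datasets==3.3.2 tokenizers==0.21.0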

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}