metadata
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:6300
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-base-en-v1.5
widget:
- source_sentence: >-
The Health Services segment's revenues are primarily generated from the
sale and managing of prescription drugs to eligible members in benefit
plans maintained by clients.
sentences:
- >-
What online platforms does The Home Depot operate for its product
offerings?
- >-
How does the Company's Health Services segment generate most of its
revenue?
- What are the various diversity, equity, and inclusion councils at AMC?
- source_sentence: Product gross margin decreased to 75.9% in 2023, compared to 2022.
sentences:
- What organizations do the cybersecurity leaders hold memberships in?
- How much did product gross margin decrease from 2022 to 2023?
- What items are included under Item 8 in the financial report?
- source_sentence: >-
Our apparel assortment includes items such as pants, shorts, tops, and
jackets designed for a healthy lifestyle including athletic activities
such as yoga, running, training, and most other activities.
sentences:
- What types of activities are lululemon's athletic apparels designed for?
- >-
How are the capital adequacy ratios of American Express and AENB
determined under current federal banking regulations?
- What was the company's worldwide effective income tax rate in 2023?
- source_sentence: >-
Rest of Asia Pacific net sales increased 1% or $240 million during 2023
compared to 2022. The weakness in foreign currencies relative to the U.S.
dollar had a significantly unfavorable year-over-year impact on Rest of
Asia Pacific net sales. The net sales increase consisted of higher net
sales of iPhone and Services, partially offset by lower net sales of Mac
and iPad.
sentences:
- >-
What primarily drove the increase in service fees and other revenue
associated with Card Member cross-currency spending?
- >-
What was the balance of net cash used in financing activities for Costco
for the 52 weeks ended August 28, 2022?
- How did Rest of Asia Pacific net sales change in 2023 compared to 2022?
- source_sentence: >-
Cardiovascular/Metabolism/Other products sales were $3.7 billion, a
decline of 5.5% as compared to the prior year.
sentences:
- >-
What was the revenue decline percentage for
Cardiovascular/Metabolism/Other products in 2023?
- How is a membership's territory determined according to the description?
- What informs the ESG disclosures mentioned in the text?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: BGE base Financial Matryoshka
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.68
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8185714285714286
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8614285714285714
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9057142857142857
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.68
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.27285714285714285
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.17228571428571426
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09057142857142855
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.68
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.8185714285714286
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.8614285714285714
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9057142857142857
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7935961173570086
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7574988662131514
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.761336548534802
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.6742857142857143
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.81
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8585714285714285
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9128571428571428
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6742857142857143
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.27
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1717142857142857
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09128571428571428
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6742857142857143
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.81
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.8585714285714285
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9128571428571428
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7923231259157938
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7537448979591835
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.7569742717128601
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.6728571428571428
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8028571428571428
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.85
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9042857142857142
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6728571428571428
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.2676190476190476
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.16999999999999998
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09042857142857141
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6728571428571428
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.8028571428571428
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.85
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9042857142857142
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7878292403814693
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7507125850340136
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.7544072051783587
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.6642857142857143
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8442857142857143
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8957142857142857
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6642857142857143
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.26666666666666666
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.16885714285714284
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.08957142857142855
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6642857142857143
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.8
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.8442857142857143
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.8957142857142857
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7803597349541593
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7434574829931975
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.747244951206714
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.6414285714285715
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7871428571428571
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8185714285714286
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8728571428571429
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6414285714285715
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.2623809523809524
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1637142857142857
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.08728571428571427
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.6414285714285715
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.7871428571428571
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.8185714285714286
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.8728571428571429
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7591433790735056
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7226213151927439
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.7273040677676369
name: Cosine Map@100
BGE base Financial Matryoshka
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset:
- json
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
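Because the final Normalize() module rescales the CLS-pooled vector to unit length, cosine similarity and dot product give the same rankings for these embeddings. A quick sanity-check sketch (the query text is taken from the widget examples above):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("JulioSanchezD/bge-base-financial-matryoshka")
emb = model.encode(["What was the company's worldwide effective income tax rate in 2023?"])

print(emb.shape)                    # (1, 768)
print(np.linalg.norm(emb, axis=1))  # ~[1.0], thanks to the final Normalize() module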
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("JulioSanchezD/bge-base-financial-matryoshka")
# Run inference
sentences = [
'Cardiovascular/Metabolism/Other products sales were $3.7 billion, a decline of 5.5% as compared to the prior year.',
'What was the revenue decline percentage for Cardiovascular/Metabolism/Other products in 2023?',
"How is a membership's territory determined according to the description?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
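Because the model was trained with MatryoshkaLoss, the 768-dimensional embeddings can be truncated to the smaller trained dimensions (512, 256, 128, 64) with only a modest drop in retrieval quality (see the metrics below). A minimal sketch using the truncate_dim option of sentence-transformers:

from sentence_transformers import SentenceTransformer

# Truncate the 768-dimensional output to 256 dimensions at load time
model = SentenceTransformer("JulioSanchezD/bge-base-financial-matryoshka", truncate_dim=256)

embeddings = model.encode([
    "How did Rest of Asia Pacific net sales change in 2023 compared to 2022?",
    "Rest of Asia Pacific net sales increased 1% or $240 million during 2023 compared to 2022.",
])
print(embeddings.shape)
# (2, 256)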
Evaluation
Metrics
Information Retrieval
- Datasets: dim_768, dim_512, dim_256, dim_128 and dim_64
- Evaluated with InformationRetrievalEvaluator
Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|
cosine_accuracy@1 | 0.68 | 0.6743 | 0.6729 | 0.6643 | 0.6414 |
cosine_accuracy@3 | 0.8186 | 0.81 | 0.8029 | 0.8 | 0.7871 |
cosine_accuracy@5 | 0.8614 | 0.8586 | 0.85 | 0.8443 | 0.8186 |
cosine_accuracy@10 | 0.9057 | 0.9129 | 0.9043 | 0.8957 | 0.8729 |
cosine_precision@1 | 0.68 | 0.6743 | 0.6729 | 0.6643 | 0.6414 |
cosine_precision@3 | 0.2729 | 0.27 | 0.2676 | 0.2667 | 0.2624 |
cosine_precision@5 | 0.1723 | 0.1717 | 0.17 | 0.1689 | 0.1637 |
cosine_precision@10 | 0.0906 | 0.0913 | 0.0904 | 0.0896 | 0.0873 |
cosine_recall@1 | 0.68 | 0.6743 | 0.6729 | 0.6643 | 0.6414 |
cosine_recall@3 | 0.8186 | 0.81 | 0.8029 | 0.8 | 0.7871 |
cosine_recall@5 | 0.8614 | 0.8586 | 0.85 | 0.8443 | 0.8186 |
cosine_recall@10 | 0.9057 | 0.9129 | 0.9043 | 0.8957 | 0.8729 |
cosine_ndcg@10 | 0.7936 | 0.7923 | 0.7878 | 0.7804 | 0.7591 |
cosine_mrr@10 | 0.7575 | 0.7537 | 0.7507 | 0.7435 | 0.7226 |
cosine_map@100 | 0.7613 | 0.757 | 0.7544 | 0.7472 | 0.7273 |
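Each dim_* column corresponds to evaluating the same model with embeddings truncated to that dimensionality. A minimal sketch of how such an evaluation is typically wired up with sentence-transformers; the queries, corpus, and relevant_docs below are tiny hypothetical placeholders, not the actual evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("JulioSanchezD/bge-base-financial-matryoshka")

# Hypothetical evaluation data: query id -> text, doc id -> text, query id -> set of relevant doc ids
queries = {"q1": "How did Rest of Asia Pacific net sales change in 2023 compared to 2022?"}
corpus = {"d1": "Rest of Asia Pacific net sales increased 1% or $240 million during 2023 compared to 2022."}
relevant_docs = {"q1": {"d1"}}

# One evaluator per Matryoshka dimension; truncate_dim crops the embeddings before scoring
for dim in [768, 512, 256, 128, 64]:
    evaluator = InformationRetrievalEvaluator(
        queries=queries,
        corpus=corpus,
        relevant_docs=relevant_docs,
        name=f"dim_{dim}",
        truncate_dim=dim,
    )
    results = evaluator(model)
    print(results)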
Training Details
Training Dataset
json
- Dataset: json
- Size: 6,300 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:
 | positive | anchor |
---|---|---|
type | string | string |
details | min: 11 tokens, mean: 45.6 tokens, max: 288 tokens | min: 9 tokens, mean: 20.54 tokens, max: 46 tokens |
- Samples:
positive | anchor |
---|---|
Operating Expenses Our operating expenses consisted of the following: Year Ended December 31, | |
Increases in yield, discount rate, capitalization rate or duration used in the valuation of level 3 investments would have resulted in a lower fair value measurement, while increases in recovery rate or multiples would have resulted in a higher fair value measurement as of both December 2023 and December 2022. | What was the impact on the fair value measurement of level 3 investments when the yield, discount rate, and capitalization rate were increased? |
At December 31, 2023, Ford Credit’s liquidity sources, including cash, committed asset-backed facilities, and unsecured credit facilities, totaled $56.2 billion, up $5.2 billion from year-end 2022. | What sources contribute to Ford Credit’s liquidity as of December 31, 2023, and what was their total value? |
- Loss: MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [768, 512, 256, 128, 64],
    "matryoshka_weights": [1, 1, 1, 1, 1],
    "n_dims_per_step": -1
}
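For reference, a MatryoshkaLoss with these parameters is typically constructed as below; the train.jsonl path is a hypothetical placeholder for the anchor/positive pairs described above:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Hypothetical local file with one {"anchor": ..., "positive": ...} record per line
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")

# In-batch negatives loss, applied at every Matryoshka dimension with equal weight
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
)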
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
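These values map onto SentenceTransformerTrainingArguments roughly as follows; a minimal sketch that reuses the model, train_dataset, and loss from the previous snippet, with a hypothetical output directory and held-out split:

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-financial-matryoshka",  # hypothetical output path
    num_train_epochs=4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=True,
    optim="adamw_torch_fused",
    eval_strategy="epoch",
    save_strategy="epoch",  # must match eval_strategy when load_best_model_at_end=True
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate texts within a batch for in-batch negatives
)

# Hypothetical held-out split for the epoch-level evaluation implied by eval_strategy="epoch"
splits = train_dataset.train_test_split(test_size=0.1, seed=42)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    loss=loss,
)
trainer.train()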
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|---|
0.8122 | 10 | 1.5473 | - | - | - | - | - |
0.9746 | 12 | - | 0.7821 | 0.7814 | 0.7723 | 0.7543 | 0.7229 |
1.6244 | 20 | 0.6848 | - | - | - | - | - |
1.9492 | 24 | - | 0.7906 | 0.7877 | 0.7824 | 0.7729 | 0.7519 |
2.4365 | 30 | 0.5164 | - | - | - | - | - |
2.9239 | 36 | - | 0.7921 | 0.7924 | 0.7887 | 0.7778 | 0.7587 |
3.2487 | 40 | 0.4455 | - | - | - | - | - |
3.8985 | 48 | - | 0.7936 | 0.7923 | 0.7878 | 0.7804 | 0.7591 |
- The row at epoch 3.8985 (step 48) denotes the saved checkpoint.
Framework Versions
- Python: 3.12.9
- Sentence Transformers: 3.4.1
- Transformers: 4.41.2
- PyTorch: 2.6.0+cu126
- Accelerate: 1.4.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}