SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
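
The three modules above amount to: run the transformer, keep the vector of the first ([CLS]) token (`pooling_mode_cls_token: True`), and scale it to unit length (`Normalize()`). The following is a minimal sketch of the last two steps on made-up token vectors; the helper names `cls_pool` and `l2_normalize` are illustrative and not part of the library, and the toy shapes stand in for the real 1024-dimensional outputs.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy transformer output: 3 tokens x 4 dims.
# The real model produces up to 512 tokens x 1024 dims.
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],   # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, -0.5, 2.0, 0.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output is unit length, the cosine similarity of two embeddings equals their dot product.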

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("shivXy/ot-midterm-v0")
# Run inference
sentences = [
    '1. What was the focus of the pilot study mentioned regarding tendinosis of the Achilles tendon?',
    'tendinosis of the Achilles tendon: a pilot study. AJR Am J Roentgenol . \n2007;189:W215–W220 . \n74. van Leeuwen WF , Janssen SJ , Ring D , Chen N . Incidental magnetic resonance \nimaging signal changes in the extensor carpi radialis brevis origin are more \ncommon with age. J Shoulder Elbow Surg . 2016;25:1175–1181 . \n75. Rabago D , Lee KS , Ryan M , et al. Hypertonic dextrose and morrhuate sodium \ninjections (prolotherapy) for lateral epicondylosis (tennis elbow): results of a \nsingle-blind, pilot-level, randomized controlled trial. Am J Phys Med Rehabil . \n2013;92:587–596 . \n76. Scarpone M , Rabago DP , Zgierska A , Arbogast G , Snell E . The efficacy \nof prolotherapy for lateral epicondylosis: a pilot study. Clin J Sport Med .',
    '179. Dick FD , Graveling RA , Munro W , Walker-Bone K . Workplace management of \nupper limb disorders: a systematic review. Occup Med (Lond) . 2011;61:19–25 . \n180. Buchanan H , Van Niekerk L , Grimmer K . Work transition after hand injury: a \nscoping review. J Hand Ther . 2020 . \n181. Rost KA , Alvero AM . Participatory approaches to workplace safety man- \nagement: bridging the gap between behavioral safety and participatory er- \ngonomics. Int J Occup Saf Ergon . 2020;26:194–203 . \n182. Bernardes JM , Ruiz-Frutos C , Moro ARP , Dias A . A low-cost and efficient par- \nticipatory ergonomic intervention to reduce the burden of work-related mus- \nculoskeletal disorders in an industrially developing country: an experience re-',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
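
Since the model was trained with MatryoshkaLoss at 768, 512, 256, 128 and 64 dimensions (see Training Details), shorter prefixes of an embedding may still be usable for similarity. The sketch below shows the usual truncate-then-renormalize step on stand-in vectors; the helper `truncate_and_normalize` is hypothetical, and how much retrieval quality is retained at each size has not been measured in this card.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values, then rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Stand-in 8-dimensional unit vectors; real embeddings here have 1024 values,
# and the trained prefix sizes are 768, 512, 256, 128 and 64.
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
passage = [0.5, 0.5, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0]

full_score = dot(query, passage)

# A truncated prefix is no longer unit length, so renormalize before comparing.
short_query = truncate_and_normalize(query, 2)
short_passage = truncate_and_normalize(passage, 2)
short_score = dot(short_query, short_passage)

print(full_score, short_score)
```

Recent releases of Sentence Transformers also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be a more convenient route than manual slicing.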

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.9545
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.9545
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.9545
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9832
cosine_mrr@10 0.9773
cosine_map@100 0.9773
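
The card does not say how many queries were evaluated or how many relevant passages each query has. As a rough illustration of how these metrics relate, the sketch below assumes a hypothetical set of 22 queries with exactly one relevant passage each, where 21 queries rank it first and one ranks it second. These invented ranks happen to reproduce the figures above (for example, precision@3 of 0.3333 is what one hit in three results gives), but that does not confirm the actual evaluation setup.

```python
import math


def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def precision_at_k(ranks, k):
    """With one relevant passage per query, a hit contributes 1/k."""
    return sum(1 / k for r in ranks if r <= k) / len(ranks)


def mrr_at_k(ranks, k):
    """Mean reciprocal rank of the relevant passage, cut off at k."""
    return sum(1 / r for r in ranks if r <= k) / len(ranks)


def ndcg_at_k(ranks, k):
    """NDCG with a single relevant passage: the ideal DCG is 1."""
    return sum(1 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)


# ASSUMPTION: 1-based rank of the single relevant passage for each of
# 22 hypothetical queries (21 at rank 1, one at rank 2).
ranks = [1] * 21 + [2]

print(round(accuracy_at_k(ranks, 1), 4))    # 0.9545
print(round(accuracy_at_k(ranks, 3), 4))    # 1.0
print(round(precision_at_k(ranks, 3), 4))   # 0.3333
print(round(precision_at_k(ranks, 10), 4))  # 0.1
print(round(mrr_at_k(ranks, 10), 4))        # 0.9773
print(round(ndcg_at_k(ranks, 10), 4))       # 0.9832
```

With a single relevant passage per query, recall@k equals accuracy@k, which matches the recall rows of the table.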

Training Details

Training Dataset

Unnamed Dataset

  • Size: 812 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 812 samples:
             sentence_0      sentence_1
    type     string          string
    min      12 tokens       8 tokens
    mean     24.85 tokens    158.41 tokens
    max      60 tokens       320 tokens
  • Samples:
    Sample 1
    sentence_0:
      2. What type of intervention is being compared to strength training in the study protocol by Sundstrup E and colleagues?
    sentence_1:
      ment for work-related lateral epicondylitis. Work . 2010;37:81–86 .
      161. Parimalam P , Premalatha MR , Padmini DS , Ganguli AK . Participatory er-
      gonomics in redesigning a dyeing tub for fabric dyers. Work . 2012;43:453–458 .
      162. Harari D , Casarotto RA . Effectiveness of a multifaceted intervention to manage
      musculoskeletal disorders in workers of a medium-sized company. Int J Occup
      Saf Ergon . 2021;27:247–257 .
      163. Sundstrup E , Jakobsen MD , Andersen CH , et al. Participatory ergonomic inter-
      vention versus strength training on chronic pain and work disability in slaugh-
      terhouse workers: study protocol for a single-blind, randomized controlled
      trial. BMC Musculoskelet Disord . 2013;14:67 .

    Sample 2
    sentence_0:
      2. What does the increased signal intensity in the proximal portion of the lateral collateral ligament suggest about the patient's condition?
    sentence_1:
      266 C.W. Stegink-Jansen, J.G. Bynum, A.L. Lambropoulos et al. / Journal of Hand Therapy 34 (2021) 263–297
      Fig. 3. Pathology. A 60-year-old female with right elbow pain for 5 weeks. (A) Coronal fat-suppressed FSE T2-weighted image showing mild thickening of the proximal
      portion of the common extensor tendon with increased signal intensity (arrow), suggesting mild injury. Irregular thickening with increased signal intensity in the proximal
      portion of the lateral collateral ligament (arrowhead) is also noted, suggesting mild injury. (B) and (C) Coronal PD FSE image and oblique radiograph showing cortical

    Sample 3
    sentence_0:
      2. What factors are assessed in relation to lower extremity (LE) issues according to the systematic review?
    sentence_1:
      Manufacturing:
      • Electronics
      • Auto parts
      • Windows
      • Cabinets
      • Medical equipment
      • Fitness equipment
      Healthcare (excluding direct
      patient care):
      • Hospitals
      • Health research
      Worker: structured interviews,
      physical examinations
      Environment: workplace walk
      through
      Hazards in work: individual
      assessments of biomechanical and
      psychosocial factors
      LE related to frequency of forceful
      exertions or forearm supination and
      forceful lifting; increased odds of LE
      related to being age 36-50, female,
      or a smoker; high social support
      appeared protective against LE
      van Rijn et al. 2009 Assess relationship between
      work-related physical factors,
      psychosocial factors, and LE
      Systematic review
      13 studies:
      • 9 cross-sectional
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
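
To make the configuration above concrete: MatryoshkaLoss applies the inner loss (here MultipleNegativesRankingLoss) to the embeddings truncated to each listed size and adds up the weighted results, and `n_dims_per_step: -1` means every size is used at each step. The following is a simplified pure-Python sketch of that idea, not the library's implementation. The `scale=20.0` value and the toy 8-dimensional vectors with prefix sizes 8, 4 and 2 are assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: for anchor i, positives[j] with j != i are negatives.

    Cross-entropy over scaled cosine similarities, target = matching index.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss over truncated embedding prefixes."""
    return sum(
        w * mnr_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )


# Toy batch of two (question, passage) pairs; values are made up.
anchors = [
    [1.0, 0.0, 0.0, 0.0, 0.2, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.1, 0.2, 0.0, 0.0],
]
positives = [
    [0.9, 0.1, 0.0, 0.0, 0.2, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0],
]

dims, weights = [8, 4, 2], [1, 1, 1]
matched = matryoshka_loss(anchors, positives, dims, weights)
swapped = matryoshka_loss(anchors, positives[::-1], dims, weights)
print(matched < swapped)  # True: correct pairings give the lower loss
```

In the real configuration the sizes are 768, 512, 256, 128 and 64, all with weight 1, so each prefix contributes equally to the total.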
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch    Step   Training Loss   cosine_ndcg@10
0.6098   50     -               1.0
1.0      82     -               0.9773
1.2195   100    -               0.9664
1.8293   150    -               1.0
2.0      164    -               0.9832
2.4390   200    -               0.9832
3.0      246    -               0.9832
3.0488   250    -               0.9832
3.6585   300    -               0.9832
4.0      328    -               0.9832
4.2683   350    -               0.9832
4.8780   400    -               0.9832
5.0      410    -               0.9832
5.4878   450    -               0.9832
6.0      492    -               0.9832
6.0976   500    0.6578          0.9832
6.7073   550    -               0.9664
7.0      574    -               0.9664
7.3171   600    -               0.9664
7.9268   650    -               0.9664
8.0      656    -               0.9664
8.5366   700    -               0.9832
9.0      738    -               0.9832
9.1463   750    -               0.9832
9.7561   800    -               0.9832
10.0     820    -               0.9832
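
The step and epoch columns follow from the training set size and batch size reported above. The short calculation below assumes a single device, which the card does not state explicitly; `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` are taken from the hyperparameter list.

```python
import math

train_samples = 812   # from "Size: 812 training samples"
batch_size = 10       # per_device_train_batch_size
epochs = 10           # num_train_epochs

# The last, partial batch is kept (dataloader_drop_last: False), so round up.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps)                 # 82 820
print(epoch_at(50), epoch_at(100), epoch_at(500))   # 0.6098 1.2195 6.0976
```

A training loss appears only at step 500, which would be consistent with the Transformers default of logging every 500 steps; the card does not list `logging_steps`, so this is an inference rather than a stated setting.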

Framework Versions

  • Python: 3.13.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cpu
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Safetensors

  • Model size: 334M params
  • Tensor type: F32