tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:812
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
  - source_sentence: >-
      1. What ergonomic factors contributed to the incidence of lateral and
      medial epicondylitis among workers?
    sentences:
      - |-
        ble, science-based suggestions for alterations during work-related, 
        musical performance, or sporting activities. 
        Use of the NIH framework of behavioral study designs provided 
        opportunity to examine the progression of development and bene- 
        fits and measurement used in ergonomic studies. 25 
        Direct measurements to confirm the effectiveness and accuracy 
        of implemented ergonomic adaptations were found in stage 0 
        and stage 1 studies. In stage 0, the basic science and proof of 
        principle stage, intervention development consistently included 
        measurements of motion and/or muscle activation measurements. 
        Visualizations of ergonomic design implementations or visual- 
        izations of exercises applied in the work setting were shown in
      - >-
        47. Ljung BO , Forsgren S , Fridén J . Substance P and calcitonin
        gene-related peptide 

        expression at the extensor carpi radialis brevis muscle origin:
        implications for 

        the etiology of tennis elbow. J Orthop Res . 1999;17:554–559 . 

        48. Alfredson H , Ljung BO , Thorsen K , Lorentzon R . In vivo
        investigation of ECRB 

        tendons with microdialysis technique–no signs of inflammation but high 

        amounts of glutamate in tennis elbow. Acta Orthop Scand . 20 0
        0;71:475–479 . 

        49. Nirschl RP . Lateral epicondylitis/tendinosis. In: Morrey BF,
        Sanchez-Sotelo J, 

        Morrey ME, eds. The Elbow and Its Disorders Elsevier; 2018:574–581 . 

        50. Cha YK , Kim SJ , Park NH , Kim JY , Kim JH , Park JY . Magnetic
        resonance
      - |-
        shoulder, 21 of 88 elbow (19 lateral 
        epicondylitis, 2 medial epicondylitis), 19 of 
        88 wrist, 13 of 88 fingers 51.1% could not 
        return to the same job 
        Tool: intolerance to vibration still present 
        after 4 Y, especially for women 
        Ergonomic factors: poor seating, reaching, 
        postures, and bad tool design 
        Interventions: decreased incidence rate of 
        LE from 2.1 to 0.1 
        Smith 2001 Demonstrate the impact of a 
        concern by one worker to 
        raise awareness of 
        departmental problems, 
        leading to general packaging 
        improvements made by the 
        supplier of seals of vacutainer 
        needles 
        Case report 
        A phlebotomist with LE due to 
        forceful gripping and 
        repetitive twisting of seal on 
        vacutainer needles 
        Rest: temporary rest from job
  - source_sentence: >-
      1. What were the effects of exercise compared to ergonomics on pain levels
      in the shoulder, elbows, and hand/wrists?
    sentences:
      - |-
        palpation, maximal muscle 
        strength of arm and hand, 
        function of arm and hand, 
        Disability of the Arm, Shoulder, 
        and Hand questionnaire 
        Pain decreased more in the exercise versus 
        the ergonomics group in the shoulder, 
        elbows and hand/wrists. The DASH scores 
        increased in the ergonomics group, but 
        decreased in the strengthening group. 
        Number of subjects with improved results 
        were significantly higher in the exercise 
        group as compared to the ergonomics 
        group. 
        Results can be generalized to adults with 
        upper limb chronic pain exposed to highly 
        repetitive and forceful manual work 
        Soler-Font 2019 Compare effects of 3 types of 
        prevention interventions on 
        pain and work functioning 
        between nursing and control 
        group
      - |-
        pose, study design and methods, study sample, intervention or ex- 
        posures (when appropriate), type of measurement and results or 
        outcomes. The detailed information in Tables 1 through 4 are listed 
        in the appendix. A synthesis table, Table 5, was constructed to in- 
        tegrate the biomechanical and population exposure results into a 
        set of defined harm reducing recommendations, and is presented 
        in the results section. 
        Results 
        Tissue involvement and tissue-level interventions 
        An anatomical landscape of the lateral elbow 
        An in-depth description of elbow anatomy is provided by Mor- 
        rey, 29 who describes the architecture of the osteology, elbow joint 
        structures including their joint capsules and ligaments, bursae, ves-
      - |-
        strength, and functional scores at 6 months when compared to use 
        of a wrist orthosis and corticosteroid injections. 84 
        Various types of interventions are used to alter or lessen the 
        load on impacted tissues. The benefit of unloading or alteration of 
        the area of loading is based on the hypothesis that loading of a 
        painful tendon perpetuates nociceptive stimuli, and that the sec- 
        ondary hyperalgesia in tendinopathy is a response to ongoing no- 
        ciception. 38 Counterforce bracing has been shown to alter force
  - source_sentence: >-
      1. What are the differences in strength and efficiency between the ECRB
      and ECRL muscles during isometric and dynamic conditions?
    sentences:
      - |-
        about twice as great for the ECRB compared to 
        the ECRL muscle. 
        ECRB and ECRL are efficient synergists; the ECRB 
        is stronger isometrically, the ECRL becomes the 
        stronger muscle as angular velocity increases. The 
        synergy of the ECRB and ECRL takes 30% less 
        mass in comparison to if one muscle would 
        generate the force. 
        Ljung 1999 To measure sarcomere length change in 
        the ECRB muscle during ulnar deviation 
        with the wrist in both the neutral and 
        pronated position. 
        Repeated measures of sarcomere length 
        changes in 4 conditions: (1) wrist in 
        neutral (in radial-ulnar and forearm in 
        neutral rotation), (2) forearm neutral 
        rotation + wrist in ulnar deviation, (3) 
        wrist in neutral + forearm in pronation,
      - |-
        oping LE. A high exposure (RSI > 5), older age, and self-perceived 
        poor general health were associated with incidence of LE. 133
      - |-
        lessen excursion of pronation and supination. For 
        instance, change tennis technique to balance foot 
        support, trunk rotation, and arm contribution to the 
        force application to the racket, instead of getting the 
        force and speed mostly out of the arm. 
        Higher force production takes place during 
        eccentric contractions for ECRB and ECRL. 
        Repetitive movement with eccentric wrist 
        extensor contractions 
        • Avoid or modify activities that require full-length 
        weighted stretch of ECRL and ECRB. 
        • Seek design solutions so that objects do not need to 
        be lowered manually at a fast rate. 
        The synergy of the ECRL and ERCB is a useful 
        mechanism for optimal function at higher 
        angular velocities. 
        High rapid rate of force development,
  - source_sentence: >-
      1. What type of trial was conducted to compare corticoid and saline
      treatments in the study published in the Am J Sports Med in 2013?
    sentences:
      - |-
        lessen excursion of pronation and supination. For 
        instance, change tennis technique to balance foot 
        support, trunk rotation, and arm contribution to the 
        force application to the racket, instead of getting the 
        force and speed mostly out of the arm. 
        Higher force production takes place during 
        eccentric contractions for ECRB and ECRL. 
        Repetitive movement with eccentric wrist 
        extensor contractions 
        • Avoid or modify activities that require full-length 
        weighted stretch of ECRL and ECRB. 
        • Seek design solutions so that objects do not need to 
        be lowered manually at a fast rate. 
        The synergy of the ECRL and ERCB is a useful 
        mechanism for optimal function at higher 
        angular velocities. 
        High rapid rate of force development,
      - |-
        histology 
        Controls: tendon with tendinous plate (aponeurosis) 
        Extensor carpi radialis brevis (ECRB) degeneration 
        occurs with age > 50 (not the peak years for LE), 
        therefore age > 50 not a factor for LE 
        LE: edema in aponeurotic tissues, underneath 
        aponeurosis granulation tissue, with fibrosis, and 
        free nerve endings (pain), hypervascularization of 
        the aponeurosis 
        Ljung Forsgren, Friden 
        1990 
        To describe substance P and calcitonin 
        gene-related peptide (CGRP) in 
        patients with LE and healthy 
        subjects 
        Cross-sectional cohort comparison of 6 
        patients intra-operatively, and biopsies 
        of 6 healthy volunteers 
        Specimens from patients included area 
        close to the bone, but area close to the 
        bone was not included in healthy 
        subjects
      - >-
        corticoid, or saline: a randomized, double-blind, placebo-controlled
        trial. Am J 

        Sports Med . 2013;41:625–635 . 

        193. Houck DA , Kraeutler MJ , Thornton LB , McCarty EC , Bravman JT .T
        r e a t m e n t of 

        lateral epicondylitis with autologous blood, platelet-rich plasma, or
        corticos- 

        teroid injections: a systematic review of overlapping meta-analyses.
        Orthop J 

        Sports Med . 2019;7 . 

        194. Li A , Wang H , Yu Z , et al. Platelet-rich plasma vs
        corticosteroids for elbow 

        epicondylitis: A systematic review and meta-analysis. Medicine
        (Baltimore) . 

        2019;98:e18358 . 

        195. Arirachakaran A , Sukthuayat A , Sisayanarane T ,
        Laoratanavoraphong S , Kan- 

        chanatawan W , Kongtharvonskul J . Platelet-rich plasma versus
        autologous
  - source_sentence: >-
      1. What was the focus of the pilot study mentioned regarding tendinosis of
      the Achilles tendon?
    sentences:
      - >-
        C.W. Stegink-Jansen, J.G. Bynum, A.L. Lambropoulos et al. / Journal of
        Hand Therapy 34 (2021) 263–297 285 

        Table 3 ( continued ) 

        AUTHOR/YEAR PURPOSE DESIGN SUBJECTS EXPOSURES EXPOSURE MEASUREMENTS
        RESULTS 

        Herquelot et al. 2013b 

        Scand J Work Environ 

        Health 39(6):578-588 

        Estimate association between 

        occupational risk factors and 

        incidence of LE 

        Cohort study (cross-sectional and 

        incidence) 

        3710 workers; 1046 completed 

        follow-up 

        Repetition, physical exertion, arm 

        movements 

        Worker: health assessment 

        Hazards in work: self-reported 

        Repetitive tasks and high physical 

        exertion with elbow movements 

        contributed to incidence of LE 

        Nordander et al. 2013 Explore relationships between 

        occupational risk factors and
      - >-
        179. Dick FD , Graveling RA , Munro W , Walker-Bone K . Workplace
        management of 

        upper limb disorders: a systematic review. Occup Med (Lond) .
        2011;61:19–25 . 

        180. Buchanan H , Van Niekerk L , Grimmer K . Work transition after hand
        injury: a 

        scoping review. J Hand Ther . 2020 . 

        181. Rost KA , Alvero AM . Participatory approaches to workplace safety
        man- 

        agement: bridging the gap between behavioral safety and participatory
        er- 

        gonomics. Int J Occup Saf Ergon . 2020;26:194–203 . 

        182. Bernardes JM , Ruiz-Frutos C , Moro ARP , Dias A . A low-cost and
        efficient par- 

        ticipatory ergonomic intervention to reduce the burden of work-related
        mus- 

        culoskeletal disorders in an industrially developing country: an
        experience re-
      - >-
        tendinosis of the Achilles tendon: a pilot study. AJR Am J Roentgenol . 

        2007;189:W215–W220 . 

        74. van Leeuwen WF , Janssen SJ , Ring D , Chen N . Incidental magnetic
        resonance 

        imaging signal changes in the extensor carpi radialis brevis origin are
        more 

        common with age. J Shoulder Elbow Surg . 2016;25:1175–1181 . 

        75. Rabago D , Lee KS , Ryan M , et al. Hypertonic dextrose and
        morrhuate sodium 

        injections (prolotherapy) for lateral epicondylosis (tennis elbow):
        results of a 

        single-blind, pilot-level, randomized controlled trial. Am J Phys Med
        Rehabil . 

        2013;92:587–596 . 

        76. Scarpone M , Rabago DP , Zgierska A , Arbogast G , Snell E . The
        efficacy 

        of prolotherapy for lateral epicondylosis: a pilot study. Clin J Sport
        Med .
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.9545454545454546
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9545454545454546
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33333333333333326
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.20000000000000007
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.10000000000000003
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9545454545454546
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9832240797077936
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9772727272727273
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9772727272727273
            name: Cosine Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
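
The architecture above pools with the CLS token (pooling_mode_cls_token: True) and then applies Normalize(). A minimal pure-Python sketch of those two steps follows. The helper names and the toy 2-dimensional token vectors are illustrative assumptions; the real model produces 1024-dimensional vectors.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector, which is the CLS position."""
    return list(token_embeddings[0])

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Toy token embeddings for one sentence (hypothetical values).
tokens = [[3.0, 4.0], [10.0, -2.0], [0.5, 0.5]]

# Only the first row is used; the other token vectors are ignored by CLS pooling.
sentence_vec = l2_normalize(cls_pool(tokens))
unit_length = math.sqrt(sum(x * x for x in sentence_vec))
```

Because the output has unit length, the cosine similarity of two such vectors reduces to their dot product.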

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("shivXy/ot-midterm-v0")
# Run inference
sentences = [
    '1. What was the focus of the pilot study mentioned regarding tendinosis of the Achilles tendon?',
    'tendinosis of the Achilles tendon: a pilot study. AJR Am J Roentgenol . \n2007;189:W215–W220 . \n74. van Leeuwen WF , Janssen SJ , Ring D , Chen N . Incidental magnetic resonance \nimaging signal changes in the extensor carpi radialis brevis origin are more \ncommon with age. J Shoulder Elbow Surg . 2016;25:1175–1181 . \n75. Rabago D , Lee KS , Ryan M , et al. Hypertonic dextrose and morrhuate sodium \ninjections (prolotherapy) for lateral epicondylosis (tennis elbow): results of a \nsingle-blind, pilot-level, randomized controlled trial. Am J Phys Med Rehabil . \n2013;92:587–596 . \n76. Scarpone M , Rabago DP , Zgierska A , Arbogast G , Snell E . The efficacy \nof prolotherapy for lateral epicondylosis: a pilot study. Clin J Sport Med .',
    '179. Dick FD , Graveling RA , Munro W , Walker-Bone K . Workplace management of \nupper limb disorders: a systematic review. Occup Med (Lond) . 2011;61:19–25 . \n180. Buchanan H , Van Niekerk L , Grimmer K . Work transition after hand injury: a \nscoping review. J Hand Ther . 2020 . \n181. Rost KA , Alvero AM . Participatory approaches to workplace safety man- \nagement: bridging the gap between behavioral safety and participatory er- \ngonomics. Int J Occup Saf Ergon . 2020;26:194–203 . \n182. Bernardes JM , Ruiz-Frutos C , Moro ARP , Dias A . A low-cost and efficient par- \nticipatory ergonomic intervention to reduce the burden of work-related mus- \nculoskeletal disorders in an industrially developing country: an experience re-',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
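
Since the model was trained with MatryoshkaLoss, a shortened prefix of each embedding should still be usable for ranking. The sketch below shows one way to truncate, renormalize, and rank passages against a query. It runs on toy 4-dimensional vectors rather than real model output, and `truncate_and_normalize` and `rank_passages` are hypothetical helpers, not library functions.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` values and rescale them to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def rank_passages(query, passages, dim):
    """Return passage indices ordered by cosine similarity, best first."""
    q = truncate_and_normalize(query, dim)
    scored = []
    for idx, passage in enumerate(passages):
        p = truncate_and_normalize(passage, dim)
        score = sum(a * b for a, b in zip(q, p))
        scored.append((score, idx))
    scored.sort(reverse=True)
    return [idx for _, idx in scored]

# Toy embeddings (hypothetical values standing in for model.encode output).
query = [1.0, 0.0, 0.5, 0.5]
passages = [
    [0.9, 0.1, 0.4, 0.6],
    [0.0, 1.0, 0.2, 0.1],
    [0.5, 0.5, 0.0, 0.0],
]

full_ranking = rank_passages(query, passages, dim=4)
short_ranking = rank_passages(query, passages, dim=2)
```

With real embeddings, `dim` would be one of the trained sizes (768, 512, 256, 128, or 64), and retrieval quality at each size should be measured rather than assumed.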

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.9545
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.9545
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.9545
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9832
cosine_mrr@10 0.9773
cosine_map@100 0.9773
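
The values above are consistent with an evaluation set of 22 queries, each with a single relevant passage, where 21 queries rank that passage first and one ranks it second. That split is an inference from the numbers, not something the card states. A short sketch that recomputes the metrics under this assumption:

```python
import math

# Assumed 1-based rank of the single relevant passage for each of 22 queries.
ranks = [1] * 21 + [2]

def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def precision_at_k(ranks, k):
    """With one relevant passage per query, at most one of the top k is a hit."""
    return sum(r <= k for r in ranks) / (k * len(ranks))

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits inside the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)

def ndcg_at_k(ranks, k):
    """Binary-relevance NDCG; the ideal DCG is 1 when there is one relevant item."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)

acc1 = accuracy_at_k(ranks, 1)
prec3 = precision_at_k(ranks, 3)
mrr10 = mrr_at_k(ranks, 10)
ndcg10 = ndcg_at_k(ranks, 10)
```

Under the same assumption, recall@k equals accuracy@k and MAP@100 equals MRR@10, which matches the table.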

Training Details

Training Dataset

Unnamed Dataset

  • Size: 812 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 812 samples:
    • sentence_0 (string): min 12 tokens, mean 24.85 tokens, max 60 tokens
    • sentence_1 (string): min 8 tokens, mean 158.41 tokens, max 320 tokens
  • Samples:
    Sample 1
    sentence_0: 2. What type of intervention is being compared to strength training in the study protocol by Sundstrup E and colleagues?
    sentence_1:
      ment for work-related lateral epicondylitis. Work . 2010;37:81–86 .
      161. Parimalam P , Premalatha MR , Padmini DS , Ganguli AK . Participatory er-
      gonomics in redesigning a dyeing tub for fabric dyers. Work . 2012;43:453–458 .
      162. Harari D , Casarotto RA . Effectiveness of a multifaceted intervention to manage
      musculoskeletal disorders in workers of a medium-sized company. Int J Occup
      Saf Ergon . 2021;27:247–257 .
      163. Sundstrup E , Jakobsen MD , Andersen CH , et al. Participatory ergonomic inter-
      vention versus strength training on chronic pain and work disability in slaugh-
      terhouse workers: study protocol for a single-blind, randomized controlled
      trial. BMC Musculoskelet Disord . 2013;14:67 .

    Sample 2
    sentence_0: 2. What does the increased signal intensity in the proximal portion of the lateral collateral ligament suggest about the patient's condition?
    sentence_1:
      266 C.W. Stegink-Jansen, J.G. Bynum, A.L. Lambropoulos et al. / Journal of Hand Therapy 34 (2021) 263–297
      Fig. 3. Pathology. A 60-year-old female with right elbow pain for 5 weeks. (A) Coronal fat-suppressed FSE T2-weighted image showing mild thickening of the proximal
      portion of the common extensor tendon with increased signal intensity (arrow), suggesting mild injury. Irregular thickening with increased signal intensity in the proximal
      portion of the lateral collateral ligament (arrowhead) is also noted, suggesting mild injury. (B) and (C) Coronal PD FSE image and oblique radiograph showing cortical

    Sample 3
    sentence_0: 2. What factors are assessed in relation to lower extremity (LE) issues according to the systematic review?
    sentence_1:
      Manufacturing:
      • Electronics
      • Auto parts
      • Windows
      • Cabinets
      • Medical equipment
      • Fitness equipment
      Healthcare (excluding direct
      patient care):
      • Hospitals
      • Health research
      Worker: structured interviews,
      physical examinations
      Environment: workplace walk
      through
      Hazards in work: individual
      assessments of biomechanical and
      psychosocial factors
      LE related to frequency of forceful
      exertions or forearm supination and
      forceful lifting; increased odds of LE
      related to being age 36-50, female,
      or a smoker; high social support
      appeared protective against LE
      van Rijn et al. 2009 Assess relationship between
      work-related physical factors,
      psychosocial factors, and LE
      Systematic review
      13 studies:
      • 9 cross-sectional
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
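
The parameters above wrap MultipleNegativesRankingLoss inside MatryoshkaLoss: the ranking loss is evaluated on several truncated prefixes of each embedding and the results are combined with the listed weights. The following is a pure-Python sketch of that idea, not the library's implementation. The scale of 20 and the use of cosine similarity are assumptions not stated in the card, and the 4-dimensional embeddings are toy values.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch candidates: anchor i should prefer positive i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        short_anchors = [a[:dim] for a in anchors]
        short_positives = [p[:dim] for p in positives]
        total += weight * mnr_loss(short_anchors, short_positives)
    return total

# Toy batch of two (question, passage) pairs; values are hypothetical.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
```

Correctly paired batches give a lower loss than mismatched ones at every prefix length, which is what pushes useful information into the leading dimensions.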
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
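
The scheduler settings in this list (lr_scheduler_type: linear, learning_rate: 5e-05, warmup_steps: 0) imply a learning rate that decays linearly to zero over training. A minimal sketch, assuming the usual linear warmup and decay formula and the 820 total steps shown in the training logs; `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step, base_lr=5e-05, total_steps=820, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

lr_start = linear_lr(0)
lr_middle = linear_lr(410)
lr_end = linear_lr(820)
```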

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0.6098 50 - 1.0
1.0 82 - 0.9773
1.2195 100 - 0.9664
1.8293 150 - 1.0
2.0 164 - 0.9832
2.4390 200 - 0.9832
3.0 246 - 0.9832
3.0488 250 - 0.9832
3.6585 300 - 0.9832
4.0 328 - 0.9832
4.2683 350 - 0.9832
4.8780 400 - 0.9832
5.0 410 - 0.9832
5.4878 450 - 0.9832
6.0 492 - 0.9832
6.0976 500 0.6578 0.9832
6.7073 550 - 0.9664
7.0 574 - 0.9664
7.3171 600 - 0.9664
7.9268 650 - 0.9664
8.0 656 - 0.9664
8.5366 700 - 0.9832
9.0 738 - 0.9832
9.1463 750 - 0.9832
9.7561 800 - 0.9832
10.0 820 - 0.9832
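
The step and epoch columns follow from the dataset size and batch size reported earlier. A short check of that arithmetic, assuming a single device, gradient_accumulation_steps of 1, and dataloader_drop_last of False as listed in the hyperparameters:

```python
import math

train_samples = 812
batch_size = 10
epochs = 10

# The last batch of each epoch is partial (2 samples) but still counts as a step.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

def epoch_at(step):
    """Fractional epoch for a given step, rounded as in the log table."""
    return round(step / steps_per_epoch, 4)

epoch_at_50 = epoch_at(50)
epoch_at_500 = epoch_at(500)
```

This gives 82 steps per epoch and 820 steps in total, matching the rows for epochs 1.0 and 10.0.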

Framework Versions

  • Python: 3.13.2
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cpu
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}