tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:33870508
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Wire-Free Targeted Axillary Dissection: A Pooled Analysis of 1300+ Cases
      Post-Neoadjuvant Systemic Therapy in Node-Positive Early Breast Cancer.
    sentences:
      - Transdiagnostic behavior therapy.
      - >-
        Recent advances in neoadjuvant systemic therapy between SLNB and MLNB
        was demonstrated. Notably, 49 days of cases, respectively. MLNB
        inclusion in axillary staging post-NST for initially node-positive
        patients is crucial. The radiation-free Savi Scout, with its minimal MRI
        artefacts, is the preferred technology for TAD.
      - >-
        Delirium affects approximately 20% to 25% of patients undergoing cardiac
        surgery and is particularly common in older adults. This article reviews
        the etiology and risk factors for delirium associated with cardiac
        surgery in older adults. Delirium screening, prevention, and treatment
        strategies, including both pharmacological and nonpharmacological
        therapies, are presented. Interventions appropriate in both the
        intensive care unit and.
  - source_sentence: >-
      Experimental studies on the geometrical characteristics determining the
      system behavior of surface tension autooscillations.
    sentences:
      - >-
        Autooscillation of the surface tension is a phenomenon related to
        Marangoni instability periodically arising and fading by dissolution of
        a surfactant droplet under a water-air interface. A detailed
        experimental investigation was performed to clear up the influence of
        the system geometry on development and characteristics of
        autooscillations. It was found that the aspect ratio is an additional
        dimensionless parameter that determines the system behavior equally to
        the Marangoni number. The influence of the cell diameter, capillary
        immersion depth, and droplet radius on the autooscillation period and
        amplitude was studied as well.
      - >-
        Genome-wide methylation profiling is used in breast cancer (BC) studies,
        because DNA methylation is a crucial epigenetic regulator of gene
        expression, involved in many diseases including BC. We investigated
        genome-wide methylation profiles in both canine mammary tumor (CMT)
        tissues and peripheral blood mononuclear cells (PBMCs) using reduced
        representation bisulfite sequencing (RRBS) and found unique CMT-enriched
        methylation signatures. A total of 2.2–4.2 million
        cytosine–phosphate–guanine (CpG) sites were analyzed in both CMT tissues
        and PBMCs, which included 40,000 and 28,000 differentially methylated
        regions (DMRs) associated with 341 and 247 promoters of differentially
        methylated genes (DMGs) in CMT tissues and PBMCs, respectively. Genes
        related to apoptosis and ion transmembrane transport were
        hypermethylated, but cell proliferation and oncogene were hypomethylated
        in tumor tissues. Gene ontology analysis using DMGs in PBMCs revealed
        significant methylation changes in the subset of immune cells and host
        defense system-related genes, especially chemokine signaling
        pathway-related genes. Moreover, a number of CMT tissue-enriched DMRs
        were identified from the promoter regions of various microRNAs (miRNAs),
        including cfa-mir-96 and cfa-mir-149, which were reported as
        cancer-associated miRNAs in humans. We also identified novel miRNAs
        associated with CMT which can be candidates for new miRNAs associated
        with human BC. This study may provide new insight for a better
        understanding of aberrant methylation associated with both human BC and
        CMT, as well as possible targets for methylation-based BC diagnostic
        markers.
      - >-
        Urine estrogens were measured in 46 women students, ages 15-18, at a
        middle-class high school in Athens and in 40 women of the same age
        residing at one of three orphanages in the same city. The lower
        socioeconomic status (SES) of the latter group was documented by their
        lower mean height (by 5.2 cm) and weight (by 5.3 kg) relative to the
        high school students. Both in follicular and luteal phases of the
        menstrual cycle, the women with lower SES had 50% higher estriol ratios
        (ratio of the concentration of estriol to the sum of the concentrations
        of estrone and estradiol). In luteal specimens the concentration of all
        three major estrogens was higher in the group with low SES than in the
        women in the other group, but the concentration of estriol was most
        increased. There was also an indication of less frequent anovular cycles
        among the women with low SES. These findings are consistent with
        hypotheses linking either the estriol ratio or the frequency of anovular
        cycles to breast cancer risk.
  - source_sentence: Iatrogenic superior vena cava syndrome treated with streptokinase.
    sentences:
      - >-
        The literature tells us that reflection offers a means to evaluate
        practice and to identify learning from our practice experiences. The
        following description of a practice incident will be discussed loosely
        in the light of Rolfe's 'Model of Nursing Praxis' as a means of
        exploring the theoretical exercise of 'reflection' within a proposed
        theoretical framework. It is hoped that the exercise will help to
        achieve some of the suggested positive endpoints of reflection, and
        provide insight and learning on an incident that was particularly
        powerful on both a personal and a professional level.
      - >-
        BACKGROUND: This study reported height prediction and longitudinal
        growth changes in Chinese pediatric patients with acute myeloid leukemia
        (AML) during and after treatment and their associations with outcomes.
        METHODS: Changes in 88 children with AML in percentages according to the
        growth percentile curve for Chinese boys/girls aged 2-18/0-2 years for
        body mass index (BMI), height, and weight from the time of diagnosis to
        2 years off therapy were evaluated. The outcomes of AML were compared
        among patients with different BMI levels. RESULTS: The proportion of
        underweight children (weight < 5th percentile) increased significantly
        from the initial diagnosis to the end of consolidation treatment. The
        proportion of patients with low BMI (BMI < 5th percentile) was highest
        (23.08%) during the consolidation phase, and no children were
        underweight, but 20% were overweight (BMI > 75th percentile) after 2
        years of drug withdrawal. Unhealthy BMI at the initial diagnosis and
        during intensive chemotherapy leads to poorer outcomes. For height, all
        patients were in the range of genetic height predicted based on their
        parents' height at final follow-up. CONCLUSIONS: Physicians should pay
        more attention to the changes in height and weight of children with AML
        at these crucial treatment stages and intervene in time.
      - >-
        The development of an iatrogenic superior vena cava syndrome secondary
        to a thrombosis from an indwelling Hickman catheter in a patient with
        ovarian carcinoma is presented. The patient was treated with a
        combination of streptokinase and heparin with successful and dramatic
        results. Streptokinase appears to be highly effective in the treatment
        of iatrogenic superior vena cava syndrome from Hickman catheters. It
        appears that the Hickman catheter may be safely left in situ
        post-treatment.
  - source_sentence: >-
      Cesarean delivery in a parturient with syringomyelia and worsening
      neurological symptoms.
    sentences:
      - >-
        A parturient presented at 35 weeks' gestation with worsening
        neurological symptoms caused by syringomyelia. She underwent urgent
        cesarean delivery. The etiology and anesthetic considerations for an
        obstetrical patient with syringomyelia are discussed.
      - >-
        Attachment of enterotoxigenic Escherichia coli to the human gut is
        considered an important early step in infection that leads to diarrhea.
        This attachment is mediated by pili, which belong to a limited number of
        serologically distinguishable types. Many of these pili require the
        product of rns, or a closely related gene, for their expression. We have
        located the major promoter for rns and found that although its sequence
        diverges significantly from a sigma-70 promoter consensus sequence, it
        is very strong. Transcription of rns is negatively regulated both at a
        region upstream of this promoter and at a region internal to the rns
        open reading frame. In addition, rns positively regulates its own
        transcription, probably by counteracting these two negative effects.
      - >-
        Purpose: Research exploring how places shape and interact with the lives
        of aging adults must be grounded in the places where aging adults live
        and participate. Combined participatory geospatial and qualitative
        methods have the potential to illuminate the complex processes enacted
        between person and place to create much-needed knowledge in this area.
        The purpose of this scoping review was to identify methods that can be
        used to study person-place relationships among aging adults and their
        neighborhoods by determining the extent and nature of research with
        aging adults that combines qualitative methods with participatory
        geospatial methods. Design and Methods: A systematic search of nine
        databases identified 1,965 articles published from 1995 to late 2015. We
        extracted data and assessed whether the geospatial and qualitative
        methods were supported by a specified methodology, the methods of data
        analysis, and the extent of integration of geospatial and qualitative
        methods. Results: Fifteen studies were included and used the photovoice
        method, global positioning system tracking plus interview, or go-along
        interviews. Most included articles provided sufficient detail about data
        collection methods, yet limited detail about methodologies supporting
        the study designs and/or data analysis. Implications: Approaches that
        combine participatory geospatial and qualitative methods are beginning
        to emerge in the aging literature. By more explicitly grounding studies
        in a methodology, better integrating different types of data during
        analysis, and reflecting on methods as they are applied, these methods
        can be further developed and utilized to provide crucial place-based
        knowledge that can support aging adults' health, well-being, engagement,
        and participation.
  - source_sentence: >-
      Development of an in vitro regeneration system from immature
      inflorescences and CRISPR/Cas9-mediated gene editing in sudangrass.
    sentences:
      - >-
        HIV envelope protein (Env) is the sole target of broadly neutralizing
        antibodies (BNAbs) that are capable of neutralizing diverse strains of
        HIV. While BNAbs develop spontaneously in a subset of HIV-infected
        patients, efforts to design an envelope protein-based immunogen to
        elicit broadly neutralizing antibody responses have so far been
        unsuccessful. It is hypothesized that a primary barrier to eliciting
        BNAbs is the fact that HIV envelope proteins bind poorly to the
        germline-encoded unmutated common ancestor (UCA) precursors to BNAbs. To
        identify variant forms of Env with increased affinities for the UCA
        forms of BNAbs 4E10 and 10E8, which target the Membrane Proximal
        External Region (MPER) of Env, libraries of randomly mutated Env
        variants were expressed in a yeast surface display system and screened
        using fluorescence activated cell sorting for cells displaying variants
        with enhanced abilities to bind the UCA antibodies. Based on analyses of
        individual clones obtained from the screen and on next-generation
        sequencing of sorted libraries, distinct but partially overlapping sets
        of amino acid substitutions conferring enhanced UCA antibody binding
        were identified. These were particularly enriched in substitutions of
        arginine for highly conserved tryptophan residues. The UCA-binding
        variants also generally exhibited enhanced binding to the mature forms
        of anti-MPER antibodies. Mapping of the identified substitutions into
        available structures of Env suggest that they may act by destabilizing
        both the initial pre-fusion conformation and the six-helix bundle
        involved in fusion of the viral and cell membranes, as well as providing
        new or expanded epitopes with increased accessibility for the UCA
        antibodies.
      - >-
        BACKGROUND: Sudangrass (Sorghum sudanense) is a major biomass producer
        for livestock feed and biofuel in many countries. It has a wide range of
        adaptations for growing on marginal lands under biotic and abiotic
        stresses. The immature inflorescence is an explant with high embryogenic
        competence and is frequently used to regenerate different sorghum
        cultivars. Caffeic acid O-methyl transferase (COMT) is a key enzyme in
        the lignin biosynthesis pathway, which limits ruminant digestion of
        forage cell walls and is a crucial barrier in the conversion of plant
        biomass to bioethanol. Genome editing by CRISPR/Cas9-mediated
        mutagenesis without a transgenic footprint will accelerate the
        improvement and facilitate regulatory approval and commercialization of
        biotech crops. METHODS AND RESULTS: We report the overcome of the
        recalcitrance in sudangrass transformation and regeneration in order to
        use genome editing technique. Hence, an efficient regeneration system
        has been established to induce somatic embryogenesis from the immature
        inflorescence of two sudangrass cultivars on four MS-based media
        supplemented with different components. Our results indicate an
        interaction between genotype and medium composition. The combination of
        Giza-1 cultivar and M4 medium produces the maximum frequency of
        embryogenic calli of 80% and subsequent regeneration efficiency of
        22.6%. Precise mutagenesis of the COMT gene is executed using the
        CRISPR/Cas9 system with the potential to reduce lignin content and
        enhance forage and biomass quality in sudangrass. CONCLUSION: A reliable
        regeneration and transformation system has been established for
        sudangrass using immature inflorescence, and the CRISPR/Cas9 system has
        demonstrated a promising technology for genome editing. The outcomes of
        this research will pave the road for further improvement of various
        sorghum genotypes to meet the global demand for food, feed, and
        biofuels, achieving sustainable development goals (SDGs).
      - >-
        The synthesis of an extracellular matrix containing long (approximately
        mm in length) collagen fibrils is fundamental to the normal
        morphogenesis of animal tissues. In this study we have direct evidence
        that fibroblasts synthesise transient early fibril intermediates
        (approximately 1 micrometer in length) that interact by tip-to-tip
        fusion to generate long fibrils seen in older tissues. Examination of
        early collagen fibrils from tendon showed that two types of early
        fibrils occur: unipolar fibrils (with carboxyl (C) and amino (N) ends)
        and bipolar fibrils (with two N-ends). End-to-end fusion requires the
        C-end of a unipolar fibril. Proteoglycans coated the shafts of the
        fibrils but not the tips. In the absence of proteoglycans the fibrils
        aggregated by side-to-side interactions. Therefore, proteoglycans
        promote tip-to-tip fusion and inhibit side-to-side fusion. This
        distribution of proteoglycan along the fibril required co-assembly of
        collagen and proteoglycan prior to fibril assembly. The study showed
        that collagen fibrillogenesis is a hierarchical process that depends on
        the unique structure of unipolar fibrils and a novel function of
        proteoglycans.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer

This is a sentence-transformers model trained on the parquet dataset. It maps sentences & paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 512 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • parquet

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 512, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
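
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask; the helper name mean_pool and the 2-dimensional toy vectors are illustrative and are not taken from the model.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding input so we never divide by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: three real tokens and one padding token, 2 dimensions instead of 512.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [3.0, 4.0]
```

The padding vector [9.0, 9.0] does not influence the result, which is the reason for masking before averaging.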

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/Bioformer-8L-UMLS-Pubmed_PMC-Random_TCE-Epoch-1")
# Run inference
sentences = [
    'Development of an in vitro regeneration system from immature inflorescences and CRISPR/Cas9-mediated gene editing in sudangrass.',
    'BACKGROUND: Sudangrass (Sorghum sudanense) is a major biomass producer for livestock feed and biofuel in many countries. It has a wide range of adaptations for growing on marginal lands under biotic and abiotic stresses. The immature inflorescence is an explant with high embryogenic competence and is frequently used to regenerate different sorghum cultivars. Caffeic acid O-methyl transferase (COMT) is a key enzyme in the lignin biosynthesis pathway, which limits ruminant digestion of forage cell walls and is a crucial barrier in the conversion of plant biomass to bioethanol. Genome editing by CRISPR/Cas9-mediated mutagenesis without a transgenic footprint will accelerate the improvement and facilitate regulatory approval and commercialization of biotech crops. METHODS AND RESULTS: We report the overcome of the recalcitrance in sudangrass transformation and regeneration in order to use genome editing technique. Hence, an efficient regeneration system has been established to induce somatic embryogenesis from the immature inflorescence of two sudangrass cultivars on four MS-based media supplemented with different components. Our results indicate an interaction between genotype and medium composition. The combination of Giza-1 cultivar and M4 medium produces the maximum frequency of embryogenic calli of 80% and subsequent regeneration efficiency of 22.6%. Precise mutagenesis of the COMT gene is executed using the CRISPR/Cas9 system with the potential to reduce lignin content and enhance forage and biomass quality in sudangrass. CONCLUSION: A reliable regeneration and transformation system has been established for sudangrass using immature inflorescence, and the CRISPR/Cas9 system has demonstrated a promising technology for genome editing. The outcomes of this research will pave the road for further improvement of various sorghum genotypes to meet the global demand for food, feed, and biofuels, achieving sustainable development goals (SDGs).',
    'HIV envelope protein (Env) is the sole target of broadly neutralizing antibodies (BNAbs) that are capable of neutralizing diverse strains of HIV. While BNAbs develop spontaneously in a subset of HIV-infected patients, efforts to design an envelope protein-based immunogen to elicit broadly neutralizing antibody responses have so far been unsuccessful. It is hypothesized that a primary barrier to eliciting BNAbs is the fact that HIV envelope proteins bind poorly to the germline-encoded unmutated common ancestor (UCA) precursors to BNAbs. To identify variant forms of Env with increased affinities for the UCA forms of BNAbs 4E10 and 10E8, which target the Membrane Proximal External Region (MPER) of Env, libraries of randomly mutated Env variants were expressed in a yeast surface display system and screened using fluorescence activated cell sorting for cells displaying variants with enhanced abilities to bind the UCA antibodies. Based on analyses of individual clones obtained from the screen and on next-generation sequencing of sorted libraries, distinct but partially overlapping sets of amino acid substitutions conferring enhanced UCA antibody binding were identified. These were particularly enriched in substitutions of arginine for highly conserved tryptophan residues. The UCA-binding variants also generally exhibited enhanced binding to the mature forms of anti-MPER antibodies. Mapping of the identified substitutions into available structures of Env suggest that they may act by destabilizing both the initial pre-fusion conformation and the six-helix bundle involved in fusion of the viral and cell membranes, as well as providing new or expanded epitopes with increased accessibility for the UCA antibodies.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
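
The similarity scores can be used to rank candidate texts against a query. The sketch below shows the ranking step in pure Python, with small hand-written vectors standing in for the output of model.encode; the helpers cosine and rank_by_similarity are illustrative names and are not part of the sentence-transformers API.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, most similar first."""
    scores = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; in practice these would be rows of the embeddings array.
query_vec = [1.0, 0.0, 1.0]
candidate_vecs = [[0.9, 0.1, 1.1], [-1.0, 0.5, 0.0], [0.0, 1.0, 0.0]]
ranking = rank_by_similarity(query_vec, candidate_vecs)
print([index for index, _ in ranking])  # [0, 2, 1]
```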

Training Details

Training Dataset

parquet

  • Dataset: parquet
  • Size: 33,870,508 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean           max
    anchor    string  3 tokens   22.56 tokens   64 tokens
    positive  string  12 tokens  250.53 tokens  512 tokens
  • Samples:
    Sample 1
      anchor: Characteristics of the HIV/AIDS Epidemic among People Aged ≥ 50 Years in China during 2018-2021.
      positive: Objective: This study aimed to determine the current epidemiological status of PLWHA aged ≥ 50 years in China from 2018 to 2021. It also aimed to recommend targeted interventions for the prevention and treatment of HIV/AIDS in elderly patients. Methods: Data on newly reported cases of PLWHA, aged ≥ 50 years in China from 2018 to 2021, were collected using the CRIMS. Trend tests and spatial analyses were also conducted. Results: Between 2018 and 2021, 237,724 HIV/AIDS cases were reported among patients aged ≥ 50 years in China. The main transmission route was heterosexual transmission (91.24%). Commercial heterosexual transmission (CHC) was the primary mode of transmission among males, while non-marital non-CHC ([NMNCHC]; 60.59%) was the prevalent route in women. The proportion of patients with CHC decreased over time ( Z = 67.716, P < 0.01), while that of patients with NMNCHC increased ( Z = 153.05, P < 0.01). The sex ratio varied among the different modes of infection, and it peaked a...
    Sample 2
      anchor: Obstructive sleep apnea syndrome: A frequent and difficult-to-detect complication of radiotherapy for oropharyngeal cancers.
      positive: This pilot study reveals a higher prevalence of obstructive sleep apnea syndrome (OSAS) in patients treated for oropharyngeal squamous cell carcinoma with radiotherapy compared to the general population. OSAS indicators such as the Epworth Sleepiness Scale seem insufficient in the diagnostic approach to OSAS in this population and systematic screenings should be considered.
    Sample 3
      anchor: Two new JK silencing alleles identified by single molecule sequencing with 20-Kb long-reads.
      positive: BACKGROUND: The Kidd blood group gene SLC14A1 and JK02 having c.499A>G, c.588A>G, and c.743C>A (p.Ala248Asp). The two JK alleles identified have not been previously described. Transfection and expression studies indicated that the CHO cells transfected with JK02 having c.743C>A did not express the Jkb and Jk3 antigens. CONCLUSIONS: We identified new JK silencing alleles and their critical SNVs by single-molecule sequencing and the findings were confirmed by transfection and expression studies.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
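
With these parameters, each anchor is scored against every positive in the batch using cosine similarity multiplied by the scale of 20.0, and the loss is the cross-entropy of picking the anchor's own positive, so the other positives act as in-batch negatives. The following is a pure-Python sketch of that computation for illustration only; it is not the library's implementation, and the function names and toy vectors are assumptions.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    per_anchor = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(v - top) for v in logits))
        per_anchor.append(log_norm - logits[i])
    return sum(per_anchor) / len(per_anchor)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each positive matches the other anchor
loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned < loss_swapped)  # True
```

A larger scale sharpens the softmax, so a correctly matched pair drives the loss close to zero while a mismatched pair is penalised heavily.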
    

Evaluation Dataset

parquet

  • Dataset: parquet
  • Size: 33,870,508 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean           max
    anchor    string  3 tokens   22.47 tokens   95 tokens
    positive  string  7 tokens   251.6 tokens   512 tokens
  • Samples:
    Sample 1
      anchor: Causes and Management of Endoscopic Retrograde Cholangiopancreatography-Related Perforation: A Retrospective Study.
      positive: BACKGROUND: Endoscopic retrograde cholangiopancreatography of ERCP-related perforation and conducted a retrospective review. RESULTS: Of the 15 patients, 6 were female and 9 were male, and the mean age was 77.1 years. According to Stapfer's classification, the 15 cases of ERCP-related perforation comprised 3 type I (duodenum), 3 type II (periampullary), 9 type III (distal bile duct or pancreatic duct), and no type IV cases. Fourteen of 15 (92.6%) were diagnosed during ERCP. The main cause of perforation was scope-induced damage, endoscopic sphincterotomy, and instrumentation penetration in type I, II, and III cases, respectively. Four patients with severe abdominal pain and extraluminal fluid collection underwent emergency surgery for repair and drainage. One type III patient with distal bile duct cancer underwent pancreaticoduodenectomy on day 6. Three type III patients with only retroperitoneal gas on computed tomography (CT) performed immediately after ERCP had no symptoms and neede...
    Sample 2
      anchor: Covariance among premating, post-copulatory and viability fitness components in Drosophila melanogaster and their influence on paternity measurement.
      positive: In polyandrous mating systems, male fitness depends on success in premating, post-copulatory and offspring viability episodes of selection. We tracked male success across all of these episodes simultaneously, using transgenic Drosophila melanogaster with ubiquitously expressed green fluorescent protein (that is GFP) in a series of competitive and noncompetitive matings. This approach permitted us to track paternity-specific viability over all life stages and to distinguish true competitive fertilization success from differential early offspring viability. Relationships between episodes of selection were generally not present when paternity was measured in eggs; however, positive correlations between sperm competitive success and offspring viability became significant when paternity was measured in adult offspring. Additionally, we found a significant male × female interaction on hatching success and a lack of repeatability of offspring viability across a focal male's matings, which may...
    Sample 3
      anchor: Strategic partnerships to improve surgical care in the Asia–Pacific region: proceedings
      positive: Emergency and essential surgery is a critical component of universal health coverage. Session three of the three-part virtual meeting series on Strategic Planning to Improve Surgical, Obstetric, Anaesthesia, and Trauma Care in the Asia–Pacific Region focused on strategic partnerships. During this session, a range of partner organisations, including intergovernmental organisations, professional associations, academic and research institutions, non-governmental organisations, and the private sector provided an update on their work in surgical system strengthening in the Asia–Pacific region. Partner organisations could provide technical and implementation support for National Surgical, Obstetric, and Anaesthesia Planning (NSOAP) in a number of areas, including workforce strengthening, capacity building, guideline development, monitoring and evaluation, and service delivery. Participants emphasised the importance of several forms of strategic collaboration: 1) collaboration across the spec...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 128
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • max_steps: 251382
  • log_level: info
  • fp16: True
  • dataloader_num_workers: 16
  • load_best_model_at_end: True
  • resume_from_checkpoint: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: 251382
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: info
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 16
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: True
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0000 1 1.6269 -
0.0040 1000 0.2123 -
0.0080 2000 0.1191 -
0.0119 3000 0.0948 -
0.0159 4000 0.0824 -
0.0199 5000 0.0708 -
0.0239 6000 0.0665 -
0.0278 7000 0.0612 -
0.0318 8000 0.0578 -
0.0358 9000 0.0542 -
0.0398 10000 0.0528 -
0.0438 11000 0.0505 -
0.0477 12000 0.0461 -
0.0517 13000 0.0468 -
0.0557 14000 0.0442 -
0.0597 15000 0.0435 -
0.0636 16000 0.0414 -
0.0676 17000 0.0421 -
0.0716 18000 0.0399 -
0.0756 19000 0.0409 -
0.0796 20000 0.0393 -
0.0835 21000 0.0369 -
0.0875 22000 0.0349 -
0.0915 23000 0.0361 -
0.0955 24000 0.0358 -
0.0994 25000 0.0348 -
0.1034 26000 0.032 -
0.1074 27000 0.0341 -
0.1114 28000 0.0339 -
0.1154 29000 0.0325 -
0.1193 30000 0.0331 -
0.1233 31000 0.0306 -
0.1273 32000 0.0302 -
0.1313 33000 0.0304 -
0.1353 34000 0.0304 -
0.1392 35000 0.0306 -
0.1432 36000 0.0291 -
0.1472 37000 0.0273 -
0.1512 38000 0.0284 -
0.1551 39000 0.0292 -
0.1591 40000 0.0287 -
0.1631 41000 0.0277 -
0.1671 42000 0.0283 -
0.1711 43000 0.0268 -
0.1750 44000 0.027 -
0.1790 45000 0.0268 -
0.1830 46000 0.0259 -
0.1870 47000 0.0257 -
0.1909 48000 0.0252 -
0.1949 49000 0.0257 -
0.1989 50000 0.026 -
0.2029 51000 0.0262 -
0.2069 52000 0.0253 -
0.2108 53000 0.0252 -
0.2148 54000 0.025 -
0.2188 55000 0.0234 -
0.2228 56000 0.0233 -
0.2267 57000 0.0239 -
0.2307 58000 0.023 -
0.2347 59000 0.0246 -
0.2387 60000 0.0232 -
0.2427 61000 0.0244 -
0.2466 62000 0.0238 -
0.2506 63000 0.0231 -
0.2546 64000 0.0231 -
0.2586 65000 0.0226 -
0.2625 66000 0.0233 -
0.2665 67000 0.022 -
0.2705 68000 0.0222 -
0.2745 69000 0.0227 -
0.2785 70000 0.0232 -
0.2824 71000 0.0221 -
0.2864 72000 0.0223 -
0.2904 73000 0.0224 -
0.2944 74000 0.0218 -
0.2983 75000 0.0216 -
0.3023 76000 0.0213 -
0.3063 77000 0.0206 -
0.3103 78000 0.0214 -
0.3143 79000 0.0215 -
0.3182 80000 0.022 -
0.3222 81000 0.0209 -
0.3262 82000 0.0211 -
0.3302 83000 0.0215 -
0.3342 84000 0.0205 -
0.3381 85000 0.0201 -
0.3421 86000 0.0198 -
0.3461 87000 0.0208 -
0.3501 88000 0.0206 -
0.3540 89000 0.0193 -
0.3580 90000 0.0217 -
0.3620 91000 0.0197 -
0.3660 92000 0.0206 -
0.3700 93000 0.0193 -
0.3739 94000 0.019 -
0.3779 95000 0.0197 -
0.3819 96000 0.02 -
0.3859 97000 0.0176 -
0.3898 98000 0.0198 -
0.3938 99000 0.0186 -
0.3978 100000 0.0191 -
0.4018 101000 0.0187 -
0.4058 102000 0.0192 -
0.4097 103000 0.0183 -
0.4137 104000 0.0192 -
0.4177 105000 0.019 -
0.4217 106000 0.0179 -
0.4256 107000 0.0195 -
0.4296 108000 0.0183 -
0.4336 109000 0.018 -
0.4376 110000 0.0187 -
0.4416 111000 0.0178 -
0.4455 112000 0.0178 -
0.4495 113000 0.0181 -
0.4535 114000 0.0176 -
0.4575 115000 0.0189 -
0.4614 116000 0.0181 -
0.4654 117000 0.0185 -
0.4694 118000 0.0178 -
0.4734 119000 0.0183 -
0.4774 120000 0.0171 -
0.4813 121000 0.0164 -
0.4853 122000 0.0177 -
0.4893 123000 0.0184 -
0.4933 124000 0.0169 -
0.4972 125000 0.017 -
0.5012 126000 0.0174 -
0.5052 127000 0.0175 -
0.5092 128000 0.0167 -
0.5132 129000 0.0178 -
0.5171 130000 0.018 -
0.5211 131000 0.0175 -
0.5251 132000 0.0174 -
0.5291 133000 0.0176 -
0.5331 134000 0.0179 -
0.5370 135000 0.0171 -
0.5410 136000 0.0175 -
0.5450 137000 0.0175 -
0.5490 138000 0.0166 -
0.5529 139000 0.0168 -
0.5569 140000 0.0164 -
0.5609 141000 0.0163 -
0.5649 142000 0.0161 -
0.5689 143000 0.0169 -
0.5728 144000 0.0162 -
0.5768 145000 0.0171 -
0.5808 146000 0.0163 -
0.5848 147000 0.0163 -
0.5887 148000 0.0163 -
0.5927 149000 0.0164 -
0.5967 150000 0.0159 -
0.6007 151000 0.0164 -
0.6047 152000 0.0167 -
0.6086 153000 0.0167 -
0.6126 154000 0.0166 -
0.6166 155000 0.0157 -
0.6206 156000 0.0162 -
0.6245 157000 0.0164 -
0.6285 158000 0.0164 -
0.6325 159000 0.016 -
0.6365 160000 0.0162 -
0.6405 161000 0.0154 -
0.6444 162000 0.015 -
0.6484 163000 0.0158 -
0.6524 164000 0.0157 -
0.6564 165000 0.0165 -
0.6603 166000 0.0149 -
0.6643 167000 0.0159 -
0.6683 168000 0.0154 -
0.6723 169000 0.0156 -
0.6763 170000 0.0153 -
0.6802 171000 0.0155 -
0.6842 172000 0.0158 -
0.6882 173000 0.0144 -
0.6922 174000 0.0154 -
0.6961 175000 0.0153 -
0.7001 176000 0.0149 -
0.7041 177000 0.0152 -
0.7081 178000 0.0157 -
0.7121 179000 0.0148 -
0.7160 180000 0.0146 -
0.7200 181000 0.0152 -
0.7240 182000 0.0151 -
0.7280 183000 0.0159 -
0.7320 184000 0.0147 -
0.7359 185000 0.0139 -
0.7399 186000 0.0149 -
0.7439 187000 0.0143 -
0.7479 188000 0.0145 -
0.7518 189000 0.0154 -
0.7558 190000 0.0151 -
0.7598 191000 0.0155 -
0.7638 192000 0.016 -
0.7678 193000 0.0139 -
0.7717 194000 0.0154 -
0.7757 195000 0.0138 -
0.7797 196000 0.0147 -
0.7837 197000 0.0152 -
0.7876 198000 0.0141 -
0.7916 199000 0.0142 -
0.7956 200000 0.0149 -
0.7996 201000 0.0142 -
0.8036 202000 0.015 -
0.8075 203000 0.0142 -
0.8115 204000 0.0152 -
0.8155 205000 0.0142 -
0.8195 206000 0.0141 -
0.8234 207000 0.0146 -
0.8274 208000 0.014 -
0.8314 209000 0.0146 -
0.8354 210000 0.0138 -
0.8394 211000 0.0141 -
0.8433 212000 0.0143 -
0.8473 213000 0.0139 -
0.8513 214000 0.0138 -
0.8553 215000 0.0146 -
0.8592 216000 0.014 -
0.8632 217000 0.0138 -
0.8672 218000 0.0143 -
0.8712 219000 0.0151 -
0.8752 220000 0.0146 -
0.8791 221000 0.0135 -
0.8831 222000 0.0136 -
0.8871 223000 0.0139 -
0.8911 224000 0.0136 -
0.8950 225000 0.0142 -
0.8990 226000 0.0134 -
0.9030 227000 0.0143 -
0.9070 228000 0.0142 -
0.9110 229000 0.0142 -
0.9149 230000 0.0138 -
0.9189 231000 0.0136 -
0.9229 232000 0.0138 -
0.9269 233000 0.0144 -
0.9309 234000 0.0137 -
0.9348 235000 0.0135 -
0.9388 236000 0.014 -
0.9428 237000 0.014 -
0.9468 238000 0.0136 -
0.9507 239000 0.0134 -
0.9547 240000 0.0144 -
0.9587 241000 0.0136 -
0.9627 242000 0.014 -
0.9667 243000 0.0138 -
0.9706 244000 0.0133 -
0.9746 245000 0.0142 -
0.9786 246000 0.0135 -
0.9826 247000 0.013 -
0.9865 248000 0.0138 -
0.9905 249000 0.0146 -
0.9945 250000 0.0142 -
0.9985 251000 0.0134 -
1.0000 251382 - 0.0013
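
The rows above follow a fixed four-column layout (epoch, step, training loss, validation loss), with `-` marking a value that was not logged at that step. A small parser such as the following sketch could load them for plotting or inspection. The names `LogRow`, `parse_row`, and `SAMPLE` are illustrative assumptions, and the sample rows are copied from the table.

```python
from typing import NamedTuple, Optional


class LogRow(NamedTuple):
    epoch: float
    step: int
    training_loss: Optional[float]
    validation_loss: Optional[float]


def _maybe_float(cell: str) -> Optional[float]:
    # "-" in the log means the value was not recorded at this step.
    return None if cell == "-" else float(cell)


def parse_row(line: str) -> LogRow:
    """Split one whitespace-separated log row into typed fields."""
    epoch, step, train, val = line.split()
    return LogRow(float(epoch), int(step), _maybe_float(train), _maybe_float(val))


# A few rows copied verbatim from the table above.
SAMPLE = """\
0.0000 1 1.6269 -
0.0040 1000 0.2123 -
0.4972 125000 0.017 -
0.9985 251000 0.0134 -
1.0000 251382 - 0.0013
"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]
train_rows = [r for r in rows if r.training_loss is not None]
print("first logged training loss:", train_rows[0].training_loss)
print("last logged training loss:", train_rows[-1].training_loss)
print("validation loss at final step:", rows[-1].validation_loss)
```

Keeping missing values as `None` rather than `0.0` avoids distorting averages or plots of either loss column.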

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.2
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
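
To compare a local environment against the versions listed above, something like the following sketch may help. It relies only on the standard-library `importlib.metadata`. The distribution names in `EXPECTED` (for example `sentence-transformers` and `torch`) are assumed to be the usual PyPI names, and the helper names are hypothetical. The PyTorch version string can differ by build suffix (such as `+cu124`) depending on how it was installed.

```python
from importlib import metadata

# Versions copied from the list above; keys are assumed PyPI distribution names.
EXPECTED = {
    "sentence-transformers": "3.4.1",
    "transformers": "4.48.2",
    "torch": "2.6.0+cu124",
    "accelerate": "1.3.0",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}


def installed_versions(packages):
    """Return {name: version or None} for each requested distribution."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


def mismatches(expected, found):
    """Return {name: (expected, found)} for every version that differs."""
    return {
        name: (version, found.get(name))
        for name, version in expected.items()
        if found.get(name) != version
    }


print(mismatches(EXPECTED, installed_versions(EXPECTED)))
```

An empty dictionary means every listed package matches exactly. Small differences do not necessarily prevent the model from loading.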

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}