SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/all-distilroberta-v1. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: sentence-transformers/all-distilroberta-v1
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
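
The Pooling and Normalize stages listed above can be illustrated with a minimal sketch in plain Python. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration indicates; the toy vectors, the 4-dimensional size, and the function names are illustrative only, and the library itself operates on batched tensors of width 768.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: 3 token positions, 4 dimensions; the last position is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings reduces to their dot product.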

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("aisuko/distilroberta-pubmed-embeddings")
# Run inference
sentences = [
    'Is [ Diaphragmatic response influenced by previous muscle activity ]?',
    'Previous muscle activity can alter muscle contractility and lead to strength underestimation or overestimation in functional measurements. The objective of this study was to evaluate changes in the maximum pressure produced by the diaphragm after different series of spontaneous near-to-maximal isometric contractions. Duplicate studies were performed on 6 dogs with a mean (SD) weight of 26 (7) kg. The supramaximal response of the diaphragm was achieved by simultaneous supramaximal stimulation of both phrenic nerves, both under basal conditions and after series of 5, 10, and 20 spontaneous inspiratory efforts against the occluded airway, performed before and after spinal anesthesia (which eliminates the ventilatory contribution of the intercostal muscles). The response was measured using the twitch gastric pressure (Pga) and twitch esophageal pressure (Pes) and by muscle shortening (sonomicrometry). The short series of 5 inspiratory efforts and, in particular, the medium series of 10 efforts produced potentiation of the contractile response, with a rise in the Pga from 3.2 (0.4) cm H(2)O to 3.7 (0.3) cm H(2)O, and from 3.5 (0.3) cm H(2)O to 3.9 (0.3) cm H(2)O, respectively (P=.05 in both cases). The potentiation was somewhat greater after subarachnoid anesthesia (an increase in the Pga of 21% after the medium series of 10 efforts with anesthesia vs 11% without anesthesia). However, the long series of 20 efforts produced a fall in the response, with a decrease in the Pga from 3.2 (0.4) cm H(2)O to 2.5 (0.3) cm H(2)O (P< .05), probably due to fatigue overcoming the effect of potentiation.',
    'Recent studies have revealed aquaporins (AQPs) as targets for novel anti-tumor therapy since they are likely to play a role in carcinogenesis, tumor progression and invasion. Accordingly, we analyzed the prognostic impact of AQP3 expression and polymorphisms in a number of patients with early breast cancer (EBC). AQP3 expression was investigated on the basis of the immunohistochemistry of tissue microarray specimens from 447 EBC patients who underwent surgery between 2003 and 2008. We scored the staining intensity (0 through 3) and percentage of positive tumor cells (0 through 4); the staining score was defined as sum of these scores used to categorize the AQP3 expression as negative (0 through 2), weak (3 through 5) or strong (6 or more). For AQP3 polymorphisms, seven single nucleotide polymorphisms (SNPs) (rs10813981, rs34391490, rs2228332, rs2227285, rs591810, rs17553719 and rs3860987) were selected using in silico analysis and genotyped using the Sequenom MassARRAY. A total of 180 (40.3%) patients were identified as AQP3-positive (staining score >2), including 86 (19.2%) cases of strong expression (stating score >5). In a univariate analysis, AQP3 expression was significantly associated with survival for the patients with HER2-over-expressing EBC. Moreover, a multivariate survival analysis revealed that AQP3 expression was an independent prognostic marker of disease-free survival (DFS): hazard ratio (HR)=3.137, 95% confidence interval (CI)=1.079-9.125, p=0.036; distant DFS (DDFS): HR=2.784, 95%CI=0.921-8.414, p=0.070, for the HER2-over-expressing EBC patients. Meanwhile, none of selected AQP3 polymorphisms were related to AQP3 expression in tumor tissue or survival in the current study.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
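
In the example above, the first entry is a question and the other two are passages, so row 0 of the similarity matrix holds the question-to-passage scores. The sketch below shows one way such a row could be turned into a ranking. The helper name `rank_by_similarity` and the scores in `toy_row` are hypothetical, not model output; with the real model the row could be obtained with something like `similarities[0].tolist()`.

```python
def rank_by_similarity(similarity_row, query_index=0):
    """Return (index, score) pairs for every non-query item, best match first."""
    candidates = [
        (index, score)
        for index, score in enumerate(similarity_row)
        if index != query_index
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Hypothetical similarity row for the question at index 0 (illustrative values).
toy_row = [1.0, 0.62, 0.08]
ranking = rank_by_similarity(toy_row)
print(ranking)
```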

Evaluation

Metrics

Triplet

Dataset: ai-pubmbed-validation
Evaluated with TripletEvaluator

Metric	Value
cosine_accuracy	1.0
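
As a rough sketch of what a triplet cosine accuracy measures: a triplet counts as correct when the anchor is more similar to its positive than to its negative, and the metric is the fraction of correct triplets. The code below is a plain-Python illustration under that assumption, with made-up 2-dimensional vectors; it is not the TripletEvaluator implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def triplet_cosine_accuracy(anchors, positives, negatives):
    """Fraction of triplets where the positive outscores the negative."""
    correct = 0
    for anchor, positive, negative in zip(anchors, positives, negatives):
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative):
            correct += 1
    return correct / len(anchors)

# Toy triplets: the first is ranked correctly, the second is not.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.2, 0.8]]
negatives = [[0.1, 0.9], [0.1, 0.9]]

accuracy = triplet_cosine_accuracy(anchors, positives, negatives)
print(accuracy)
```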

Training Details

Training Dataset

Unnamed Dataset

Size: 1,000 training samples
Columns: instruction, context, and context_neg

Approximate statistics based on the first 1000 samples:

	instruction	context	context_neg
type	string	string	string
details	min: 11 tokens, mean: 26.36 tokens, max: 67 tokens	min: 31 tokens, mean: 320.03 tokens, max: 512 tokens	min: 82 tokens, mean: 321.22 tokens, max: 512 tokens

Samples:

instruction	context	context_neg
`Do competency assessment of primary care physicians as part of a peer review program?`	To design and test a program that assesses clinical competence as a second stage in a peer review process and to determine the program's reliability. A three-cohort study of Ontario primary care physicians. Reference physicians (n = 26) randomly drawn from the Hamilton, Ontario, area; volunteer, self-referred physicians (n = 20); and physicians referred by the licensing body (n = 37) as a result of a disciplinary hearing or peer review. Standardized patients, structured oral examinations, chart-stimulated recall, objective structured clinical examination, and multiple-choice examination. Test reliability was high, ranging from 0.73 to 0.91, and all tests discriminated among subgroups. Demographic variables relating to the final category were age, Canadian or foreign graduates, and whether or not participants were certified in family medicine.	Static stretch is frequently observed in the lung. Both static stretch and cyclic stretch can induce cell death and Na(+)/K(+)-ATPase trafficking, but stretch-induced alveolar epithelial cell (AEC) functions are much less responsive to static than to cyclic stretch. AEC remodeling under static stretch may be partly explained. The aim of this study was to explore the AEC remodeling and functional changes under static stretch conditions. We used A549 cells as a model of AEC type II cells. We assessed F-actin content and cell viability by fluorescence staining at various static-stretch magnitudes and time points. Specifically, we used scanning electron microscopy to explore the possible biological mechanisms used by A549 cells to 'escape' static-stretch-induced injury. Finally, we measured choline cytidylyltransferase-alpha (CCT alpha) mRNA and protein by real-time PCR and Western blot to evaluate cellular secretory function. The results showed that the magnitude of static stretch was the...
`Is age an important determinant of the growth hormone response to sprint exercise in non-obese young men?`	The factors that regulate the growth hormone (GH) response to physiological stimuli, such as exercise, are not fully understood. The aim of the present study is to determine whether age, body composition, measures of sprint performance or the metabolic response to a sprint are predictors of the GH response to sprint exercise in non-obese young men. Twenty-seven healthy, non-obese males aged 18-32 years performed an all-out 30-second sprint on a cycle ergometer. Univariate linear regression analysis was employed to evaluate age-, BMI-, performance- and metabolic-dependent changes from pre-exercise to peak GH and integrated GH for 60 min after the sprint. GH was elevated following the sprint (change in GH: 17.0 +/- 14.2 microg l(-1); integrated GH: 662 +/- 582 min microg l(-1)). Performance characteristics, the metabolic response to exercise and BMI were not significant predictors of the GH response to exercise. However, age emerged as a significant predictor of both integrated GH (beta ...	We have previously reported the crucial roles of oncogenic Kirsten rat sarcoma viral oncogene homolog (KRAS) in inhibiting apoptosis and disrupting cell polarity via the regulation of phosphodiesterase 4 (PDE4) expression in human colorectal cancer HCT116 cells in three-dimensional cultures (3DC). Herein we evaluated the effects of resveratrol, a PDE4 inhibitor, on the luminal cavity formation and the induction of apoptosis in HCT116 cells. Apoptosis was detected by immunofluorescence using confocal laser scanning microscopy with an antibody against cleaved caspase-3 in HCT116 cells treated with or without resveratrol in a two-dimensional culture (2DC) or 3DC. Resveratrol did not induce apoptosis of HCT116 cells in 2DC, whereas the number of apoptotic HCT116 cells increased after resveratrol treatment in 3DC, leading to formation of a luminal cavity.
`Is terlipressin more effective in decreasing variceal pressure than portal pressure in cirrhotic patients?`	Terlipressin decreases portal pressure. However, its effects on variceal pressure have been poorly investigated. This study investigated the variceal, splanchnic and systemic hemodynamic effects of terlipressin. Twenty cirrhotic patients with esophageal varices grade II-III, and portal pressure > or =12 mmHg were studied. Hepatic venous pressure gradient, variceal pressure and systemic hemodynamic parameters were obtained. After baseline measurements, in a double-blind administration, 14 patients received a 2mg/iv injection of terlipressin and six patients received placebo. The same measurements were repeated 60 min later. No demographic or biochemical differences were observed in basal condition between groups. Terlipressin produced significant decreases in intravariceal pressure from 20.9+4.9 to 16.3+/-4.7 mmHg (p<0.01, -21+/- 16%), variceal pressure gradient from 18.9+/-4.8 to 13.5+/-6.0 mmHg (p<0.01, -28+/-27%), estimated variceal wall tension from 78+/-29 to 59+/-31 mmHg x mm (p<0...	Based on the theories of brain reserve and cognitive reserve, we investigated whether larger maximal lifetime brain growth (MLBG) and/or greater lifetime intellectual enrichment protect against cognitive decline over time. Forty patients with multiple sclerosis (MS) underwent baseline and 4.5-year follow-up evaluations of cognitive efficiency (Symbol Digit Modalities Test, Paced Auditory Serial Addition Task) and memory (Selective Reminding Test, Spatial Recall Test). Baseline and follow-up MRIs quantified disease progression: percentage brain volume change (cerebral atrophy), percentage change in T2 lesion volume. MLBG (brain reserve) was estimated with intracranial volume; intellectual enrichment (cognitive reserve) was estimated with vocabulary. 
We performed repeated-measures analyses of covariance to investigate whether larger MLBG and/or greater intellectual enrichment moderate/attenuate cognitive decline over time, controlling for disease progression. Patients with MS declined in...

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
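
To make the two parameters concrete, here is a minimal plain-Python sketch of a multiple-negatives ranking objective, assuming the usual formulation: each anchor is scored against every positive and negative in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the anchor's own positive as the target. The function names and toy vectors are illustrative; the library version works on batched tensors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy over scaled cosine scores with in-batch negatives."""
    # The positive for anchor i sits at index i; everything else is a negative.
    candidates = positives + negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine_similarity(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two triplets (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = multiple_negatives_ranking_loss(anchors, positives, negatives)
shuffled = multiple_negatives_ranking_loss(anchors, positives[::-1], negatives)
print(aligned, shuffled)
```

The loss is close to zero when each anchor sits nearest its own positive and grows when the positives are swapped, which is the behaviour the scale of 20.0 sharpens.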

Evaluation Dataset

Unnamed Dataset

Size: 1,000 evaluation samples
Columns: instruction, context, and context_neg

Approximate statistics based on the first 1000 samples:

	instruction	context	context_neg
type	string	string	string
details	min: 9 tokens, mean: 26.16 tokens, max: 65 tokens	min: 16 tokens, mean: 318.79 tokens, max: 512 tokens	min: 69 tokens, mean: 319.13 tokens, max: 512 tokens

Samples:

instruction	context	context_neg
`Do [ The Relationship Between the Damages of Hand Functions and the Type of Cerebral Palsy in Children ]?`	To investigate the relationship between the damages of hand functions and the type of cerebral palsy (CP) in children with CP. A total of 280 children aged 4-12 years old with CP in the 20 districts of Chengdu were included. The damages of hand functions were assessed with the Chinese Version of Manual Ability Classification System (MACS) and its relationship with the type of CP were analyzed. Among the 280 investigated children, there were 195 chidren with spastic CP, which accounted for the largest proportion (69.64%), wherein the spastic diplegia was most common (56.41%). The classification of MACS was level I-II in 65.13% children with spastic CP, whereas the classification of MACS was level IU-V in 84. 44% and 80.95% children with mixed and dyskinetic CP, respectively. With the increase of the degree of cognitive dysfunction in children with CP, the level of MACS was also increased. There was a difference between the classification of MACS and the different type of CP (P<0.05). Th...	One-third of estrogen (ER+) and/or progesterone receptor-positive (PGR+) breast tumors treated with Tamoxifen (TAM) do not respond to initial treatment, and the remaining 70% are at risk to relapse in the future. Estrogen-related receptor gamma (ESRRG, ERRγ) is an orphan nuclear receptor with broad, structural similarities to classical ER that is widely implicated in the transcriptional regulation of energy homeostasis. We have previously demonstrated that ERRγ induces resistance to TAM in ER+ breast cancer models, and that the receptor's transcriptional activity is modified by activation of the ERK/MAPK pathway. We hypothesize that hyper-activation or over-expression of ERRγ induces a pro-survival transcriptional program that impairs the ability of TAM to inhibit the growth of ER+ breast cancer. 
The goal of the present study is to determine whether ERRγ target genes are associated with reduced distant metastasis-free survival (DMFS) in ER+ breast cancer treated with TAM. Raw gene expr...
`Does tissue-Engineered Skin Substitute enhance Wound Healing after Radiation Therapy?`	When given in conjunction with surgery for treating cancer, radiation therapy may result in impaired wound healing, which, in turn, could cause skin ulcers. In this study, bilayer and monolayer autologous skin substitutes were used to treat an irradiated wound. A single dose of 30 Gy of linear electron beam radiation was applied to the hind limb of nude mice before creating the skin lesion (area of 78.6 mm). Monolayer tissue-engineered skin substitutes (MTESSs) were prepared by entrapping cultured keratinocytes in fibrin matrix, and bilayer tissue-engineered skin substitutes (BTESSs) were prepared by entrapping keratinocytes and fibroblasts in separate layers. Bilayer tissue-engineered skin substitute and MTESS were implanted to the wound area. Gross appearance and wound area were analyzed to evaluate wound healing efficiency. Skin regeneration and morphological appearance were observed via histological and electron microscopy. Protein expressions of transforming growth factor β1 (TGF-...	The genetic basis of Alzheimer's disease (AD) is being analyzed in multiple whole genome association studies (WGAS). The GAB2 gene has been proposed as a modifying factor of APOE epsilon 4 allele in a recent case-control WGAS conducted in the US. Given the potential application of these novel results in AD diagnostics, we decided to make an independent replication to examine the GAB2 gene effect in our series. We are conducting a multicenter population-based study of AD in Spain. We analyzed a total of 1116 Spanish individuals. Specifically, 521 AD patients, 475 controls from the general population and 120 neurologically-normal elderly controls (NNE controls). We have genotyped GAB2 (rs2373115 G/T) and APOE rs429358 (SNP112)/rs7412 (SNP158) polymorphisms using real time-PCR technologies. 
As previously reported in Spain, APOE epsilon 4 allele was strongly associated with AD in our series (OR=2.88 [95% C.I. 2.16- 3.84], p=7.38E-11). Moreover, a large effect for epsilone 4/epsilone 4 geno...
`Does embryonic expression of EphA receptor genes in mice support their candidacy for involvement in cleft lip and palate?`	Eph receptors, comprising the A- and B-subfamilies, are the largest family of receptor tyrosine kinases in the mammalian genome, and their function is critical for morphogenesis in a variety of contexts. Whereas signaling through B-type Ephs has been demonstrated to play a role in cleft lip and palate (CL/P), the involvement of A-type Ephs has not been examined in this context notwithstanding a recent genome-wide association study that identified the EPHA3 locus as a candidate for non-syndromic CL/P. Here, we present a systematic analysis of the gene expression patterns for the nine EphA receptors at progressive stages of mouse development and find that EphA3, EphA4, and EphA7 exhibit restricted overlapping patterns of expression during palate development. We find that homozygous mutation of EphA3 or compound homozygous mutation of EphA3 and EphA4 in mice does not result in defective midfacial development, supporting the possibility of redundant function with EphA7. We also document pr...	Physical therapists and occupational therapists practicing in acute care hospitals play a crucial role in discharge planning. A standardized assessment of patients' function could be useful for discharge recommendations. The study objective was to determine the accuracy of "6-Clicks" basic mobility and daily activity measures for predicting discharge from an acute care hospital to a home or institutional setting. The study was retrospective and observational. "6-Clicks" scores obtained at initial visits by physical therapists or occupational therapists and patients' discharge destinations were used to develop and validate receiver operating characteristic curves for predicting discharge destination. Positive predictive values (PPV), negative predictive values (NPV), and likelihood ratios were calculated. 
Areas under the receiver operating characteristic curves for basic mobility scores were 0.857 (95% confidence interval [CI]=0.852, 0.862) and 0.855 (95% CI=0.850, 0.860) in development...

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 2e-05
num_train_epochs: 1
warmup_ratio: 0.1
batch_sampler: no_duplicates
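
As a worked example of what these settings imply, the sketch below derives the step count and the learning rate over training. It assumes a single device, no gradient accumulation, a partial final batch that is kept, and a linear warmup followed by linear decay to zero. The no_duplicates batch sampler may yield a slightly different number of batches, so the figures are approximate.

```python
import math

train_samples = 1000
batch_size = 16
epochs = 1
base_lr = 2e-05
warmup_ratio = 0.1

steps_per_epoch = math.ceil(train_samples / batch_size)  # 63; the last batch is partial
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)  # 7

def linear_schedule_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)

for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at 2e-05 after 7 steps and reaches zero at step 63.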

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	ai-pubmbed-validation_cosine_accuracy
0	0	1.0

Framework Versions

Python: 3.10.12
Sentence Transformers: 3.3.1
Transformers: 4.47.0
PyTorch: 2.5.1+cu121
Accelerate: 1.2.1
Datasets: 3.3.1
Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
