gte base trained on AllNLI triplets

This is a sentence-transformers model finetuned from Alibaba-NLP/gte-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Alibaba-NLP/gte-base-en-v1.5
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
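
The same two-module layout (a Transformer encoder followed by CLS-token pooling) could, as a rough sketch, be assembled by hand from the base model. The trust_remote_code arguments are an assumption here, since gte-base-en-v1.5 ships a custom "NewModel" implementation; for normal use, load the finetuned checkpoint directly as shown in the Usage section below.

from sentence_transformers import SentenceTransformer, models

# Rough sketch only: assembling an equivalent (untrained) architecture from the base model.
# trust_remote_code is assumed to be required because the base model uses a custom "NewModel" class.
transformer = models.Transformer(
    "Alibaba-NLP/gte-base-en-v1.5",
    max_seq_length=8192,
    model_args={"trust_remote_code": True},
    config_args={"trust_remote_code": True},
)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 768
    pooling_mode="cls",  # CLS-token pooling, matching the configuration above
)
model = SentenceTransformer(modules=[transformer, pooling])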

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("arulpm/ipbgpt")
# Run inference
sentences = [
    'Judul: Formulasi Surfaktan Metil Ester Sulfonat sebagai Oil Well Cleaning\nAbstrak: Oil productivity reduction may be due to plugging in the oil rock formations. The plugging may be caused by the deposition of paraffin, asphaltene, and scale. Problem caused by the presence of the precipitate is the rock formation can be oil wet so that oil permeability decreases. The problem can be solved by well cleaning technique with surfactant formula. Surfactant MES is a type of anionic surfactant which has ability to lower the interfcial tension, surface tension, and able to change the properties of rock from oil wet to become water wet. Surfactant MES formula for well cleaning requires carrier agent. In this study, diesel oil and metil ester were used as carrying agent. Aromatic solvents were also needed. Xylene and toluene has ability to dissolve asphaltene that deposites in formation. Surfaktan formulation for well cleaning was done with several stages, those are determine the SMES concentration and aromatic solvents concentration. Surfactant performance tests for oil well cleaning were thermal stability, phase behavior, and wettability. The surfactant formula which gave the best performance was SMES 3% in metil ester carrying agent with xylene 15% as additive.\nKeyword: methyl sulfonic esters, oil well cleaning, Asphaltene',
    'Judul: Formulasi Surfaktan SMES sebagai Acid Stimulation Agent untuk Aplikasi di Lapangan Karbonat OK\nAbstrak: Methyl Sulfonic Esters (MES) is one type of anionic surfactants which have advantages in terms of its hardness, resistance to deterjensi, the character of renewable and environmentally friendly. Excess MES this can be utilized as stimulation agent in oil wells, so can increase productivity an oil well. Increased productivity an oil well done by means of cleaning oil wells and pore a reservoir fromsediment of scale formed, enlarging the pores of rocks and can changing the nature of rocks being water-wet. This research was carried out to obtain the formula of solution of surfactants-based MES that can be applied as acid stimulation agent that is one method of IOR. Formula tested is a combination of surfactants sodium MES, HCl, and CH3COOH. The formulation is done by determining the optimum concentration of surfactant SMES and HCl gradually. The best results obtained from the solution of acid stimulation agent was with value of IFT < 10-2 dyne/cm with solubility of rock reaches 36%, and was can to change the contact angle of the reservoir rocks of the contact angle number 420 became 680 in formula SMES 6% + HCl 7% and CH3COOH 2%.\nKeyword: acid well stimulation, IOR, IFT, Sodium Methyl Sulfonic Esters',
    'Judul: World Journal of Zoology\nAbstrak: A study on daily pattern of male western lowland gorilla (Gorilla gorilla gorilla, Savage & Wyman 1847) had been done at Schmutzer Primate Center, Taman Margasatwa Ragunan Jakarta, Indonesia. The aim of the study was to observe the daily activity pattern of adult male gorilla group without any female in captivity in order to obtain a condition of preparing incoming female gorillas leading to successfull conservation program.\nKeyword: ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
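
As a hedged illustration of the semantic search use case mentioned above, the same embeddings can be compared against a small corpus; the corpus and query strings below are made-up placeholders.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("arulpm/ipbgpt")

# Made-up corpus and query, for illustration only
corpus = [
    "Surfactant formulations can clean asphaltene deposits from oil wells.",
    "Daily activity patterns of captive western lowland gorillas.",
]
query = "How can asphaltene plugging in oil wells be removed?"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode(query)

# Cosine similarity between the query and each corpus entry, shape [1, 2]
scores = model.similarity(query_embedding, corpus_embeddings)
best = scores.argmax().item()
print(corpus[best], scores[0, best].item())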

Evaluation

Metrics

Triplet (all-nli-dev)

Metric              Value
cosine_accuracy     1.0
dot_accuracy        0.0
manhattan_accuracy  1.0
euclidean_accuracy  1.0
max_accuracy        1.0

Triplet (all-nli-test)

Metric              Value
cosine_accuracy     1.0
dot_accuracy        0.0
manhattan_accuracy  1.0
euclidean_accuracy  1.0
max_accuracy        1.0
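
The accuracies above are the kind of numbers produced by the library's TripletEvaluator (an anchor counts as correct when it is closer to its positive than to its negative). A minimal sketch, using made-up triplets rather than the actual AllNLI dev/test splits:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("arulpm/ipbgpt")

# Illustrative placeholder triplets, not the real AllNLI data
evaluator = TripletEvaluator(
    anchors=["A man is eating food."],
    positives=["A man is eating a meal."],
    negatives=["A dog is playing in the garden."],
    name="all-nli-dev",
)
print(evaluator(model))  # e.g. {'all-nli-dev_cosine_accuracy': 1.0, ...}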

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • gradient_accumulation_steps: 2
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
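
A hedged sketch of what a training run with these non-default hyperparameters could look like, using the MultipleNegativesRankingLoss cited at the end of this card; the dataset id (sentence-transformers/all-nli, "triplet" subset) and the output directory are assumptions.

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)
dataset = load_dataset("sentence-transformers/all-nli", "triplet")  # assumed dataset id

args = SentenceTransformerTrainingArguments(
    output_dir="models/gte-base-allnli",        # assumed path
    num_train_epochs=1,
    gradient_accumulation_steps=2,
    warmup_ratio=0.1,
    bf16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoids duplicate texts within a batch
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["dev"],
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()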

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step   Training Loss  Validation Loss  all-nli-dev_max_accuracy  all-nli-test_max_accuracy
0       0      -              -                0.9998                    -
0.0772  2000   0.0402         0.0164           1.0                       -
0.1544  4000   0.0213         0.0135           1.0                       -
0.2316  6000   0.0182         0.0115           1.0                       -
0.3088  8000   0.0150         0.0106           1.0                       -
0.3860  10000  0.0140         0.0094           1.0                       -
0.4632  12000  0.0116         0.0085           1.0                       -
0.5404  14000  0.0097         0.0072           1.0                       -
0.6176  16000  0.0083         0.0056           1.0                       -
0.6948  18000  0.0071         0.0050           1.0                       -
0.7720  20000  0.0066         0.0046           1.0                       -
0.8492  22000  0.0051         0.0034           1.0                       -
0.9264  24000  0.0047         0.0031           1.0                       -
1.0000  25907  -              -                -                         1.0

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.1.0
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1
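
To approximate this environment, the listed library versions can be pinned explicitly (PyTorch 2.4.1 with the cu121 build is usually installed separately from the matching index):

pip install "sentence-transformers==3.1.0" "transformers==4.44.2" "accelerate==0.34.2" "datasets==3.0.0" "tokenizers==0.19.1"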

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}