SentenceTransformer based on intfloat/multilingual-e5-base
This is a sentence-transformers model finetuned from intfloat/multilingual-e5-base on the rozetka_positive_pairs dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: intfloat/multilingual-e5-base
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Dot Product
- Training Dataset:
- rozetka_positive_pairs
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
RZTKSentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
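The Pooling module mean-pools the token embeddings and Normalize L2-normalizes the result, which is why the dot-product similarity used below behaves like cosine similarity. As a minimal sketch of what these modules compute, here is the equivalent computation with plain transformers (illustrative only; it loads the public base checkpoint rather than this fine-tuned model):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/multilingual-e5-base")
encoder = AutoModel.from_pretrained("intfloat/multilingual-e5-base")

batch = tokenizer(["query: мебель для кухни"], return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (1, seq_len, 768)

# (1) Pooling: mean over non-padding tokens (pooling_mode_mean_tokens=True)
mask = batch["attention_mask"].unsqueeze(-1).float()       # (1, seq_len, 1)
pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

# (2) Normalize: L2-normalize so dot product equals cosine similarity
embedding = F.normalize(pooled, p=2, dim=1)                # (1, 768)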
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("rztk/multilingual-e5-base-matryoshka2d-mnr-3")
# Run inference
sentences = [
'query: мебель для кухни',
'passage: Кухня Эко модуль Вытяжка 600 Эверест Ясень Шимо Светлый 60х30х28 см',
'passage: Ключниці кишенькові Karya Гарантія 14 днів Для кого Для жінок Колір Червоний Матеріал Шкіра Країна реєстрації бренда Туреччина Країна-виробник товару Туреччина',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
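Because the model was trained with Matryoshka dimensions [768, 512, 256, 128] (see Training Details below), embeddings can also be truncated to a smaller dimensionality at load time. A short sketch using the truncate_dim argument of SentenceTransformer (the argument is part of the library; the dimension chosen here is just an example):

from sentence_transformers import SentenceTransformer

# Keep only the first 256 of the 768 embedding dimensions
model = SentenceTransformer("rztk/multilingual-e5-base-matryoshka2d-mnr-3", truncate_dim=256)

embeddings = model.encode([
    "query: мебель для кухни",
    "passage: Кухня Эко модуль Вытяжка 600 Эверест Ясень Шимо Светлый 60х30х28 см",
])
print(embeddings.shape)
# (2, 256)

Note that truncation is applied after the Normalize module, so the shortened vectors are no longer exactly unit-length; re-normalizing them before indexing is common practice.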
Training Details
Training Dataset
rozetka_positive_pairs
- Dataset: rozetka_positive_pairs
- Size: 58,620,066 training samples
- Columns: query and text
- Approximate statistics based on the first 1000 samples:

  |         | query | text |
  |---------|-------|------|
  | type    | string | string |
  | details | min: 6 tokens, mean: 11.27 tokens, max: 30 tokens | min: 11 tokens, mean: 59.47 tokens, max: 512 tokens |
- Samples:

  | query | text |
  |-------|------|
  | query: xsiomi 9c скло | passage: Защитные стекла Назначение Для мобильных телефонов Цвет Черный Теги Теги Наличие рамки C рамкой Форм-фактор Плоское Клеевой слой По всей поверхности |
  | query: xsiomi 9c скло | passage: Захисне скло Призначення Для мобільних телефонів Колір Чорний Теги Теги Наявність рамки З рамкою Форм-фактор Плоске Клейовий шар По всій поверхні |
  | query: xsiomi 9c скло | passage: Захисне скло Glass Full Glue для Xiaomi Redmi 9A/9C/10A (Чорний) |
- Loss: sentence_transformers_training.model.matryoshka2d_loss.RZTKMatryoshka2dLoss with these parameters:

  {
      "loss": "RZTKMultipleNegativesRankingLoss",
      "n_layers_per_step": 1,
      "last_layer_weight": 1.0,
      "prior_layers_weight": 1.0,
      "kl_div_weight": 1.0,
      "kl_temperature": 0.3,
      "matryoshka_dims": [768, 512, 256, 128],
      "matryoshka_weights": [1, 1, 1, 1],
      "n_dims_per_step": 1
  }
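RZTKMatryoshka2dLoss is a project-specific loss, but its inner "RZTKMultipleNegativesRankingLoss" follows the standard in-batch-negatives ranking objective: every query in a batch is scored against every passage, and cross-entropy pushes each query's paired passage to the top. A minimal sketch of that inner objective (illustrative, not the project's actual implementation):

import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(query_emb, passage_emb, scale=20.0):
    # Passage i is the positive for query i; all other passages in the
    # batch serve as in-batch negatives.
    scores = query_emb @ passage_emb.T * scale  # (batch, batch) similarity matrix
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

The Matryoshka2d wrapper then applies this objective not only to the full 768-dimensional embeddings but also to their 512-, 256-, and 128-dimensional prefixes (matryoshka_dims), and, per n_layers_per_step, to one intermediate transformer layer per step.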
Evaluation Dataset
rozetka_positive_pairs
- Dataset: rozetka_positive_pairs
- Size: 1,903,728 evaluation samples
- Columns: query and text
- Approximate statistics based on the first 1000 samples:

  |         | query | text |
  |---------|-------|------|
  | type    | string | string |
  | details | min: 6 tokens, mean: 8.36 tokens, max: 16 tokens | min: 8 tokens, mean: 45.68 tokens, max: 365 tokens |
- Samples:

  | query | text |
  |-------|------|
  | query: создаем нейронную сеть | passage: Створюємо нейронну мережу |
  | query: создаем нейронную сеть | passage: Создаем нейронную сеть (1666498) |
  | query: создаем нейронную сеть | passage: Научная и техническая литература Переплет Мягкий |
- Loss: sentence_transformers_training.model.matryoshka2d_loss.RZTKMatryoshka2dLoss with these parameters:

  {
      "loss": "RZTKMultipleNegativesRankingLoss",
      "n_layers_per_step": 1,
      "last_layer_weight": 1.0,
      "prior_layers_weight": 1.0,
      "kl_div_weight": 1.0,
      "kl_temperature": 0.3,
      "matryoshka_dims": [768, 512, 256, 128],
      "matryoshka_weights": [1, 1, 1, 1],
      "n_dims_per_step": 1
  }
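Putting the pieces together, a hedged sketch of the overall training setup in Sentence Transformers v3. The rozetka_positive_pairs dataset and RZTKMatryoshka2dLoss are internal, so a placeholder pair dataset and the library's built-in Matryoshka2dLoss stand in for them:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import Matryoshka2dLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("intfloat/multilingual-e5-base")

# Placeholder positive pairs; the real rozetka_positive_pairs dataset is not public
train_dataset = Dataset.from_dict({
    "query": ["мебель для кухни"],
    "text": ["Кухня Эко модуль Вытяжка 600 Эверест Ясень Шимо Светлый 60х30х28 см"],
})

# The library's Matryoshka2dLoss wraps an MNR loss with matryoshka dimensions,
# analogous to (but not identical with) the custom RZTKMatryoshka2dLoss above
loss = Matryoshka2dLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128],
)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()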
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 88
- per_device_eval_batch_size: 88
- learning_rate: 2e-05
- num_train_epochs: 1.0
- warmup_ratio: 0.1
- bf16: True
- bf16_full_eval: True
- tf32: True
- dataloader_num_workers: 8
- load_best_model_at_end: True
- optim: adafactor
- push_to_hub: True
- hub_model_id: rztk/multilingual-e5-base-matryoshka2d-mnr-3
- hub_private_repo: True
- prompts: {'query': 'query: ', 'text': 'passage: '}
- batch_sampler: no_duplicates
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 88
- per_device_eval_batch_size: 88
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1.0
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: True
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: True
- dataloader_num_workers: 8
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adafactor
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: rztk/multilingual-e5-base-matryoshka2d-mnr-3
- hub_strategy: every_save
- hub_private_repo: True
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: {'query': 'query: ', 'text': 'passage: '}
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- ddp_static_graph: False
- ddp_comm_hook: bf16
- gradient_as_bucket_view: False
- num_proc: 30
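For reference, here is a hedged sketch of how the non-default values above map onto SentenceTransformerTrainingArguments in Sentence Transformers v3 (the output directory is a placeholder; everything else mirrors the listed values):

from sentence_transformers.training_args import (
    SentenceTransformerTrainingArguments,
    BatchSamplers,
)

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder, not recorded in this card
    eval_strategy="steps",
    per_device_train_batch_size=88,
    per_device_eval_batch_size=88,
    learning_rate=2e-05,
    num_train_epochs=1.0,
    warmup_ratio=0.1,
    bf16=True,
    bf16_full_eval=True,
    tf32=True,
    dataloader_num_workers=8,
    load_best_model_at_end=True,
    optim="adafactor",
    push_to_hub=True,
    hub_model_id="rztk/multilingual-e5-base-matryoshka2d-mnr-3",
    hub_private_repo=True,
    prompts={"query": "query: ", "text": "passage: "},
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)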
Training Logs
Click to expand
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0050 | 833 | 4.8404 | - |
0.0100 | 1666 | 4.6439 | - |
0.0150 | 2499 | 4.2238 | - |
0.0200 | 3332 | 3.5445 | - |
0.0250 | 4165 | 2.7514 | - |
0.0300 | 4998 | 2.4037 | - |
0.0350 | 5831 | 2.1916 | - |
0.0400 | 6664 | 2.0938 | - |
0.0450 | 7497 | 1.9268 | - |
0.0500 | 8330 | 1.8671 | - |
0.0550 | 9163 | 1.7069 | - |
0.0600 | 9996 | 1.6419 | - |
0.0650 | 10829 | 1.55 | - |
0.0700 | 11662 | 1.5483 | - |
0.0750 | 12495 | 1.5419 | - |
0.0800 | 13328 | 1.3582 | - |
0.0850 | 14161 | 1.3537 | - |
0.0900 | 14994 | 1.3067 | - |
0.0950 | 15827 | 1.2128 | - |
0.1000 | 16654 | - | 1.0107 |
0.1000 | 16660 | 1.2248 | - |
0.1050 | 17493 | 1.1565 | - |
0.1100 | 18326 | 1.1351 | - |
0.1150 | 19159 | 1.0808 | - |
0.1200 | 19992 | 1.0561 | - |
0.1250 | 20825 | 1.078 | - |
0.1301 | 21658 | 1.1413 | - |
0.1351 | 22491 | 1.0446 | - |
0.1401 | 23324 | 0.9986 | - |
0.1451 | 24157 | 0.9668 | - |
0.1501 | 24990 | 0.9753 | - |
0.1551 | 25823 | 1.0031 | - |
0.1601 | 26656 | 0.9688 | - |
0.1651 | 27489 | 0.9262 | - |
0.1701 | 28322 | 0.9702 | - |
0.1751 | 29155 | 0.9082 | - |
0.1801 | 29988 | 0.9264 | - |
0.1851 | 30821 | 0.8526 | - |
0.1901 | 31654 | 0.9667 | - |
0.1951 | 32487 | 0.9421 | - |
0.2000 | 33308 | - | 0.6416 |
0.2001 | 33320 | 0.9216 | - |
0.2051 | 34153 | 0.95 | - |
0.2101 | 34986 | 0.8895 | - |
0.2151 | 35819 | 0.8349 | - |
0.2201 | 36652 | 0.8628 | - |
0.2251 | 37485 | 0.8729 | - |
0.2301 | 38318 | 0.9285 | - |
0.2351 | 39151 | 0.8718 | - |
0.2401 | 39984 | 0.8792 | - |
0.2451 | 40817 | 0.8852 | - |
0.2501 | 41650 | 0.877 | - |
0.2551 | 42483 | 0.8325 | - |
0.2601 | 43316 | 0.8446 | - |
0.2651 | 44149 | 0.812 | - |
0.2701 | 44982 | 0.8246 | - |
0.2751 | 45815 | 0.8086 | - |
0.2801 | 46648 | 0.8553 | - |
0.2851 | 47481 | 0.8506 | - |
0.2901 | 48314 | 0.834 | - |
0.2951 | 49147 | 0.8313 | - |
0.3000 | 49962 | - | 0.5377 |
0.3001 | 49980 | 0.8376 | - |
0.3051 | 50813 | 0.7836 | - |
0.3101 | 51646 | 0.8089 | - |
0.3151 | 52479 | 0.8065 | - |
0.3201 | 53312 | 0.8284 | - |
0.3251 | 54145 | 0.7959 | - |
0.3301 | 54978 | 0.8332 | - |
0.3351 | 55811 | 0.7924 | - |
0.3401 | 56644 | 0.8171 | - |
0.3451 | 57477 | 0.7924 | - |
0.3501 | 58310 | 0.7977 | - |
0.3551 | 59143 | 0.7729 | - |
0.3601 | 59976 | 0.7617 | - |
0.3651 | 60809 | 0.8211 | - |
0.3701 | 61642 | 0.8497 | - |
0.3751 | 62475 | 0.8218 | - |
0.3802 | 63308 | 0.7846 | - |
0.3852 | 64141 | 0.7876 | - |
0.3902 | 64974 | 0.7912 | - |
0.3952 | 65807 | 0.7977 | - |
0.4000 | 66616 | - | 0.4974 |
0.4002 | 66640 | 0.8096 | - |
0.4052 | 67473 | 0.8356 | - |
0.4102 | 68306 | 0.788 | - |
0.4152 | 69139 | 0.7683 | - |
0.4202 | 69972 | 0.7358 | - |
0.4252 | 70805 | 0.7634 | - |
0.4302 | 71638 | 0.7535 | - |
0.4352 | 72471 | 0.756 | - |
0.4402 | 73304 | 0.7633 | - |
0.4452 | 74137 | 0.7509 | - |
0.4502 | 74970 | 0.7547 | - |
0.4552 | 75803 | 0.7539 | - |
0.4602 | 76636 | 0.7608 | - |
0.4652 | 77469 | 0.8262 | - |
0.4702 | 78302 | 0.8076 | - |
0.4752 | 79135 | 0.8179 | - |
0.4802 | 79968 | 0.7709 | - |
0.4852 | 80801 | 0.744 | - |
0.4902 | 81634 | 0.7846 | - |
0.4952 | 82467 | 0.7473 | - |
0.5000 | 83270 | - | 0.4776 |
0.5002 | 83300 | 0.7759 | - |
0.5052 | 84133 | 0.755 | - |
0.5102 | 84966 | 0.7308 | - |
0.5152 | 85799 | 0.7256 | - |
0.5202 | 86632 | 0.7703 | - |
0.5252 | 87465 | 0.7823 | - |
0.5302 | 88298 | 0.8109 | - |
0.5352 | 89131 | 0.7795 | - |
0.5402 | 89964 | 0.7833 | - |
0.5452 | 90797 | 0.7752 | - |
0.5502 | 91630 | 0.7975 | - |
0.5552 | 92463 | 0.7863 | - |
0.5602 | 93296 | 0.7337 | - |
0.5652 | 94129 | 0.7755 | - |
0.5702 | 94962 | 0.7928 | - |
0.5752 | 95795 | 0.7604 | - |
0.5802 | 96628 | 0.7983 | - |
0.5852 | 97461 | 0.7665 | - |
0.5902 | 98294 | 0.7749 | - |
0.5952 | 99127 | 0.7838 | - |
0.6000 | 99924 | - | 0.4669 |
0.6002 | 99960 | 0.7727 | - |
0.6052 | 100793 | 0.8049 | - |
0.6102 | 101626 | 0.7857 | - |
0.6152 | 102459 | 0.7622 | - |
0.6202 | 103292 | 0.8117 | - |
0.6252 | 104125 | 0.7711 | - |
0.6302 | 104958 | 0.7892 | - |
0.6353 | 105791 | 0.7938 | - |
0.6403 | 106624 | 0.728 | - |
0.6453 | 107457 | 0.7693 | - |
0.6503 | 108290 | 0.7875 | - |
0.6553 | 109123 | 0.7958 | - |
0.6603 | 109956 | 0.749 | - |
0.6653 | 110789 | 0.7788 | - |
0.6703 | 111622 | 0.7614 | - |
0.6753 | 112455 | 0.7577 | - |
0.6803 | 113288 | 0.7805 | - |
0.6853 | 114121 | 0.7677 | - |
0.6903 | 114954 | 0.7458 | - |
0.6953 | 115787 | 0.7962 | - |
0.7000 | 116578 | - | 0.4641 |
0.7003 | 116620 | 0.7275 | - |
0.7053 | 117453 | 0.7778 | - |
0.7103 | 118286 | 0.7885 | - |
0.7153 | 119119 | 0.8046 | - |
0.7203 | 119952 | 0.8222 | - |
0.7253 | 120785 | 0.7714 | - |
0.7303 | 121618 | 0.7983 | - |
0.7353 | 122451 | 0.7359 | - |
0.7403 | 123284 | 0.7618 | - |
0.7453 | 124117 | 0.783 | - |
0.7503 | 124950 | 0.763 | - |
0.7553 | 125783 | 0.809 | - |
0.7603 | 126616 | 0.794 | - |
0.7653 | 127449 | 0.7366 | - |
0.7703 | 128282 | 0.776 | - |
0.7753 | 129115 | 0.8053 | - |
0.7803 | 129948 | 0.7941 | - |
0.7853 | 130781 | 0.7722 | - |
0.7903 | 131614 | 0.7959 | - |
0.7953 | 132447 | 0.8061 | - |
0.8000 | 133232 | - | 0.4468 |
Framework Versions
- Python: 3.11.10
- Sentence Transformers: 3.3.0
- Transformers: 4.46.3
- PyTorch: 2.5.1+cu124
- Accelerate: 1.1.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Evaluation Results
All metrics are self-reported, measured on the internal "bm full" evaluation set:

| Metric | Value |
|--------|-------|
| Dot Accuracy@1 | 0.483 |
| Dot Accuracy@3 | 0.648 |
| Dot Accuracy@5 | 0.736 |
| Dot Accuracy@10 | 0.817 |
| Dot Precision@1 | 0.483 |
| Dot Precision@3 | 0.489 |
| Dot Precision@5 | 0.495 |
| Dot Precision@10 | 0.487 |
| Dot Recall@1 | 0.012 |
| Dot Recall@3 | 0.036 |
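These figures come from a retrieval evaluation; since the benchmark itself is not public, here is a hedged sketch of how such Accuracy@k / Precision@k / Recall@k numbers can be produced with the library's InformationRetrievalEvaluator (the queries, corpus, and relevance judgments below are illustrative placeholders):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("rztk/multilingual-e5-base-matryoshka2d-mnr-3")

# Placeholder benchmark data standing in for the internal "bm full" set
queries = {"q1": "query: мебель для кухни"}
corpus = {"d1": "passage: Кухня Эко модуль Вытяжка 600 Эверест Ясень Шимо Светлый 60х30х28 см"}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    accuracy_at_k=[1, 3, 5, 10],
    precision_recall_at_k=[1, 3, 5, 10],
    name="bm-full-sketch",
)
print(evaluator(model))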