SentenceTransformer based on neuralmind/bert-large-portuguese-cased

This is a sentence-transformers model finetuned from neuralmind/bert-large-portuguese-cased. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: neuralmind/bert-large-portuguese-cased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
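
For orientation, the printed modules correspond to the base BERT encoder followed by mean pooling over token embeddings. The following is a minimal sketch of how an equivalent model could be assembled by hand with the sentence-transformers modules API; it is illustrative only and is not how this checkpoint was exported:

from sentence_transformers import SentenceTransformer, models

# Base encoder: BERTimbau large, truncating inputs at 512 tokens
word_embedding_model = models.Transformer(
    "neuralmind/bert-large-portuguese-cased", max_seq_length=512
)
# Mean pooling over the 1024-dimensional token embeddings
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
equivalent_model = SentenceTransformer(modules=[word_embedding_model, pooling_model])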

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("SenhorDasMoscas/acho-ptbr-e4-lr0.0001-08-01-2025")
# Run inference
sentences = [
    'cenoura organico onde encontrar',
    'hortifruti',
    'moda acessorio',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
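
The example above pairs a product query with candidate category labels, so a natural extension is ranking several candidate categories for one query. A small illustrative sketch reusing the model loaded above; the query and category list are made up, not taken from the training data:

# Rank hypothetical candidate categories for a query
query = "onde comprar racao para gato"
categories = ["petshop", "hortifruti", "moda acessorio", "farmacia"]

query_embedding = model.encode(query)
category_embeddings = model.encode(categories)

# Cosine similarity between the query and each category, shape [1, len(categories)]
scores = model.similarity(query_embedding, category_embeddings)
best = scores.argmax().item()
print(categories[best], float(scores[0, best]))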

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.9378
spearman_cosine 0.8476
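
These scores are the Pearson and Spearman correlations between the gold labels and the cosine similarities of the embedded pairs. A minimal sketch of how such numbers are typically computed with the library's EmbeddingSimilarityEvaluator, reusing the model from the Usage section and placeholder pairs rather than the actual evaluation split:

from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Placeholder sentence pairs and gold similarity labels, illustrative only
dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["cadeira bar alto", "vela massagem erotico", "onde sorvete vegano"],
    sentences2=["movel", "bebida alcoolico", "comida rapido fastfood"],
    scores=[1.0, 0.1, 1.0],
    name="eval-similarity",
)
results = dev_evaluator(model)
# results includes keys such as 'eval-similarity_pearson_cosine'
# and 'eval-similarity_spearman_cosine'
print(results)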

Training Details

Training Dataset

Unnamed Dataset

  • Size: 19,697 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
      text1: string, min 3 tokens, mean 7.76 tokens, max 16 tokens
      text2: string, min 3 tokens, mean 6.03 tokens, max 11 tokens
      label: float, min 0.1, mean 0.54, max 1.0
  • Samples (text1 | text2 | label):
      trufa rechear lindt | doce chocolate | 1.0
      ferramento dremel | ferramenta equipamento | 1.0
      melatonino 5 mg | suplemento | 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

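A minimal sketch of the column layout this loss expects, built from the sample rows above (column boundaries inferred from the samples; this tiny hypothetical snippet stands in for the full 19,697-row training set):

from datasets import Dataset

# Tiny illustrative dataset with the text1 / text2 / label columns
# expected by CosineSimilarityLoss
train_dataset = Dataset.from_dict({
    "text1": ["trufa rechear lindt", "ferramento dremel", "melatonino 5 mg"],
    "text2": ["doce chocolate", "ferramenta equipamento", "suplemento"],
    "label": [1.0, 1.0, 1.0],
})
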
Evaluation Dataset

Unnamed Dataset

  • Size: 2,189 evaluation samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
      text1: string, min 3 tokens, mean 7.78 tokens, max 19 tokens
      text2: string, min 3 tokens, mean 6.16 tokens, max 11 tokens
      label: float, min 0.1, mean 0.53, max 1.0
  • Samples (text1 | text2 | label):
      cadeira bar alto | movel | 1.0
      vela massagem erotico | bebida alcoolico | 0.1
      onde sorvete vegano | comida rapido fastfood | 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 0.0001
  • weight_decay: 0.1
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • warmup_steps: 246
  • fp16: True
  • load_best_model_at_end: True
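
A sketch of a comparable training setup using the non-default hyperparameters listed above; train_dataset and eval_dataset stand for datasets with the text1 / text2 / label columns described earlier, and output_dir is a hypothetical path:

from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("neuralmind/bert-large-portuguese-cased")
loss = CosineSimilarityLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="acho-ptbr",            # hypothetical output path
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=1e-4,
    weight_decay=0.1,
    num_train_epochs=4,
    warmup_ratio=0.1,                  # equivalent to the 246 warmup steps reported above
    fp16=True,
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()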

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0001
  • weight_decay: 0.1
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 246
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss eval-similarity_spearman_cosine
0.0081 5 0.1835 - -
0.0162 10 0.2161 - -
0.0244 15 0.1918 - -
0.0325 20 0.1722 - -
0.0406 25 0.1527 - -
0.0487 30 0.141 - -
0.0568 35 0.1279 - -
0.0649 40 0.1228 - -
0.0731 45 0.0986 - -
0.0812 50 0.0875 - -
0.0893 55 0.0818 - -
0.0974 60 0.0804 - -
0.1055 65 0.0779 - -
0.1136 70 0.0722 - -
0.1218 75 0.0791 - -
0.1299 80 0.0749 - -
0.1380 85 0.0548 - -
0.1461 90 0.0681 - -
0.1542 95 0.0643 - -
0.1623 100 0.0672 - -
0.1705 105 0.0615 - -
0.1786 110 0.0804 - -
0.1867 115 0.0703 - -
0.1948 120 0.0685 - -
0.2029 125 0.0573 - -
0.2110 130 0.0603 - -
0.2192 135 0.0583 - -
0.2273 140 0.0528 - -
0.2354 145 0.0628 - -
0.2435 150 0.0657 - -
0.2516 155 0.0489 - -
0.2597 160 0.0566 - -
0.2679 165 0.0557 - -
0.2760 170 0.0522 - -
0.2841 175 0.0602 - -
0.2922 180 0.0572 - -
0.3003 185 0.0546 - -
0.3084 190 0.0523 - -
0.3166 195 0.054 - -
0.3247 200 0.0451 - -
0.3328 205 0.044 - -
0.3409 210 0.0439 - -
0.3490 215 0.0706 - -
0.3571 220 0.0569 - -
0.3653 225 0.0512 - -
0.3734 230 0.064 - -
0.3815 235 0.0691 - -
0.3896 240 0.0531 - -
0.3977 245 0.0536 - -
0.4058 250 0.0369 - -
0.4140 255 0.0678 - -
0.4221 260 0.0517 - -
0.4302 265 0.0832 - -
0.4383 270 0.0529 - -
0.4464 275 0.0519 - -
0.4545 280 0.0692 - -
0.4627 285 0.0455 - -
0.4708 290 0.0733 - -
0.4789 295 0.0534 - -
0.4870 300 0.0345 - -
0.4951 305 0.0534 - -
0.5032 310 0.0384 - -
0.5114 315 0.0622 - -
0.5195 320 0.0477 - -
0.5276 325 0.0571 - -
0.5357 330 0.0481 - -
0.5438 335 0.0647 - -
0.5519 340 0.0606 - -
0.5601 345 0.0493 - -
0.5682 350 0.0536 - -
0.5763 355 0.0466 - -
0.5844 360 0.0747 - -
0.5925 365 0.0658 - -
0.6006 370 0.0456 - -
0.6088 375 0.0369 - -
0.6169 380 0.0563 - -
0.625 385 0.0751 - -
0.6331 390 0.0475 - -
0.6412 395 0.0654 - -
0.6494 400 0.0498 - -
0.6575 405 0.0625 - -
0.6656 410 0.0614 - -
0.6737 415 0.0403 - -
0.6818 420 0.0474 - -
0.6899 425 0.0483 - -
0.6981 430 0.057 - -
0.7062 435 0.0754 - -
0.7143 440 0.0608 - -
0.7224 445 0.0411 - -
0.7305 450 0.0586 - -
0.7386 455 0.0494 - -
0.7468 460 0.0422 - -
0.7549 465 0.0595 - -
0.7630 470 0.0605 - -
0.7711 475 0.0648 - -
0.7792 480 0.0559 - -
0.7873 485 0.0432 - -
0.7955 490 0.0587 - -
0.8036 495 0.0316 - -
0.8117 500 0.0363 - -
0.8198 505 0.053 - -
0.8279 510 0.0665 - -
0.8360 515 0.0237 - -
0.8442 520 0.0323 - -
0.8523 525 0.0651 - -
0.8604 530 0.0346 - -
0.8685 535 0.0537 - -
0.8766 540 0.0413 - -
0.8847 545 0.0406 - -
0.8929 550 0.0609 - -
0.9010 555 0.0304 - -
0.9091 560 0.0455 - -
0.9172 565 0.0382 - -
0.9253 570 0.0423 - -
0.9334 575 0.0479 - -
0.9416 580 0.0469 - -
0.9497 585 0.0519 - -
0.9578 590 0.047 - -
0.9659 595 0.0365 - -
0.9740 600 0.0365 - -
0.9821 605 0.036 - -
0.9903 610 0.0426 - -
0.9984 615 0.0571 - -
1.0065 620 0.0368 - -
1.0146 625 0.0347 - -
1.0227 630 0.0295 - -
1.0308 635 0.0487 - -
1.0390 640 0.038 - -
1.0471 645 0.0278 - -
1.0552 650 0.038 - -
1.0633 655 0.0185 - -
1.0714 660 0.0293 0.0346 0.8404
1.0795 665 0.034 - -
1.0877 670 0.0306 - -
1.0958 675 0.0237 - -
1.1039 680 0.0336 - -
1.1120 685 0.0318 - -
1.1201 690 0.0294 - -
1.1282 695 0.0257 - -
1.1364 700 0.0202 - -
1.1445 705 0.0407 - -
1.1526 710 0.0191 - -
1.1607 715 0.0557 - -
1.1688 720 0.0287 - -
1.1769 725 0.0284 - -
1.1851 730 0.0476 - -
1.1932 735 0.0306 - -
1.2013 740 0.0168 - -
1.2094 745 0.0338 - -
1.2175 750 0.0244 - -
1.2256 755 0.0384 - -
1.2338 760 0.0145 - -
1.2419 765 0.031 - -
1.25 770 0.024 - -
1.2581 775 0.0268 - -
1.2662 780 0.0293 - -
1.2744 785 0.0217 - -
1.2825 790 0.0323 - -
1.2906 795 0.0286 - -
1.2987 800 0.0325 - -
1.3068 805 0.04 - -
1.3149 810 0.0449 - -
1.3231 815 0.0302 - -
1.3312 820 0.033 - -
1.3393 825 0.022 - -
1.3474 830 0.0377 - -
1.3555 835 0.0301 - -
1.3636 840 0.0303 - -
1.3718 845 0.0396 - -
1.3799 850 0.0298 - -
1.3880 855 0.0192 - -
1.3961 860 0.0203 - -
1.4042 865 0.041 - -
1.4123 870 0.0271 - -
1.4205 875 0.0368 - -
1.4286 880 0.0383 - -
1.4367 885 0.0399 - -
1.4448 890 0.0364 - -
1.4529 895 0.0236 - -
1.4610 900 0.0279 - -
1.4692 905 0.0441 - -
1.4773 910 0.0235 - -
1.4854 915 0.0248 - -
1.4935 920 0.0329 - -
1.5016 925 0.0306 - -
1.5097 930 0.0278 - -
1.5179 935 0.0303 - -
1.5260 940 0.0384 - -
1.5341 945 0.0364 - -
1.5422 950 0.0334 - -
1.5503 955 0.0317 - -
1.5584 960 0.0199 - -
1.5666 965 0.0232 - -
1.5747 970 0.0249 - -
1.5828 975 0.023 - -
1.5909 980 0.022 - -
1.5990 985 0.0131 - -
1.6071 990 0.023 - -
1.6153 995 0.0323 - -
1.6234 1000 0.0224 - -
1.6315 1005 0.022 - -
1.6396 1010 0.0266 - -
1.6477 1015 0.027 - -
1.6558 1020 0.0316 - -
1.6640 1025 0.0242 - -
1.6721 1030 0.0267 - -
1.6802 1035 0.0303 - -
1.6883 1040 0.0163 - -
1.6964 1045 0.0221 - -
1.7045 1050 0.0308 - -
1.7127 1055 0.0183 - -
1.7208 1060 0.0143 - -
1.7289 1065 0.0186 - -
1.7370 1070 0.0172 - -
1.7451 1075 0.0224 - -
1.7532 1080 0.0357 - -
1.7614 1085 0.0192 - -
1.7695 1090 0.0264 - -
1.7776 1095 0.02 - -
1.7857 1100 0.021 - -
1.7938 1105 0.0292 - -
1.8019 1110 0.0246 - -
1.8101 1115 0.0144 - -
1.8182 1120 0.0377 - -
1.8263 1125 0.0267 - -
1.8344 1130 0.0275 - -
1.8425 1135 0.0279 - -
1.8506 1140 0.0242 - -
1.8588 1145 0.0235 - -
1.8669 1150 0.0302 - -
1.875 1155 0.025 - -
1.8831 1160 0.037 - -
1.8912 1165 0.0244 - -
1.8994 1170 0.0361 - -
1.9075 1175 0.0255 - -
1.9156 1180 0.0381 - -
1.9237 1185 0.0224 - -
1.9318 1190 0.0331 - -
1.9399 1195 0.0201 - -
1.9481 1200 0.0353 - -
1.9562 1205 0.0337 - -
1.9643 1210 0.0135 - -
1.9724 1215 0.0238 - -
1.9805 1220 0.0346 - -
1.9886 1225 0.043 - -
1.9968 1230 0.0231 - -
2.0049 1235 0.0185 - -
2.0130 1240 0.0142 - -
2.0211 1245 0.0144 - -
2.0292 1250 0.0168 - -
2.0373 1255 0.0114 - -
2.0455 1260 0.0109 - -
2.0536 1265 0.0221 - -
2.0617 1270 0.0158 - -
2.0698 1275 0.0297 - -
2.0779 1280 0.0214 - -
2.0860 1285 0.0108 - -
2.0942 1290 0.0194 - -
2.1023 1295 0.0164 - -
2.1104 1300 0.0199 - -
2.1185 1305 0.0147 - -
2.1266 1310 0.0132 - -
2.1347 1315 0.0205 - -
2.1429 1320 0.0152 0.0286 0.8450
2.1510 1325 0.02 - -
2.1591 1330 0.0123 - -
2.1672 1335 0.0233 - -
2.1753 1340 0.0125 - -
2.1834 1345 0.0108 - -
2.1916 1350 0.0112 - -
2.1997 1355 0.0205 - -
2.2078 1360 0.022 - -
2.2159 1365 0.0144 - -
2.2240 1370 0.0183 - -
2.2321 1375 0.01 - -
2.2403 1380 0.0106 - -
2.2484 1385 0.02 - -
2.2565 1390 0.0116 - -
2.2646 1395 0.0129 - -
2.2727 1400 0.014 - -
2.2808 1405 0.0178 - -
2.2890 1410 0.0111 - -
2.2971 1415 0.0164 - -
2.3052 1420 0.018 - -
2.3133 1425 0.012 - -
2.3214 1430 0.011 - -
2.3295 1435 0.017 - -
2.3377 1440 0.0106 - -
2.3458 1445 0.012 - -
2.3539 1450 0.0123 - -
2.3620 1455 0.0186 - -
2.3701 1460 0.0135 - -
2.3782 1465 0.0125 - -
2.3864 1470 0.0216 - -
2.3945 1475 0.0132 - -
2.4026 1480 0.0184 - -
2.4107 1485 0.0214 - -
2.4188 1490 0.0255 - -
2.4269 1495 0.026 - -
2.4351 1500 0.0241 - -
2.4432 1505 0.0181 - -
2.4513 1510 0.0182 - -
2.4594 1515 0.0156 - -
2.4675 1520 0.0182 - -
2.4756 1525 0.0104 - -
2.4838 1530 0.0141 - -
2.4919 1535 0.0206 - -
2.5 1540 0.0119 - -
2.5081 1545 0.0152 - -
2.5162 1550 0.0132 - -
2.5244 1555 0.0188 - -
2.5325 1560 0.0217 - -
2.5406 1565 0.0179 - -
2.5487 1570 0.0163 - -
2.5568 1575 0.0334 - -
2.5649 1580 0.0082 - -
2.5731 1585 0.0118 - -
2.5812 1590 0.0131 - -
2.5893 1595 0.0178 - -
2.5974 1600 0.0172 - -
2.6055 1605 0.0065 - -
2.6136 1610 0.0147 - -
2.6218 1615 0.0266 - -
2.6299 1620 0.0134 - -
2.6380 1625 0.0213 - -
2.6461 1630 0.0184 - -
2.6542 1635 0.0221 - -
2.6623 1640 0.0088 - -
2.6705 1645 0.0172 - -
2.6786 1650 0.0094 - -
2.6867 1655 0.0109 - -
2.6948 1660 0.0114 - -
2.7029 1665 0.0119 - -
2.7110 1670 0.0119 - -
2.7192 1675 0.0143 - -
2.7273 1680 0.0124 - -
2.7354 1685 0.0184 - -
2.7435 1690 0.0204 - -
2.7516 1695 0.0091 - -
2.7597 1700 0.0118 - -
2.7679 1705 0.0115 - -
2.7760 1710 0.0203 - -
2.7841 1715 0.0102 - -
2.7922 1720 0.0159 - -
2.8003 1725 0.014 - -
2.8084 1730 0.0244 - -
2.8166 1735 0.0208 - -
2.8247 1740 0.0158 - -
2.8328 1745 0.0156 - -
2.8409 1750 0.008 - -
2.8490 1755 0.0142 - -
2.8571 1760 0.014 - -
2.8653 1765 0.0136 - -
2.8734 1770 0.0194 - -
2.8815 1775 0.018 - -
2.8896 1780 0.0159 - -
2.8977 1785 0.0202 - -
2.9058 1790 0.0103 - -
2.9140 1795 0.0134 - -
2.9221 1800 0.0194 - -
2.9302 1805 0.0168 - -
2.9383 1810 0.0162 - -
2.9464 1815 0.0233 - -
2.9545 1820 0.0146 - -
2.9627 1825 0.0151 - -
2.9708 1830 0.0256 - -
2.9789 1835 0.0243 - -
2.9870 1840 0.0126 - -
2.9951 1845 0.0186 - -
3.0032 1850 0.0202 - -
3.0114 1855 0.0071 - -
3.0195 1860 0.0037 - -
3.0276 1865 0.0029 - -
3.0357 1870 0.0063 - -
3.0438 1875 0.0085 - -
3.0519 1880 0.0112 - -
3.0601 1885 0.0151 - -
3.0682 1890 0.0099 - -
3.0763 1895 0.0108 - -
3.0844 1900 0.0107 - -
3.0925 1905 0.0025 - -
3.1006 1910 0.0159 - -
3.1088 1915 0.0038 - -
3.1169 1920 0.0104 - -
3.125 1925 0.0118 - -
3.1331 1930 0.0086 - -
3.1412 1935 0.0054 - -
3.1494 1940 0.0138 - -
3.1575 1945 0.0111 - -
3.1656 1950 0.0143 - -
3.1737 1955 0.0082 - -
3.1818 1960 0.0122 - -
3.1899 1965 0.0063 - -
3.1981 1970 0.0124 - -
3.2062 1975 0.0113 - -
3.2143 1980 0.0091 0.025 0.8476
3.2224 1985 0.0077 - -
3.2305 1990 0.0058 - -
3.2386 1995 0.0087 - -
3.2468 2000 0.0042 - -
3.2549 2005 0.0149 - -
3.2630 2010 0.0071 - -
3.2711 2015 0.0121 - -
3.2792 2020 0.0128 - -
3.2873 2025 0.0088 - -
3.2955 2030 0.0114 - -
3.3036 2035 0.0097 - -
3.3117 2040 0.0139 - -
3.3198 2045 0.0074 - -
3.3279 2050 0.0065 - -
3.3360 2055 0.0146 - -
3.3442 2060 0.0088 - -
3.3523 2065 0.0032 - -
3.3604 2070 0.0176 - -
3.3685 2075 0.0129 - -
3.3766 2080 0.0092 - -
3.3847 2085 0.0068 - -
3.3929 2090 0.0099 - -
3.4010 2095 0.01 - -
3.4091 2100 0.0082 - -
3.4172 2105 0.0086 - -
3.4253 2110 0.0036 - -
3.4334 2115 0.0169 - -
3.4416 2120 0.003 - -
3.4497 2125 0.0055 - -
3.4578 2130 0.0114 - -
3.4659 2135 0.0099 - -
3.4740 2140 0.0139 - -
3.4821 2145 0.008 - -
3.4903 2150 0.0094 - -
3.4984 2155 0.0096 - -
3.5065 2160 0.0069 - -
3.5146 2165 0.0035 - -
3.5227 2170 0.008 - -
3.5308 2175 0.0071 - -
3.5390 2180 0.0067 - -
3.5471 2185 0.0181 - -
3.5552 2190 0.0062 - -
3.5633 2195 0.0086 - -
3.5714 2200 0.0037 - -
3.5795 2205 0.0103 - -
3.5877 2210 0.0032 - -
3.5958 2215 0.004 - -
3.6039 2220 0.0032 - -
3.6120 2225 0.0051 - -
3.6201 2230 0.0068 - -
3.6282 2235 0.0171 - -
3.6364 2240 0.0094 - -
3.6445 2245 0.0039 - -
3.6526 2250 0.0088 - -
3.6607 2255 0.0021 - -
3.6688 2260 0.0174 - -
3.6769 2265 0.0108 - -
3.6851 2270 0.0071 - -
3.6932 2275 0.0189 - -
3.7013 2280 0.0027 - -
3.7094 2285 0.0086 - -
3.7175 2290 0.0089 - -
3.7256 2295 0.0195 - -
3.7338 2300 0.0032 - -
3.7419 2305 0.0153 - -
3.75 2310 0.0115 - -
3.7581 2315 0.0041 - -
3.7662 2320 0.0095 - -
3.7744 2325 0.0067 - -
3.7825 2330 0.0102 - -
3.7906 2335 0.0056 - -
3.7987 2340 0.0076 - -
3.8068 2345 0.0143 - -
3.8149 2350 0.0104 - -
3.8231 2355 0.0162 - -
3.8312 2360 0.0109 - -
3.8393 2365 0.0098 - -
3.8474 2370 0.0081 - -
3.8555 2375 0.0149 - -
3.8636 2380 0.0088 - -
3.8718 2385 0.0166 - -
3.8799 2390 0.0087 - -
3.8880 2395 0.0108 - -
3.8961 2400 0.0051 - -
3.9042 2405 0.0081 - -
3.9123 2410 0.0055 - -
3.9205 2415 0.0069 - -
3.9286 2420 0.0059 - -
3.9367 2425 0.0145 - -
3.9448 2430 0.005 - -
3.9529 2435 0.0091 - -
3.9610 2440 0.0071 - -
3.9692 2445 0.0157 - -
3.9773 2450 0.0039 - -
3.9854 2455 0.0089 - -
3.9935 2460 0.0061 - -
  • The saved checkpoint corresponds to the best row above: step 1980 (epoch 3.2143), with validation loss 0.025 and eval-similarity_spearman_cosine 0.8476.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 2.14.4
  • Tokenizers: 0.21.0
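
To recreate a matching environment, the versions above can be pinned at install time (a convenience sketch; nearby versions are likely to work as well):

pip install sentence-transformers==3.3.1 transformers==4.47.1 torch==2.5.1 accelerate==1.2.1 datasets==2.14.4 tokenizers==0.21.0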

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}