modernbert-embed-base trained on triplets

This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/modernbert-embed-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
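
The last two modules can be illustrated with a small, library-free sketch: mean pooling over the non-padding tokens (`pooling_mode_mean_tokens: True`), followed by L2 normalization. The helper names and the 2-dimensional toy vectors are illustrative assumptions, not part of the model's code; the real model works on 768-dimensional token embeddings.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length, leaving an all-zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors, the last one is padding (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
print(pooled, embedding)
```

Because the output has unit length, the cosine similarity of two such embeddings equals their dot product.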

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Free-Law-Project/modernbert-embed-base_finetune_8192")
# Run inference
sentences = [
    # The opinion text spans several lines, so it is written as a triple-quoted string.
    """Welsh, J. This is an action alleging negligence in the operation of a motor vehicle. The case was tried before a jury. A verdict was returned indicating that the defendant was not negligent The issue on appeal is whether the judge erred in failing to instruct the jury in accordance with G. L. c. 89, § 8, ( the general “ right of way ” at intersections ) as well as G. L. c. 89, § 9 ( the duty of a motorist at an intersection governed by a stop sign ). We determine there was no error. The following evidence was adduced at trial. On January 9, 1996, the plaintiff was operating a motor vehicle on Revere Street a public way in Quincy. She testified that she came to a complete stop at a “ stop ” sign at the intersection of Revere Street and Mechanic Street also a public way. A large mound of snow obstructed her view and she was unable to see the intersection. She proceeded out into the intersection and stopped again about half way into the intersection. The passable roadway was narrowed considerably due to the snow banks on the sides of the road. She allowed a white car to pass her and then started up again. She testified that she saw the car operated by the defendant approaching at a speed of 45 miles per hour ; nevertheless she proceeded through the intersection, making a left turn in the path of the oncoming vehicle. The defendant ’ s vehicle struck the left side of the plaintiffs vehicle, with left hand side damage to the defendant ' s vehicle. The defendant testified that the plaintiff did not stop. The jury determined that the defendant was not negligent The court gave comprehensive instructions on the elements of negligence and the duty of care. The court specifically instructed the jury as to the issue of violation of a statute as evidence of negligence, taking pains to explain that the violation, if found, must be a contributing factor to the damage sustained by the plaintiff. See Minnehan v. Hiland, 278 Mass. 518, 523 ( 1932 ). 
He specifically charged as to the duty to stop at a stop sign as provided by G. L. c. 89, § 9. 2 The plaintiff ’ s quarrel with the judge is that he failed specifically to instruct as she requested regarding G. L. c. 89, § 8, the general duty of care applicable when two motorists arrive at an intersection at approximately the same time. There was no error. G. L. c. 89, § 8 expressly provides that its provisions do not * 138apply when an operator is otherwise directed by a traffic regulatory sign erected and maintained in accordance with the provision of Sec. 2 of Ch. 85 ( which would include “ stop ” signs ). See Canane v. Dandini, 355 Mass. 72, 75 ( 1968 ). G. L. c. 89, § 9 is the statute that is primarily applicable to intersections governed by stop signs. As stated in Canane, one directed to stop by a stop sign may not have the benefit of the general rule if the rule grants him the right of way, until he has complied with the order to stop. After stopping, the operator becomes subject to the general rule and may proceed and thereafter exercise the right of way in accordance with that rule. Id. at 75. However, the operator must proceed into the intersection with due care. Even if the operator has the right of way under c. 89, § 8, that right is subject to the requirement of using due care. Possession of the right of way is only one factor to be considered in deciding whether the operator has fulfilled his duty of due care. Id. at 76. Accordingly, an operator who has stopped at a “ stop ” sign may still be found to be negligent if he proceeds into the intersection without using due care. The duty to exercise due care requires an operator who has halted at a stop sign to behave with reasonable caution before entering the intersection. 
Even an operator who has stopped at a stop sign and has a “ right of way ” under § 8 may be found to be negligent if he proceeds into the intersection before he can do so with reasonable prudence and with suitable regard for his safety and that of others. Freyermuth v. Lutfy, 376 Mass., 612, 616, N. 3. ( 1978 ). Again, the “ right of way ^ rule in § 8 is not absolute, but is subject to the condition of due care as to its exercise. With these principles in mind, we turn to the judge ’ s charge. At the outset, we observe that it is not required that the judge charge the jury in the precise formulation proposed [ see Poole v. Boston & Main Ry., 216 Mass. 12, 15 ( 1913 ) ] so long as the judge fairly and adequately covers the point in the charge. See Comeau v. Beck, 319 Mass. 17, 10 ( 1946 ) ; Squires v. Fraska, 301 Mass. 474, 476 ( 1938 ). Stated somewhat differently, the denial of requested instruction does not constitute error where the requested instructions were covered substantially in the charge. Pearlin v. Farrell, 356 Mass. 741 ( 1970 ). The judge gave detailed and comprehensive instructions on the concept of negligence in the context of operating of motor vehicles. He explained the duty of a motorist with regard to intersections controlled by stop signs. This explanation included the duty to yield to vehicles in or in close proximity to the intersection. While the instruction did not follow precisely the formulation suggested in the Canane and Freyermuth cases, the judge ’ s instruction properly stressed the duty of due care when proceeding into the intersection governed by the stop sign after having stopped. Appeal dismissed. So ordered. 
“ Another rule of the road is that every driver approaching a stop sign, shall stop at a clearly marked stop line, and if there is not a stop line, then [ at ] a point nearest the intersecting roadway before entering it After having stopped, the driver shall yield the right of way to every vehicle in the intersection or approaching in [ the ] other roadway so closely as to constitute an immediate hazard during the time when the driver is moving across or within the intersection. ”""",
    'What is the legal duty of care for drivers at intersections with stop signs?',
    'What are the legal requirements for establishing a valid contract in business law?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
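
For retrieval, the similarity scores are typically turned into a ranking of candidate texts for one query. The following is a minimal sketch of that step using toy unit-length vectors instead of real model output; `rank_by_similarity` is a hypothetical helper, and it assumes the embeddings are already normalized (as the `Normalize()` module does), so a dot product equals cosine similarity.

```python
def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    # Dot product equals cosine similarity only for unit-length vectors.
    scores = [sum(q * d for q, d in zip(query_vec, doc)) for doc in doc_vecs]
    order = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]

# Toy 2-dimensional stand-ins for query and document embeddings.
query = [1.0, 0.0]
documents = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]

ranking = rank_by_similarity(query, documents)
for index, score in ranking:
    print(index, round(score, 2))
```

With real embeddings, `query` would be one row of `embeddings` and `documents` the remaining rows.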

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 1.0

Triplet

Metric Value
cosine_accuracy 1.0
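
As a rough illustration of what `cosine_accuracy` measures: it is the fraction of (anchor, positive, negative) triplets in which the anchor is closer, by cosine similarity, to the positive than to the negative. The sketch below uses made-up 2-dimensional vectors and hypothetical helper names; it is not the evaluator's actual implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    hits = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return hits / len(triplets)

# Toy triplets: the first is ranked correctly, the second is not.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)
```

A value of 1.0, as reported above, means every evaluated triplet was ordered correctly.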

Training Details

Training Dataset

Free-Law-Project/opinions-synthetic-query-8192

  • Size: 351 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 351 samples:
    column    type    min tokens  mean tokens  max tokens
    anchor    string  62          2810.15      7455
    positive  string  12          18.93        31
    negative  string  11          14.86        21
  • Samples:
    • anchor: DISTRICT COURT OF APPEAL OF THE STATE OF FLORIDA FOURTH DISTRICT EURICE McGILL, Appellant, v. STATE OF FLORIDA, Appellee. No. 4D17 - 1492 [ August 31, 2017 ] Appeal of order denying rule 3. 850 motion from the Circuit Court for the Seventeenth Judicial Circuit, Broward County ; Paul L. Backman, Judge ; L. T. Case No. 10 - 12523CF10A. Eurice McGill, Lake City, pro se. No appearance required for appellee. PER CURIAM. Affirmed. WARNER, DAMOORGIAN and KUNTZ, JJ., concur. * * * Not final until disposition of timely filed motion for rehearing.
      positive: What are the grounds for denying a Rule 3.850 motion in Florida courts?
      negative: What are the qualifications to file for an eviction in Florida?
    • anchor: Twersky v Incorporated Vil. of Great Neck ( 2015 NY Slip Op 02755 ) Twersky v Incorporated Vil. of Great Neck 2015 NY Slip Op 02755 Decided on April 1, 2015 Appellate Division, Second Department Published by New York State Law Reporting Bureau pursuant to Judiciary Law § 431. This opinion is uncorrected and subject to revision before publication in the Official Reports. Decided on April 1, 2015 SUPREME COURT OF THE STATE OF NEW YORK Appellate Division, Second Judicial Department RANDALL T. ENG, P. J. LEONARD B. AUSTIN JEFFREY A. COHEN BETSY BARROS, JJ. 2014 - 07552 ( Index No. 9576 / 12 ) [ * 1 ] Sharon Twersky, respondent, v Incorporated Village of Great Neck, et al., defendants, FHM Mortgage Corp., et al., appellants. Cascone & Kluepfel, LLP, Garden City, N. Y. ( Howard B. Altman of counsel ), for appellants. Isaacson, Schiowitz & Korson, LLP, Rockville Centre, N. Y. ( Jeremy Schiowitz of counsel ), for respondent. DECISION & ORDER In an action to recover damages for personal injurie...
      positive: What legal principles determine a property owner's duty to maintain safe conditions for pedestrians?
      negative: What are the tax implications of selling a property in New York State?
    • anchor: 951 A. 2d 180 ( 2008 ) Philip S. HORNER v. GOVERNOR, State of New Hampshire and another. No. 2007 - 668. Supreme Court of New Hampshire. Argued March 27, 2008. Opinion Issued : June 19, 2008. * 181 Philip S. Horner, pro se, and Richard E. Samdperil, of Exeter ( Mr. Horner on the brief, and Mr. Samdperil orally ), for the plaintiff. Kelly A. Ayotte, attorney general ( Karen A. Schlitzer, assistant attorney general, on the memorandum of law and orally ), for the defendants. BRODERICK, C. J. The plaintiff, Philip S. Horner, appeals an order of the Superior Court ( Smukler, * 182 J. ) denying his petition for a writ of prohibition to enjoin the State from enforcing RSA 651 - B : 11 ( 2007 & Supp. 2007 ), which mandates the collection of a sex offender registration fee. We affirm. The plaintiff was convicted in 2000 of five counts of felonious sexual assault, see RSA 632 - A : 3 ( 2007 ). Every sex offender and offender against children is required to register with the New Hampshire Divisio...
      positive: What determines whether a charge is classified as a tax or a fee under New Hampshire law?
      negative: What are the tax implications of forming a non-profit organization in the United States?
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
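
To make the loss parameters concrete, here is a minimal, library-free sketch of the idea behind MultipleNegativesRankingLoss: each anchor's cosine similarity to every candidate in the batch is multiplied by `scale` (20.0 here), and cross-entropy pushes the anchor toward its own positive. The function names and toy vectors are illustrative assumptions; the actual implementation lives in Sentence Transformers and operates on batched tensors.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnrl_loss(anchors, positives, negatives=(), scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    # Candidates are every positive in the batch plus any explicit negatives.
    candidates = list(positives) + list(negatives)
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two triplets with 2-dimensional embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = mnrl_loss(anchors, positives, negatives)
shuffled = mnrl_loss(anchors, positives[::-1], negatives)
print(aligned, shuffled)
```

The loss is near zero when each anchor is most similar to its own positive and grows large when the positives are swapped.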
    

Evaluation Dataset

Free-Law-Project/opinions-synthetic-query-8192

  • Size: 95 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 95 samples:
    anchor positive negative
    type string string string
    details
    • min: 73 tokens
    • mean: 1723.31 tokens
    • max: 7494 tokens
    • min: 13 tokens
    • mean: 18.89 tokens
    • max: 26 tokens
    • min: 11 tokens
    • mean: 14.46 tokens
    • max: 20 tokens
  • Samples:
    • anchor: Mr. Justice Mercur delivered the opinion of the court, November 20th 1882. Both parties claim title to this land under sheriff ’ s sale as the property of James Strouss. The defendant purchased at a sale made in December 1815, the plaintiff at one made in March 1880. The plaintiff seeks to impeach the validity of the first sale * 411on the ground that it was made in fraud of the creditors of Strouss. The law presumes that a public judicial sale is made in good faith. This presumption stands, unless overthrown by clear and satisfactory evidence of fraud or unfair means. The contention was one of fact. Much evidence Avas given bearing on the question, and some of it conflicting. The learned judge submitted the case to the jury in a clear and correct charge. He instructed them that if the sheriff ’ s sale was made with the intention of hindering, delaying or defeating creditors, and the purchaser had knowledge of such, it was null and void, although the full value of the property may have...
      positive: What are the legal principles governing fraud and sale validity in sheriff's sales?
      negative: What are the legal implications of intellectual property infringement?
    • anchor: 217 N. J. Super. 541 ( 1987 ) 526 A. 2d 290 ALAN C. STAVER, PLAINTIFF, v. MARGARET STAVER, DEFENDANT. Superior Court of New Jersey, Chancery Division Bergen County, Family Part. March 11, 1987. * 543 Donald L. Garber for plaintiff ( Donald L. Garber, attorney ; Michael I. Lubin on the brief ). John Fiorello for defendant ( Feldman, Feldman, Hoffman & Fiorello, attorneys ). SIMON, MARGUERITE T., J. S. C. Plaintiff husband brings this motion seeking to terminate his obligation to pay alimony to defendant pursuant to a judgment of divorce entered September 6, 1974. Defendant wife brings a cross - motion for enforcement of the judgment. At the time of the entry of the final judgment, plaintiff was employed as an ordained minister earning approximately $ 12, 000 a year. The parties entered into a consensual agreement which was incorporated into the judgment. Two pertinent stipulations of the agreement are as follows : ( 1 ) " Said alimony of $ 500 per month shall continue in effect regardle...
      positive: Can pension benefits accrued after a divorce be considered as income for modifying alimony payments?
      negative: What are the tax implications of forming a limited liability company (LLC)?
    • anchor: Howard, J. : By the ' will of Byron S. Briggs, which was offered for probate in the Surrogate ’ s Court of Madison county, Harriet 0. Briggs, his wife, was appointed executrix. After the surrogate had overruled certain objections to the probate of the will and announced his conclusion that the will should be admitted to probate, written objections were filed to the issuance of letters testamentary to the widow, on the ground that she had deliberately murdered the testator for the purpose of thwarting any attempt on his part to make another will. The objections were filed by the son of the testator ; and his attitude of opposition to the widow was approved by a granddaughter of the testator. These two persons were descendants of the testator by a former wife. They were legatees under the will and had a statutory right to make objections. ( See Code Civ. Proc. § 2636. ) They stood ready with the witnesses in court and offered to make proof of the serious charges which they had preferred ...
      positive: Can someone accused of murdering a testator be appointed as an executor of the will?
      negative: What are the tax implications for inheriting property in the United States?
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
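
The interaction of `learning_rate`, `warmup_ratio`, and the linear scheduler (`lr_scheduler_type: linear`, listed under All Hyperparameters) can be sketched as follows. This is a simplified illustration, not the trainer's code: `linear_schedule_lr` is a hypothetical helper, the total of 352 optimizer steps is an assumption (351 samples at batch size 2 on a single device gives 176 steps per epoch, times 2 epochs), and rounding the warmup length up is also assumed.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumption: rounded up
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining

total_steps = 352  # assumption: 176 steps per epoch * 2 epochs
warmup_steps = math.ceil(total_steps * 0.1)

for step in (0, warmup_steps, 200, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

Under these assumptions the learning rate climbs from 0 to 2e-05 over the first 36 steps and then decays linearly back to 0 by the final step.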

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Validation Loss  dev_cosine_accuracy
-1      -1    -                0.9895
0.5682  100   0.0288           0.9895
1.1364  200   0.0317           1.0
1.7045  300   0.0166           1.0
-1      -1    -                1.0
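
The fractional epoch values in the log follow from the dataset size and batch size. A short worked check, assuming a single device, no gradient accumulation, and that the `no_duplicates` sampler did not drop any samples:

```python
import math

train_samples = 351   # training set size from the card
batch_size = 2        # per_device_train_batch_size

# One optimizer step per batch; the last batch of an epoch may be smaller.
steps_per_epoch = math.ceil(train_samples / batch_size)

logged_steps = (100, 200, 300)
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]
print(steps_per_epoch, epochs)
```

Under those assumptions, steps 100, 200, and 300 correspond to the logged epochs 0.5682, 1.1364, and 1.7045.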

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 149M params (Safetensors, tensor type F32)