---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:798
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-m
widget:
- source_sentence: What is the definition of a sponsor-investigator according to the
    provided context?
  sentences:
  - '§ 312.47 Meetings.

    (a) General. Meetings between a sponsor and the agency are frequently useful in
    resolving questions and

    issues raised during the course of a clinical investigation. FDA encourages such
    meetings to the extent

    that they aid in the evaluation of the drug and in the solution of scientific
    problems concerning the drug, to

    the extent that FDA''s resources permit. The general principle underlying the
    conduct of such meetings is'
  - 'employees to conduct an investigation that it has initiated is a sponsor, not
    a sponsor-investigator, and

    the employees are investigators.

    Sponsor-Investigator means an individual who both initiates and conducts an investigation,
    and under whose

    immediate direction the investigational drug is administered or dispensed. The
    term does not include any

    person other than an individual. The requirements applicable to a sponsor-investigator
    under this part'
  - 'practice regulations in part 58, or, if the study was not conducted in compliance
    with those

    regulations, a brief statement of the reason for the noncompliance.

    (9) Previous human experience with the investigational drug. A summary of previous
    human experience

    known to the applicant, if any, with the investigational drug. The information
    is required to include

    the following:

    (i) If the investigational drug has been investigated or marketed previously,
    either in the United'
- source_sentence: What is the primary purpose of Phase 1 studies in drug development?
  sentences:
  - '§ 312.53 Selecting investigators and monitors.

    § 312.54 Emergency research under § 50.24 of this chapter.

    § 312.55 Informing investigators.

    This content is from the eCFR and is authoritative but unofficial.

    21 CFR Part 312 (up to date as of 1/23/2025)

    Investigational New Drug Application 21 CFR Part 312 (Jan. 23, 2025)

    21 CFR Part 312 (Jan. 23, 2025) (enhanced display) page 1 of 54'
  - 'relevant to the safety of the drug as are required under § 312.32. The sponsor
    shall make annual reports

    on the progress of the investigation in accordance with § 312.33.

    (d) A sponsor who determines that its investigational drug presents an unreasonable
    and significant risk to

    subjects shall discontinue those investigations that present the risk, notify
    FDA, all institutional review

    boards, and all investigators who have at any time participated in the investigation
    of the discontinuance,'
  - 'are typically closely monitored and may be conducted in patients or normal volunteer
    subjects.

    These studies are designed to determine the metabolism and pharmacologic actions
    of the drug in

    humans, the side effects associated with increasing doses, and, if possible, to
    gain early evidence on

    effectiveness. During Phase 1, sufficient information about the drug''s pharmacokinetics
    and

    pharmacological effects should be obtained to permit the design of well-controlled,
    scientifically'
- source_sentence: What is the required format for numbering submissions related to
    the investigation?
  sentences:
  - 'using a single, three-digit serial number. The initial IND is required to be
    numbered 000; each subsequent

    submission (e.g., amendment, report, or correspondence) is required to be numbered
    chronologically in

    sequence.

    (f) Identification of exception from informed consent. If the investigation involves
    an exception from informed

    consent under § 50.24 of this chapter, the sponsor shall prominently identify
    on the cover sheet that the'
  - 'response time, a sponsor may not proceed with a clinical trial on which a clinical
    hold has been imposed

    until the sponsor has been notified by FDA that the hold has been lifted.

    (f) Appeal. If the sponsor disagrees with the reasons cited for the clinical hold,
    the sponsor may request

    reconsideration of the decision in accordance with § 312.48.

    (g) Conversion of IND on clinical hold to inactive status. If all investigations
    covered by an IND remain on'
  - 'investigator, the sponsor of any investigation in which the investigator has
    been named as a participant,

    and the reviewing institutional review boards (IRBs) that the investigator is
    not eligible to receive test

    articles under this part. The notification to the investigator, sponsor, and IRBs
    will provide a statement of

    21 CFR Part 312 (up to date as of 1/23/2025)

    Investigational New Drug Application 21 CFR 312.66

    21 CFR 312.70(b) (enhanced display) page 37 of 54'
- source_sentence: What are the regions mentioned in the context where drugs can be
    exported?
  sentences:
  - 'Africa, or to any country in the European Union or the European Economic Area,
    and complies with

    the laws of the country to which it is being exported, the applicable provisions
    of section 802(c), (f),

    and (g) of the act, and § 1.101 of this chapter. Drugs exported under this paragraph
    that are not the

    subject of an IND are exempt from the label requirement in § 312.6(a); or

    (4) Except as provided in paragraph (b)(5) of this section, the person exporting
    the drug sends an email'
  - 'before its implementation. Protocol amendments to add a new investigator or to
    provide additional

    information about investigators may be grouped and submitted at 30-day intervals.
    When several

    submissions of new protocols or protocol changes are anticipated during a short
    period, the sponsor is

    encouraged, to the extent feasible, to include these all in a single submission.

    21 CFR Part 312 (up to date as of 1/23/2025)

    Investigational New Drug Application 21 CFR 312.30(b)(2)(i)(b)'
  - 'that apply to specific types of expanded access are described in §§ 312.310 through
    312.320.

    (a) Scope. This subpart contains the requirements for the use of investigational
    new drugs and approved

    drugs where availability is limited by a risk evaluation and mitigation strategy
    (REMS) when the primary

    purpose is to diagnose, monitor, or treat a patient''s disease or condition. The
    aim of this subpart is to'
- source_sentence: What regulatory framework does 21 CFR Part 312 pertain to as of
    January 23, 2025?
  sentences:
  - 'risk-benefit judgment in making the final decision on approvability. As part
    of this evaluation, consistent

    with the statement of purpose in § 312.80, FDA will consider whether the benefits
    of the drug outweigh

    the known and potential risks of the drug and the need to answer remaining questions
    about risks and

    benefits of the drug, taking into consideration the severity of the disease and
    the absence of satisfactory

    alternative therapy.'
  - 'provide for disposition of the unused supplies of the drug under § 312.59.

    (b) Case histories. An investigator is required to prepare and maintain adequate
    and accurate case histories

    that record all observations and other data pertinent to the investigation on
    each individual administered

    the investigational drug or employed as a control in the investigation. Case histories
    include the case

    report forms and supporting data including, for example, signed and dated consent
    forms and medical'
  - '§ 312.315 Intermediate-size patient populations.

    21 CFR Part 312 (up to date as of 1/23/2025)

    Investigational New Drug Application 21 CFR Part 312 (Jan. 23, 2025)

    21 CFR Part 312 (Jan. 23, 2025) (enhanced display) page 2 of 54'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-m
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: Unknown
      type: unknown
    metrics:
    - type: cosine_accuracy@1
      value: 0.92
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.99
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.99
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.92
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.33000000000000007
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19799999999999998
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.92
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.99
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.99
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9637992620139386
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9516666666666665
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9516666666666667
      name: Cosine Map@100
---

# SentenceTransformer based on Snowflake/snowflake-arctic-embed-m

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [Snowflake/snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. The training data consists of 798 question-passage pairs whose passages come from 21 CFR Part 312 (Investigational New Drug Application).

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Snowflake/snowflake-arctic-embed-m](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) <!-- at revision fc74610d18462d218e312aa986ec5c8a75a98152 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

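The architecture listing reads as three stages: a BERT encoder produces one vector per token, the pooling layer keeps only the `[CLS]` vector (`pooling_mode_cls_token: True`), and `Normalize()` rescales it to unit length. The following is a minimal pure-Python sketch of the last two stages. It assumes toy 4-dimensional token vectors in place of the real 768-dimensional encoder output, and `cls_pool` and `l2_normalize` are hypothetical helper names for illustration, not functions of the library.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector (the [CLS] position)."""
    return token_embeddings[0]

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a Normalize() stage does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy stand-in for the encoder output: 3 tokens, 4 dimensions each (assumed values).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.5, 0.5, 0.5],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the final vectors are unit length, the dot product of two embeddings equals their cosine similarity, which matches the similarity function listed in the model description.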
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("philipk22/ind312-ft-v0")
# Run inference
sentences = [
    'What regulatory framework does 21 CFR Part 312 pertain to as of January 23, 2025?',
    '§ 312.315 Intermediate-size patient populations.\n21 CFR Part 312 (up to date as of 1/23/2025)\nInvestigational New Drug Application 21 CFR Part 312 (Jan. 23, 2025)\n21 CFR Part 312 (Jan. 23, 2025) (enhanced display) page 2 of 54',
    'risk-benefit judgment in making the final decision on approvability. As part of this evaluation, consistent\nwith the statement of purpose in § 312.80, FDA will consider whether the benefits of the drug outweigh\nthe known and potential risks of the drug and the need to answer remaining questions about risks and\nbenefits of the drug, taking into consideration the severity of the disease and the absence of satisfactory\nalternative therapy.',
]
# encode() returns a NumPy array with one row per input sentence
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the pairwise similarity scores for the embeddings (a torch tensor)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```

## Evaluation

### Metrics

#### Information Retrieval

* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.92       |
| cosine_accuracy@3   | 0.99       |
| cosine_accuracy@5   | 0.99       |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.92       |
| cosine_precision@3  | 0.33       |
| cosine_precision@5  | 0.198      |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.92       |
| cosine_recall@3     | 0.99       |
| cosine_recall@5     | 0.99       |
| cosine_recall@10    | 1.0        |
| **cosine_ndcg@10**  | **0.9638** |
| cosine_mrr@10       | 0.9517     |
| cosine_map@100      | 0.9517     |

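The values in the metrics table appear consistent with each query having exactly one relevant passage: accuracy@k and recall@k coincide, and precision@k equals accuracy@k divided by k (for example 0.99 / 3 = 0.33). Under that assumption, the metrics can all be derived from the rank at which the relevant passage is retrieved. The sketch below illustrates this. The ranks are made-up illustration data, not this model's evaluation results, and the function names are hypothetical rather than part of `InformationRetrievalEvaluator`.

```python
def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def precision_at_k(ranks, k):
    """With one relevant passage per query, at most 1 of the top k is relevant."""
    return accuracy_at_k(ranks, k) / k

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting 0 when the passage falls outside the top k."""
    return sum(1.0 / r if r <= k else 0.0 for r in ranks) / len(ranks)

# Hypothetical 1-based ranks of the relevant passage for five queries.
ranks = [1, 1, 2, 1, 4]

print(accuracy_at_k(ranks, 1))   # 0.6
print(accuracy_at_k(ranks, 3))   # 0.8
print(precision_at_k(ranks, 3))  # accuracy@3 divided by 3
print(mrr_at_k(ranks, 10))       # 0.75
```

With a single relevant passage per query and every passage found within the cutoff, average precision reduces to the reciprocal rank, which would explain why `cosine_mrr@10` and `cosine_map@100` match in the table.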
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 798 training samples
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
* Approximate statistics based on the first 798 samples:
  |         | sentence_0                                                                         | sentence_1                                                                          |
  |:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                              |
  | details | <ul><li>min: 12 tokens</li><li>mean: 20.82 tokens</li><li>max: 46 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 93.06 tokens</li><li>max: 158 tokens</li></ul> |
* Samples:
  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | <code>What is the scope of Part 312 in Title 21 regarding investigational new drug applications?</code> | <code>Title 21 —Food and Drugs<br>Chapter I —Food and Drug Administration, Department of Health and Human Services<br>Subchapter D —Drugs for Human Use<br>Part 312 Investigational New Drug Application<br>Subpart A General Provisions<br>§ 312.1 Scope.<br>§ 312.2 Applicability.<br>§ 312.3 Definitions and interpretations.<br>§ 312.6 Labeling of an investigational new drug.<br>§ 312.7 Promotion of investigational drugs.<br>§ 312.8 Charging for investigational drugs under an IND.<br>§ 312.10 Waivers.</code> |
  | <code>How does § 3126 address the labeling requirements for investigational new drugs?</code> | <code>Title 21 —Food and Drugs<br>Chapter I —Food and Drug Administration, Department of Health and Human Services<br>Subchapter D —Drugs for Human Use<br>Part 312 Investigational New Drug Application<br>Subpart A General Provisions<br>§ 312.1 Scope.<br>§ 312.2 Applicability.<br>§ 312.3 Definitions and interpretations.<br>§ 312.6 Labeling of an investigational new drug.<br>§ 312.7 Promotion of investigational drugs.<br>§ 312.8 Charging for investigational drugs under an IND.<br>§ 312.10 Waivers.</code> |
  | <code>What are the general principles outlined in § 31222 regarding the IND submission?</code> | <code>§ 312.10 Waivers.<br>Subpart B Investigational New Drug Application (IND)<br>§ 312.20 Requirement for an IND.<br>§ 312.21 Phases of an investigation.<br>§ 312.22 General principles of the IND submission.<br>§ 312.23 IND content and format.<br>§ 312.30 Protocol amendments.<br>§ 312.31 Information amendments.<br>§ 312.32 IND safety reporting.<br>§ 312.33 Annual reports.<br>§ 312.38 Withdrawal of an IND.<br>Subpart C Administrative Actions</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```

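The `matryoshka_dims` setting means the loss was applied to the full 768-dimensional embedding and to its first 512, 256, 128, and 64 components, so shorter prefixes of the vector are intended to remain usable on their own. A minimal sketch of how a truncated embedding could be used at inference time: keep the first `dim` components, re-normalize, then compare with cosine similarity. The 8-dimensional vectors are made-up stand-ins for real embeddings, and `truncate_and_normalize` is a hypothetical helper, not a library function.

```python
import math

def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components, then rescale to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine(a, b):
    """For unit-length vectors, cosine similarity is just the dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy 8-dimensional embeddings standing in for the model's 768-dimensional output.
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.1, 0.0, 0.2]
passage = [0.8, 0.2, 0.4, 0.1, 0.10, 0.0, 0.1, 0.1]

# Compare the pair at progressively smaller prefix sizes.
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(dim, round(cosine(q, p), 4))
```

Re-normalizing after truncation matters because a prefix of a unit vector is generally no longer unit length. Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which may be a more convenient route than truncating by hand; check the installed version's documentation before relying on it.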
### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 10
- `per_device_eval_batch_size`: 10
- `num_train_epochs`: 10
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 10
- `per_device_eval_batch_size`: 10
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 10
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs
| Epoch | Step | Training Loss | cosine_ndcg@10 |
|:-----:|:----:|:-------------:|:--------------:|
| 0.625 | 50   | -             | 0.9091         |
| 1.0   | 80   | -             | 0.9209         |
| 1.25  | 100  | -             | 0.9329         |
| 1.875 | 150  | -             | 0.9439         |
| 2.0   | 160  | -             | 0.9379         |
| 2.5   | 200  | -             | 0.9367         |
| 3.0   | 240  | -             | 0.9459         |
| 3.125 | 250  | -             | 0.9432         |
| 3.75  | 300  | -             | 0.9479         |
| 4.0   | 320  | -             | 0.9515         |
| 4.375 | 350  | -             | 0.9509         |
| 5.0   | 400  | -             | 0.9581         |
| 5.625 | 450  | -             | 0.9551         |
| 6.0   | 480  | -             | 0.9604         |
| 6.25  | 500  | 0.3078        | 0.9577         |
| 6.875 | 550  | -             | 0.9651         |
| 7.0   | 560  | -             | 0.9651         |
| 7.5   | 600  | -             | 0.9641         |
| 8.0   | 640  | -             | 0.9641         |
| 8.125 | 650  | -             | 0.9638         |
| 8.75  | 700  | -             | 0.9638         |
| 9.0   | 720  | -             | 0.9638         |
| 9.375 | 750  | -             | 0.9601         |
| 10.0  | 800  | -             | 0.9638         |

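The Epoch and Step columns in the training log follow from the dataset size and batch size reported in this card. A short worked calculation, assuming training ran on a single device and the last partial batch was kept (`dataloader_drop_last` is False):

```python
import math

dataset_size = 798   # training samples
batch_size = 10      # per_device_train_batch_size (single device assumed)
num_epochs = 10      # num_train_epochs

# The final partial batch of 8 samples still counts as a step.
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)        # 80
print(total_steps)            # 800
print(50 / steps_per_epoch)   # 0.625
```

This matches the table: epoch boundaries fall on multiples of 80 steps, the run ends at step 800, and step 50 corresponds to epoch 0.625.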
### Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.3
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
