finetuned_arctic_ft / README.md
metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:20
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
  - source_sentence: >-
      What initiatives were implemented in the past year to improve
      communication between departments?
    sentences:
      - >-
        with other departments. In the past year, we conducted monthly
        departmental meetings and 

        established communication channels to facilitate information sharing and
        problem-solving. 
         
        Fare Collection and Fee Structure
      - >-
        Our fare collection system ensures fair and consistent fee collection
        from passengers. The current fee 

        structure is as follows: 
         
        Regular fare: $2.50 

        Senior citizens and students: $1.50 

        Children under 5 years old: Free 

        Fee collection is primarily done through electronic payment methods,
        such as smart cards and 

        mobile payment apps. Drivers are responsible for ensuring correct fare
        collection and providing 

        receipts upon request. 

        Route Information and Rules 

        Our transportation department operates multiple routes within the city.
        Route information, including 

        maps, schedules, and stops, is available on our website and at
        designated information centers.
      - >-
        Our drivers are responsible for operating vehicles safely, following
        traffic rules and regulations. They 

        are required to hold a valid driver's license and maintain a clean
        driving record. In the past year, our 

        drivers completed over 2,000 hours of driving training to enhance their
        skills and knowledge. 
         
        Route Planning and Optimization 

        Efficient route planning is essential for timely transportation
        services. Our department utilizes 

        advanced routing software to optimize routes and minimize travel time.
        In the past year, we reduced 

        our average route duration by 15% through effective route planning and
        optimization strategies. 
         
        Customer Service
  - source_sentence: >-
      What is the primary focus of the Transportation Department as outlined in
      the manual?
    sentences:
      - >-
        for familiarizing themselves with the latest version of the manual. 
         
        Conclusion 

        Thank you for reviewing the Transportation Department Policy Manual.
        Your commitment to safety, 

        customer service, and compliance plays a crucial role in our
        department's success. If you have any 

        questions or need further information, please reach out to your
        supervisor or the department 

        manager. Your dedication and professionalism are appreciated.
      - >-
        department. It provides guidelines to ensure safe, efficient, and
        customer-focused transportation 

        services. Please read this manual carefully and consult with your
        supervisor or the department 

        manager if you have any questions or need further clarification. 
         
        Department Overview 

        The Transportation Department plays a critical role in providing
        reliable transportation services to 

        our customers. Our department consists of 50 drivers, 10 dispatchers,
        and 5 maintenance 

        technicians. In the past year, we transported over 500,000 passengers
        across various routes, ensuring 

        their safety and satisfaction. 
         
        Safety and Vehicle Maintenance 

        Safety is our top priority. All vehicles undergo regular inspections and
        maintenance to ensure they
      - >-
        department. It provides guidelines to ensure safe, efficient, and
        customer-focused transportation 

        services. Please read this manual carefully and consult with your
        supervisor or the department 

        manager if you have any questions or need further clarification. 
         
        Department Overview 

        The Transportation Department plays a critical role in providing
        reliable transportation services to 

        our customers. Our department consists of 50 drivers, 10 dispatchers,
        and 5 maintenance 

        technicians. In the past year, we transported over 500,000 passengers
        across various routes, ensuring 

        their safety and satisfaction. 
         
        Safety and Vehicle Maintenance 

        Safety is our top priority. All vehicles undergo regular inspections and
        maintenance to ensure they
  - source_sentence: >-
      How often were departmental meetings conducted to address information
      sharing and problem-solving?
    sentences:
      - >-
        with other departments. In the past year, we conducted monthly
        departmental meetings and 

        established communication channels to facilitate information sharing and
        problem-solving. 
         
        Fare Collection and Fee Structure
      - >-
        Compliance with local, state, and federal regulations is crucial. Our
        drivers are required to maintain 

        up-to-date knowledge of transportation laws and regulations. In the past
        year, we conducted 20 

        compliance audits to ensure adherence to regulatory requirements. 
         
        Training and Development 

        Continuous training and development are vital for our department's
        success. In the past year, our 

        drivers completed over 100 hours of professional development training,
        focusing on defensive 

        driving, customer service, and emergency preparedness. 
         
        Communication and Collaboration 

        Effective communication and collaboration are essential within the
        Transportation Department and
      - >-
        are in optimal condition. In the past year, we conducted 500 vehicle
        inspections, identifying and 

        addressing any maintenance issues promptly. Our drivers are required to
        conduct pre-trip and post-

        trip inspections to ensure the safety of the vehicles and passengers. 
         
        Driver Responsibilities
  - source_sentence: >-
      How can passengers access route information and schedules for the
      transportation department?
    sentences:
      - >-
        Our fare collection system ensures fair and consistent fee collection
        from passengers. The current fee 

        structure is as follows: 
         
        Regular fare: $2.50 

        Senior citizens and students: $1.50 

        Children under 5 years old: Free 

        Fee collection is primarily done through electronic payment methods,
        such as smart cards and 

        mobile payment apps. Drivers are responsible for ensuring correct fare
        collection and providing 

        receipts upon request. 

        Route Information and Rules 

        Our transportation department operates multiple routes within the city.
        Route information, including 

        maps, schedules, and stops, is available on our website and at
        designated information centers.
      - >-
        Passengers are expected to follow the rules and regulations while
        utilizing our transportation 

        services, including: 
         
        Boarding and exiting the vehicle in an orderly manner. 

        Yielding seats to elderly, disabled, and pregnant passengers. 

        Keeping noise levels to a minimum. 

        Refraining from eating, drinking, or smoking onboard. 

        Using designated safety equipment, such as seat belts, if available. 

        Reporting any suspicious activity or unattended items to the driver. 

        Amendments to the Policy Manual 

        This policy manual is subject to periodic review and amendments. Any
        updates or changes will be 

        communicated to employees through email or departmental meetings.
        Employees are responsible
      - >-
        Passengers are expected to follow the rules and regulations while
        utilizing our transportation 

        services, including: 
         
        Boarding and exiting the vehicle in an orderly manner. 

        Yielding seats to elderly, disabled, and pregnant passengers. 

        Keeping noise levels to a minimum. 

        Refraining from eating, drinking, or smoking onboard. 

        Using designated safety equipment, such as seat belts, if available. 

        Reporting any suspicious activity or unattended items to the driver. 

        Amendments to the Policy Manual 

        This policy manual is subject to periodic review and amendments. Any
        updates or changes will be 

        communicated to employees through email or departmental meetings.
        Employees are responsible
  - source_sentence: >-
      Who should you contact if you have questions or need further information
      regarding the Transportation Department Policy Manual?
    sentences:
      - >-
        Transportation Department Policy Manual 
         
        Table of Contents: 
         


        Introduction 

         

        Department Overview 

         

        Safety and Vehicle Maintenance 

         

        Driver Responsibilities 

         

        Route Planning and Optimization 

         

        Customer Service 

         

        Incident Reporting and Investigation 

         

        Compliance with Regulations 

         

        Training and Development 

         

        Communication and Collaboration 

         

        Fare Collection and Fee Structure 

         

        Route Information and Rules 

         

        Amendments to the Policy Manual 

         

        Conclusion 

        Introduction 

        Welcome to the Transportation Department Policy Manual! This manual
        serves as a comprehensive 

        guide to the policies, procedures, and expectations for employees
        working in the transportation
      - >-
        Compliance with local, state, and federal regulations is crucial. Our
        drivers are required to maintain 

        up-to-date knowledge of transportation laws and regulations. In the past
        year, we conducted 20 

        compliance audits to ensure adherence to regulatory requirements. 
         
        Training and Development 

        Continuous training and development are vital for our department's
        success. In the past year, our 

        drivers completed over 100 hours of professional development training,
        focusing on defensive 

        driving, customer service, and emergency preparedness. 
         
        Communication and Collaboration 

        Effective communication and collaboration are essential within the
        Transportation Department and
      - >-
        for familiarizing themselves with the latest version of the manual. 
         
        Conclusion 

        Thank you for reviewing the Transportation Department Policy Manual.
        Your commitment to safety, 

        customer service, and compliance plays a crucial role in our
        department's success. If you have any 

        questions or need further information, please reach out to your
        supervisor or the department 

        manager. Your dedication and professionalism are appreciated.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.9375
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9791666666666666
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9375
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.32638888888888884
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9375
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.9791666666666666
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.971848173216197
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9625
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9625
            name: Cosine Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
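
Read top to bottom, the module list says: run the transformer, keep the vector of the first ([CLS]) token as the sentence embedding (`pooling_mode_cls_token: True`), then L2-normalize it. The following is a minimal pure-Python sketch of those last two steps. It uses made-up 4-dimensional token vectors rather than real 1024-dimensional BertModel outputs, and `cls_pool` and `l2_normalize` are illustrative names, not library functions.

```python
import math

def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, as a Normalize() step would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Hypothetical transformer output: 3 tokens x 4 dimensions.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],  # first word piece
    [0.5, 2.0, 0.0, 1.0],  # second word piece
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings reduces to their dot product.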

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("deepali1021/finetuned_arctic_ft")
# Run inference
sentences = [
    'Who should you contact if you have questions or need further information regarding the Transportation Department Policy Manual?',
    "for familiarizing themselves with the latest version of the manual. \n \nConclusion \nThank you for reviewing the Transportation Department Policy Manual. Your commitment to safety, \ncustomer service, and compliance plays a crucial role in our department's success. If you have any \nquestions or need further information, please reach out to your supervisor or the department \nmanager. Your dedication and professionalism are appreciated.",
    'Transportation Department Policy Manual \n \nTable of Contents: \n \n• \nIntroduction \n• \nDepartment Overview \n• \nSafety and Vehicle Maintenance \n• \nDriver Responsibilities \n• \nRoute Planning and Optimization \n• \nCustomer Service \n• \nIncident Reporting and Investigation \n• \nCompliance with Regulations \n• \nTraining and Development \n• \nCommunication and Collaboration \n• \nFare Collection and Fee Structure \n• \nRoute Information and Rules \n• \nAmendments to the Policy Manual \n• \nConclusion \nIntroduction \nWelcome to the Transportation Department Policy Manual! This manual serves as a comprehensive \nguide to the policies, procedures, and expectations for employees working in the transportation',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
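
For retrieval, the similarity scores are typically used to rank passages against a query. The sketch below shows that ranking step with small hand-written vectors standing in for `model.encode(...)` output, so it runs without downloading the model. `cosine` and `rank_passages` are hypothetical helper names, and the vectors are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, most similar first."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Stand-ins for embeddings; real ones would have 1024 dimensions.
query_vec = [0.9, 0.1, 0.0]
passage_vecs = [
    [0.0, 1.0, 0.0],  # points away from the query
    [1.0, 0.2, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.5],
]

ranking = rank_passages(query_vec, passage_vecs)
for index, score in ranking:
    print(index, round(score, 4))
```

With real embeddings, the first row of `model.similarity(embeddings, embeddings)` plays the role of the scores computed here when the query is the first input sentence.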

Evaluation

Metrics

Information Retrieval

| Metric              | Value  |
|---------------------|--------|
| cosine_accuracy@1   | 0.9375 |
| cosine_accuracy@3   | 0.9792 |
| cosine_accuracy@5   | 1.0    |
| cosine_accuracy@10  | 1.0    |
| cosine_precision@1  | 0.9375 |
| cosine_precision@3  | 0.3264 |
| cosine_precision@5  | 0.2    |
| cosine_precision@10 | 0.1    |
| cosine_recall@1     | 0.9375 |
| cosine_recall@3     | 0.9792 |
| cosine_recall@5     | 1.0    |
| cosine_recall@10    | 1.0    |
| cosine_ndcg@10      | 0.9718 |
| cosine_mrr@10       | 0.9625 |
| cosine_map@100      | 0.9625 |
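
The metrics above are related in a simple way when each query has exactly one relevant passage: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k. The values listed fit that pattern (for example 0.9792 / 3 ≈ 0.3264). The sketch below computes the metrics from the 1-based rank of the relevant passage for each query. The rank list is hypothetical: the card does not publish per-query ranks or the number of evaluation queries, and this distribution was chosen only because it is consistent with the reported figures. `retrieval_metrics` is an illustrative helper, not the evaluator used during training.

```python
import math

def retrieval_metrics(ranks, k):
    """Metrics at cutoff k, given the 1-based rank of the single relevant
    passage for each query."""
    n = len(ranks)
    in_top_k = [r for r in ranks if r <= k]
    hits = len(in_top_k)
    return {
        "accuracy": hits / n,         # relevant passage appears in the top k
        "precision": hits / (n * k),  # at most one relevant passage among k
        "recall": hits / n,           # identical to accuracy in this setting
        "mrr": sum(1.0 / r for r in in_top_k) / n,
        "ndcg": sum(1.0 / math.log2(1 + r) for r in in_top_k) / n,
    }

# Hypothetical: 48 queries, 45 answered at rank 1, two at rank 2, one at rank 5.
ranks = [1] * 45 + [2] * 2 + [5]

for k in (1, 3, 5, 10):
    print(k, retrieval_metrics(ranks, k))
```

Under this assumed distribution the computed values agree with the table after rounding, which also shows why precision@5 and precision@10 are capped at 0.2 and 0.1 even when every query is answered correctly.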

Training Details

Training Dataset

Unnamed Dataset

  • Size: 20 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 20 samples:

    |         | sentence_0 | sentence_1 |
    |---------|------------|------------|
    | type    | string     | string     |
    | details | min: 12 tokens, mean: 16.3 tokens, max: 21 tokens | min: 34 tokens, mean: 95.1 tokens, max: 122 tokens |
  • Samples:
    sentence_0 sentence_1
    What topics are covered in the Transportation Department Policy Manual? Transportation Department Policy Manual

    Table of Contents:


    Introduction

    Department Overview

    Safety and Vehicle Maintenance

    Driver Responsibilities

    Route Planning and Optimization

    Customer Service

    Incident Reporting and Investigation

    Compliance with Regulations

    Training and Development

    Communication and Collaboration

    Fare Collection and Fee Structure

    Route Information and Rules

    Amendments to the Policy Manual

    Conclusion
    Introduction
    Welcome to the Transportation Department Policy Manual! This manual serves as a comprehensive
    guide to the policies, procedures, and expectations for employees working in the transportation
    What is the purpose of the Transportation Department Policy Manual? Transportation Department Policy Manual

    Table of Contents:


    Introduction

    Department Overview

    Safety and Vehicle Maintenance

    Driver Responsibilities

    Route Planning and Optimization

    Customer Service

    Incident Reporting and Investigation

    Compliance with Regulations

    Training and Development

    Communication and Collaboration

    Fare Collection and Fee Structure

    Route Information and Rules

    Amendments to the Policy Manual

    Conclusion
    Introduction
    Welcome to the Transportation Department Policy Manual! This manual serves as a comprehensive
    guide to the policies, procedures, and expectations for employees working in the transportation
    What is the primary focus of the Transportation Department as outlined in the manual? department. It provides guidelines to ensure safe, efficient, and customer-focused transportation
    services. Please read this manual carefully and consult with your supervisor or the department
    manager if you have any questions or need further clarification.

    Department Overview
    The Transportation Department plays a critical role in providing reliable transportation services to
    our customers. Our department consists of 50 drivers, 10 dispatchers, and 5 maintenance
    technicians. In the past year, we transported over 500,000 passengers across various routes, ensuring
    their safety and satisfaction.

    Safety and Vehicle Maintenance
    Safety is our top priority. All vehicles undergo regular inspections and maintenance to ensure they
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
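
To make the parameters above concrete, here is a rough pure-Python sketch of how the two losses combine: the inner loss treats every other positive in the batch as a negative, and the Matryoshka wrapper sums that loss over truncated prefixes of the embeddings. This is a simplified illustration, not the library implementation. The `scale=20.0` value and cosine scoring are assumptions about the inner loss's defaults, the 4-dimensional vectors are invented, and the dims `[4, 2]` stand in for the real `[768, 512, 256, 128, 64]`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch candidates: positives[i] is the target
    for anchors[i]; all other positives act as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        log_sum_exp = math.log(sum(math.exp(z) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss on the first d dimensions, for each d."""
    loss = 0.0
    for d, w in zip(dims, weights):
        loss += w * mnr_loss([a[:d] for a in anchors],
                             [p[:d] for p in positives])
    return loss

# Invented batch of two (question, passage) embedding pairs.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.8, 0.0, 0.4]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

Training on every prefix length is what allows embeddings to be truncated to one of the listed dimensions at inference time with limited loss in quality.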
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin
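
These settings determine the length of the run. With 20 training samples and a batch size of 10, each epoch is 2 optimizer steps, so 10 epochs give 20 steps, which matches the Step column in the Training Logs. The sketch below works through that arithmetic along with the linear learning-rate decay implied by `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and `warmup_steps: 0` from the full hyperparameter list. It assumes a single device and no gradient accumulation, and `linear_lr` is a hypothetical re-statement of the schedule, not the trainer's own code.

```python
import math

num_samples = 20      # training set size from this card
batch_size = 10       # per_device_train_batch_size
epochs = 10           # num_train_epochs
base_lr = 5e-05       # learning_rate
warmup_steps = 0      # warmup_steps

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs

def linear_lr(step):
    """Linear warmup (none here) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

print(steps_per_epoch, total_steps)
for step in (0, 10, 20):
    print(step, linear_lr(step))
```

With only 20 updates in total, the learning rate has already dropped to half its initial value by the end of epoch 5.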

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

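| Epoch | Step | cosine_ndcg@10 |
|-------|------|----------------|
| 1.0   | 2    | 0.8107         |
| 2.0   | 4    | 0.9292         |
| 3.0   | 6    | 0.9623         |
| 4.0   | 8    | 0.9712         |
| 5.0   | 10   | 0.9642         |
| 6.0   | 12   | 0.9642         |
| 7.0   | 14   | 0.9642         |
| 8.0   | 16   | 0.9642         |
| 9.0   | 18   | 0.9718         |
| 10.0  | 20   | 0.9718         |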

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}