Datasets

sentence_transformers.datasets contains classes to organize your training input examples.

ParallelSentencesDataset

ParallelSentencesDataset is used for multilingual training. For details, see multilingual training.

.. autoclass:: sentence_transformers.datasets.ParallelSentencesDataset

SentenceLabelDataset

SentenceLabelDataset can be used if you have labeled sentences and want to train with triplet loss.

.. autoclass:: sentence_transformers.datasets.SentenceLabelDataset
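
Triplet-style losses need several sentences with the same label in a batch so that positives can be found. The following is a minimal sketch of that grouping idea in plain Python, not the library's implementation; the function name `label_grouped_order` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def label_grouped_order(examples, samples_per_label=2, seed=0):
    """Return (sentence, label) tuples ordered so each label appears in runs.

    Illustrative helper only; it is not part of sentence_transformers.
    """
    rng = random.Random(seed)

    # Collect the sentences that belong to each label.
    by_label = defaultdict(list)
    for sentence, label in examples:
        by_label[label].append(sentence)

    # A label with too few sentences cannot provide a positive, so skip it.
    groups = {l: s for l, s in by_label.items() if len(s) >= samples_per_label}

    labels = sorted(groups)
    rng.shuffle(labels)

    ordered = []
    for label in labels:
        picked = rng.sample(groups[label], samples_per_label)
        ordered.extend((sentence, label) for sentence in picked)
    return ordered


examples = [
    ("A cat sits on the mat.", "animal"),
    ("Dogs bark loudly.", "animal"),
    ("The stock market fell.", "finance"),
    ("Interest rates rose.", "finance"),
    ("It is raining.", "weather"),  # only one example, so it is skipped
]
ordered = label_grouped_order(examples)
```

With this ordering, any batch whose size is a multiple of `samples_per_label` contains at least one positive for every sentence in it.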

DenoisingAutoEncoderDataset

DenoisingAutoEncoderDataset is used for unsupervised training with the TSDAE method.

.. autoclass:: sentence_transformers.datasets.DenoisingAutoEncoderDataset
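
A denoising auto-encoder is trained on pairs of a corrupted sentence and its original. The sketch below shows one way to build such pairs by randomly deleting words. It is a plain-Python illustration under assumptions: the helper names `delete_words` and `make_pairs` are hypothetical, whitespace splitting stands in for real tokenization, and the deletion ratio of 0.6 is an assumed example value.

```python
import random


def delete_words(sentence, del_ratio=0.6, rng=None):
    """Drop each word with probability del_ratio, keeping at least one word."""
    rng = rng or random.Random()
    words = sentence.split()  # assumption: whitespace tokenization is enough here
    if not words:
        return sentence
    kept = [w for w in words if rng.random() > del_ratio]
    if not kept:
        # Never return an empty input; keep one randomly chosen word.
        kept = [rng.choice(words)]
    return " ".join(kept)


def make_pairs(sentences, del_ratio=0.6, seed=0):
    """Build (noisy_sentence, original_sentence) training pairs."""
    rng = random.Random(seed)
    return [(delete_words(s, del_ratio, rng), s) for s in sentences]


sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "Sentence embeddings map text to dense vectors.",
]
pairs = make_pairs(sentences)
```

Each pair gives the encoder the noisy text as input and the original sentence as the reconstruction target.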

NoDuplicatesDataLoader

NoDuplicatesDataLoader can be used together with MultipleNegativesRankingLoss to ensure that no batch contains duplicate entries.

.. autoclass:: sentence_transformers.datasets.NoDuplicatesDataLoader
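
MultipleNegativesRankingLoss treats the other positives in a batch as negatives, so a text that appears twice in one batch would be counted as its own negative. The following sketch illustrates duplicate-free batching in plain Python. It is a simplified illustration rather than the library's code: `no_duplicate_batches` is a hypothetical name, and unlike a real data loader it does not shuffle and it keeps a final incomplete batch.

```python
def no_duplicate_batches(pairs, batch_size):
    """Group (anchor, positive) pairs into batches in which no text occurs twice."""
    batches = []
    remaining = list(pairs)
    while remaining:
        batch, seen, leftover = [], set(), []
        for pair in remaining:
            texts = set(pair)
            # Accept the pair only if the batch has room and shares no text with it.
            if len(batch) < batch_size and not (texts & seen):
                batch.append(pair)
                seen |= texts
            else:
                leftover.append(pair)  # retry this pair in a later batch
        batches.append(batch)
        remaining = leftover
    return batches


pairs = [
    ("q1", "a1"),
    ("q1", "a2"),  # shares "q1" with the first pair
    ("q2", "a3"),
    ("q3", "a1"),  # shares "a1" with the first pair
    ("q4", "a4"),
]
batches = no_duplicate_batches(pairs, batch_size=3)
```

Pairs that collide with an already selected text are deferred to a later batch instead of being discarded.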