---
license: apache-2.0
datasets:
- emrecan/all-nli-tr
language:
- tr
- en
metrics:
- spearmanr
- accuracy
- bertscore
base_model:
- nomic-ai/nomic-embed-text-v2-moe
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# Model Card: Turkish Triplet Embedding Model (Nomic MoE)

## Model Description

This is a sentence embedding model trained on a Turkish triplet corpus, the [emrecan/all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr) dataset. It is based on the Nomic Mixture of Experts (MoE) architecture and performs strongly compared with existing Turkish embedding models.
## Intended Use

- Semantic similarity tasks (see the example below)
- Text clustering
- Information retrieval
- Sentence- and document-level embedding generation
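
For the semantic similarity use case, a minimal sketch with the `sentence-transformers` API might look like the following. The model ID is a placeholder, and `trust_remote_code=True` is an assumption carried over from the Nomic base model:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder model ID; trust_remote_code may be needed for the Nomic MoE base.
model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)

sentences = [
    "Bugün hava çok güzel.",     # "The weather is very nice today."
    "Hava bugün oldukça iyi.",   # "The weather is quite good today."
    "Kediler süt içmeyi sever.", # "Cats like drinking milk."
]
embeddings = model.encode(sentences)

# Pairwise cosine similarities; the first two sentences should score highest.
print(util.cos_sim(embeddings, embeddings))
```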
## Training Details

- Architecture: Nomic Mixture of Experts (MoE), initialized from [nomic-ai/nomic-embed-text-v2-moe](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe)
- Dataset: [emrecan/all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr)
- Training objective: triplet loss for contrastive learning (a training sketch follows below)
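
A minimal, hypothetical training sketch with the `SentenceTransformerTrainer` API is shown below. It assumes the dataset exposes a `triplet` configuration with anchor/positive/negative columns (as in the English all-nli dataset) and is not the exact recipe used for this model:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import TripletLoss

# Base model; trust_remote_code is needed for the custom Nomic architecture.
model = SentenceTransformer("nomic-ai/nomic-embed-text-v2-moe", trust_remote_code=True)

# Assumption: a "triplet" config with (anchor, positive, negative) columns.
train_dataset = load_dataset("emrecan/all-nli-tr", "triplet", split="train")

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=TripletLoss(model),  # contrastive objective over triplets
)
trainer.train()
```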
## Evaluation & Performance

Compared to other Turkish embedding models, this model demonstrates strong performance in capturing semantic relationships in Turkish. Further evaluations and benchmarks will be shared as they become available.
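
Since `spearmanr` is listed among the metrics, one way to evaluate such a model is `EmbeddingSimilarityEvaluator`, which reports the Spearman correlation between predicted cosine similarities and gold scores. The sentence pairs below are made-up placeholders, not a published benchmark:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)

# Made-up STS-style pairs with gold similarity scores in [0, 1].
sentences1 = ["Bir adam gitar çalıyor.", "Bir kadın yemek pişiriyor."]  # "A man is playing guitar." / "A woman is cooking."
sentences2 = ["Bir adam müzik yapıyor.", "Bir adam parkta koşuyor."]    # "A man is making music." / "A man is running in the park."
gold_scores = [0.8, 0.1]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores)
print(evaluator(model))  # includes the Spearman correlation of cosine similarities
```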
## How to Use

You can use this model with Hugging Face's `transformers` or `sentence-transformers` library:

```python
from sentence_transformers import SentenceTransformer

# Placeholder model ID; trust_remote_code may be needed for the Nomic MoE base.
model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)
embeddings = model.encode(["Merhaba dünya!", "Bugün hava çok güzel."])  # "Hello, world!" / "The weather is very nice today."
print(embeddings)
```
## License & Citation

This model is released under the Apache 2.0 license (see the metadata above). Please refer to the repository for full licensing details and citation instructions.