---
license: apache-2.0
datasets:
- emrecan/all-nli-tr
language:
- tr
- en
metrics:
- spearmanr
- accuracy
- bertscore
base_model:
- nomic-ai/nomic-embed-text-v2-moe
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# Model Card: Turkish Triplet Embedding Model (Nomic MoE)

## Model Description

This is a sentence embedding model trained on a Turkish triplet corpus, the [emrecan/all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr) dataset. It is based on the Nomic Mixture of Experts (MoE) architecture and performs strongly compared with existing Turkish embedding models.
## Intended Use

- Semantic similarity tasks (see the example below)
- Text clustering
- Information retrieval
- Sentence- and document-level embedding generation
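
For the semantic similarity use case, a minimal sketch with the `sentence-transformers` API might look like the following. The model ID is a placeholder, and `trust_remote_code=True` is an assumption carried over from the Nomic base model:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder model ID; trust_remote_code may be needed for the Nomic MoE base.
model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)

sentences = [
    "Bugün hava çok güzel.",     # "The weather is very nice today."
    "Hava bugün oldukça iyi.",   # "The weather is quite good today."
    "Kediler süt içmeyi sever.", # "Cats like drinking milk."
]
embeddings = model.encode(sentences)

# Pairwise cosine similarities; the first two sentences should score highest.
print(util.cos_sim(embeddings, embeddings))
```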
## Training Details

- Architecture: Nomic Mixture of Experts (MoE), initialized from [nomic-ai/nomic-embed-text-v2-moe](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe)
- Dataset: [emrecan/all-nli-tr](https://huggingface.co/datasets/emrecan/all-nli-tr)
- Training objective: triplet loss for contrastive learning (a training sketch follows below)
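
A minimal, hypothetical training sketch with the `SentenceTransformerTrainer` API is shown below. It assumes the dataset exposes a `triplet` configuration with anchor/positive/negative columns (as in the English all-nli dataset) and is not the exact recipe used for this model:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import TripletLoss

# Base model; trust_remote_code is needed for the custom Nomic architecture.
model = SentenceTransformer("nomic-ai/nomic-embed-text-v2-moe", trust_remote_code=True)

# Assumption: a "triplet" config with (anchor, positive, negative) columns.
train_dataset = load_dataset("emrecan/all-nli-tr", "triplet", split="train")

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=TripletLoss(model),  # contrastive objective over triplets
)
trainer.train()
```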
## Evaluation & Performance

Compared to other Turkish embedding models, this model demonstrates strong performance in capturing semantic relationships in Turkish. Further evaluations and benchmarks will be shared as they become available.
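
Since `spearmanr` is listed among the metrics, one way to evaluate such a model is `EmbeddingSimilarityEvaluator`, which reports the Spearman correlation between predicted cosine similarities and gold scores. The sentence pairs below are made-up placeholders, not a published benchmark:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)

# Made-up STS-style pairs with gold similarity scores in [0, 1].
sentences1 = ["Bir adam gitar çalıyor.", "Bir kadın yemek pişiriyor."]  # "A man is playing guitar." / "A woman is cooking."
sentences2 = ["Bir adam müzik yapıyor.", "Bir adam parkta koşuyor."]    # "A man is making music." / "A man is running in the park."
gold_scores = [0.8, 0.1]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores)
print(evaluator(model))  # includes the Spearman correlation of cosine similarities
```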
## How to Use

You can use this model with Hugging Face's `transformers` or `sentence-transformers` library:

```python
from sentence_transformers import SentenceTransformer

# Placeholder model ID; trust_remote_code may be needed for the Nomic MoE base.
model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)
embeddings = model.encode(["Merhaba dünya!", "Bugün hava çok güzel."])  # "Hello, world!" / "The weather is very nice today."
print(embeddings)
```
## License & Citation

This model is released under the Apache 2.0 license (see the metadata above). Please refer to the repository for full licensing details and citation instructions.