---
license: apache-2.0
datasets:
- emrecan/all-nli-tr
language:
- tr
- en
metrics:
- spearmanr
- accuracy
- bertscore
base_model:
- nomic-ai/nomic-embed-text-v2-moe
pipeline_tag: zero-shot-classification
library_name: sentence-transformers
---

# Model Card: Turkish Triplet Embedding Model (Nomic MoE)

## Model Description

This is a sentence embedding model trained on a Turkish triplet corpus, the [`emrecan/all-nli-tr`](https://huggingface.co/datasets/emrecan/all-nli-tr) dataset. It is based on the **Nomic Mixture of Experts (MoE)** embedding model [`nomic-ai/nomic-embed-text-v2-moe`](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe) and performs strongly compared to other existing Turkish embedding models.

### **Intended Use**

- Semantic similarity tasks
- Text clustering
- Information retrieval
- Sentence- and document-level embedding generation

### **Training Details**

- **Architecture:** Nomic Mixture of Experts (MoE)
- **Dataset:** `emrecan/all-nli-tr`
- **Training Objective:** Triplet loss for contrastive learning

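The triplet objective pulls an anchor sentence's embedding toward a semantically matching positive and pushes it away from a negative by at least a margin. A minimal sketch of the loss, where the margin value and the use of Euclidean distance are illustrative assumptions rather than this model's exact training configuration:

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Triplet margin loss on embedding vectors.

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`. The margin and the Euclidean distance
    are illustrative choices, not this model's exact settings.
    """
    d_pos = math.dist(anchor, positive)  # anchor-positive distance
    d_neg = math.dist(anchor, negative)  # anchor-negative distance
    return max(d_pos - d_neg + margin, 0.0)

# The positive sits near the anchor and the negative far away,
# so the margin is already satisfied and the loss is zero.
print(triplet_loss([1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]))  # → 0.0
```

During training, minimizing this quantity over many (anchor, positive, negative) rows of `all-nli-tr` shapes the embedding space so that entailed sentences cluster together.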
### **Evaluation & Performance**

Compared to other Turkish embedding models, this model demonstrates strong performance in capturing semantic relationships in Turkish. Further evaluations and benchmarks will be shared as they become available.

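The `spearmanr` metric listed in the metadata scores rank correlation between model similarity scores and human-annotated gold scores on STS-style pairs. A self-contained sketch of the computation, assuming no tied scores (in practice `scipy.stats.spearmanr`, which handles ties, is the usual choice):

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation for distinct values (no tie handling)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    # With distinct ranks 0..n-1: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Model similarity scores vs. gold annotations: same ordering → rho = 1.0.
print(spearman_rho([0.9, 0.2, 0.5], [5.0, 1.0, 3.0]))  # → 1.0
```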
### **How to Use**

You can use this model with Hugging Face's `transformers` or `sentence-transformers` libraries:

```python
from sentence_transformers import SentenceTransformer

# Replace with this model's ID on the Hugging Face Hub.
# trust_remote_code is typically required because the Nomic MoE
# base model ships custom modeling code.
model = SentenceTransformer("your-huggingface-model-name", trust_remote_code=True)

# "Hello world!", "The weather is very nice today."
embeddings = model.encode(["Merhaba dünya!", "Bugün hava çok güzel."])
print(embeddings)
```

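The embeddings returned above are plain vectors; semantic similarity between two sentences is then typically scored with cosine similarity (recent `sentence-transformers` versions also expose a `model.similarity(...)` helper). A minimal, dependency-free sketch with an illustrative helper name:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Vectors pointing the same way score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```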
### **License & Citation**

This model is released under the Apache 2.0 license (see the metadata above). Please refer to the repository for full licensing details and citation instructions.