metadata
language:
- en
- es
license: apache-2.0
tags:
- sentence-transformers
- cross-encoder
- generated_from_trainer
- dataset_size:578402
- loss:BinaryCrossEntropyLoss
base_model: EuroBERT/EuroBERT-210m
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: EuroBERT-210m trained on GooAQ
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: gooaq dev
      type: gooaq-dev
    metrics:
    - type: map
      value: 0.7097
      name: Map
    - type: mrr@10
      value: 0.7089
      name: Mrr@10
    - type: ndcg@10
      value: 0.7579
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoMSMARCO R100
      type: NanoMSMARCO_R100
    metrics:
    - type: map
      value: 0.463
      name: Map
    - type: mrr@10
      value: 0.4452
      name: Mrr@10
    - type: ndcg@10
      value: 0.5106
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNFCorpus R100
      type: NanoNFCorpus_R100
    metrics:
    - type: map
      value: 0.3363
      name: Map
    - type: mrr@10
      value: 0.5204
      name: Mrr@10
    - type: ndcg@10
      value: 0.3632
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNQ R100
      type: NanoNQ_R100
    metrics:
    - type: map
      value: 0.4738
      name: Map
    - type: mrr@10
      value: 0.4783
      name: Mrr@10
    - type: ndcg@10
      value: 0.5182
      name: Ndcg@10
  - task:
      type: cross-encoder-nano-beir
      name: Cross Encoder Nano BEIR
    dataset:
      name: NanoBEIR R100 mean
      type: NanoBEIR_R100_mean
    metrics:
    - type: map
      value: 0.4244
      name: Map
    - type: mrr@10
      value: 0.4813
      name: Mrr@10
    - type: ndcg@10
      value: 0.464
      name: Ndcg@10
datasets:
- sentence-transformers/gooaq
Fine-Tuned Model
fjmgAI/rerank1-210M-EuroBERT
Base Model
EuroBERT/EuroBERT-210m
Fine-Tuning Method
This is a Cross Encoder model fine-tuned from EuroBERT/EuroBERT-210m using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.
Dataset
Description
The training data is GooAQ (sentence-transformers/gooaq), a collection of question-answer pairs collected from Google.
Fine-Tuning Details
- The model was trained on 578,402 samples from the sentence-transformers/gooaq dataset using BinaryCrossEntropyLoss.
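
The metadata lists `loss:BinaryCrossEntropyLoss`. As a rough illustration of what such an objective computes for one labeled (question, answer) pair, the sketch below evaluates binary cross-entropy on a raw score (logit). The function name `bce_with_logits` is hypothetical, and the assumption that the training loss applies the sigmoid to raw logits in exactly this way is not stated in this card.

```python
import math

def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy for one raw score and a 0/1 label (illustrative only)."""
    # Numerically stable form of -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))]
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# A relevant pair (label 1) with a high score incurs a small loss,
# while the same score on an irrelevant pair (label 0) is penalized heavily.
loss_relevant = bce_with_logits(4.0, 1.0)
loss_irrelevant = bce_with_logits(4.0, 0.0)
```

Minimizing this quantity pushes scores up for matching pairs and down for non-matching ones, which is the behavior a reranker needs.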
Cross Encoder Reranking
- Dataset: gooaq-dev
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10, "always_rerank_positives": false }
Metric | Value |
---|---|
map | 0.7097 (+0.1786) |
mrr@10 | 0.7089 (+0.1850) |
ndcg@10 | 0.7579 (+0.1667) |
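
To make the reported metrics concrete, here is a simplified sketch of MRR@k and NDCG@k for a single query, given binary relevance labels listed in ranked order. The helper names are hypothetical, and the evaluator's own implementation may differ in details such as how queries without positives are handled.

```python
import math

def mrr_at_k(relevance: list[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(relevance: list[int], k: int = 10) -> float:
    """DCG of the given order divided by the DCG of the ideal order (binary gains)."""
    def dcg(labels: list[int]) -> float:
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels[:k], start=1))
    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal > 0 else 0.0

# Relevance of documents in the order the reranker returned them (1 = relevant).
ranked_labels = [0, 1, 0, 1, 0]
query_mrr = mrr_at_k(ranked_labels)
query_ndcg = ndcg_at_k(ranked_labels)
```

Scores like these are typically averaged over all queries in an evaluation set to obtain a single figure per metric.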
Cross Encoder Reranking
- Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10, "always_rerank_positives": true }
Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
---|---|---|---|
map | 0.4630 (-0.0266) | 0.3363 (+0.0753) | 0.4738 (+0.0542) |
mrr@10 | 0.4452 (-0.0323) | 0.5204 (+0.0206) | 0.4783 (+0.0516) |
ndcg@10 | 0.5106 (-0.0298) | 0.3632 (+0.0381) | 0.5182 (+0.0176) |
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_R100_mean
- Evaluated with CrossEncoderNanoBEIREvaluator with these parameters: { "dataset_names": [ "msmarco", "nfcorpus", "nq" ], "rerank_k": 100, "at_k": 10, "always_rerank_positives": true }
Metric | Value |
---|---|
map | 0.4244 (+0.0343) |
mrr@10 | 0.4813 (+0.0133) |
ndcg@10 | 0.4640 (+0.0086) |
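
The NanoBEIR_R100_mean figures can be cross-checked against the per-dataset table: averaging the three reported values for each metric reproduces the mean row to four decimal places. The sketch below only repeats that arithmetic with numbers copied from this card.

```python
# Per-dataset scores copied from the table in this card, in the order
# NanoMSMARCO_R100, NanoNFCorpus_R100, NanoNQ_R100.
per_dataset = {
    "map": [0.4630, 0.3363, 0.4738],
    "mrr@10": [0.4452, 0.5204, 0.4783],
    "ndcg@10": [0.5106, 0.3632, 0.5182],
}

# Arithmetic mean per metric, rounded to the four decimals used in the card.
means = {name: round(sum(values) / len(values), 4) for name, values in per_dataset.items()}
print(means)
```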
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("fjmgAI/rerank1-210M-EuroBERT", trust_remote_code=True)
# Get scores for pairs of texts
pairs = [
    ['what are the risks with taking statins?', "['Muscle pain and damage. One of the most common complaints of people taking statins is muscle pain. ... ', 'Liver damage. Occasionally, statin use could cause an increase in the level of enzymes that signal liver inflammation. ... ', 'Increased blood sugar or type 2 diabetes. ... ', 'Neurological side effects.']"],
    ['what are the risks with taking statins?', 'Doctors discovered that statins can help lower blood pressure, as well as lower cholesterol. Statins are often prescribed to people with high cholesterol. Too much cholesterol in your blood increases your risk of heart attacks and strokes.'],
    ['what are the risks with taking statins?', 'Lipitor and Crestor are both effective statins that lower levels of “bad” cholesterol and increase levels of “good” cholesterol. While Crestor is the more potent statin, both medications are effective and have slightly different side effects and drug interactions.'],
    ['what are the risks with taking statins?', "About simvastatin Simvastatin belongs to a group of medicines called statins. It's used to lower cholesterol if you've been diagnosed with high blood cholesterol. It's also taken to prevent heart disease, including heart attacks and strokes."],
    ['what are the risks with taking statins?', 'Zetia works to lower cholesterol in a new way different from the statins: it inhibits the absorption of cholesterol in the small intestine, whereas the statins work by blocking cholesterol production in the liver.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'what are the risks with taking statins?',
    [
        "['Muscle pain and damage. One of the most common complaints of people taking statins is muscle pain. ... ', 'Liver damage. Occasionally, statin use could cause an increase in the level of enzymes that signal liver inflammation. ... ', 'Increased blood sugar or type 2 diabetes. ... ', 'Neurological side effects.']",
        'Doctors discovered that statins can help lower blood pressure, as well as lower cholesterol. Statins are often prescribed to people with high cholesterol. Too much cholesterol in your blood increases your risk of heart attacks and strokes.',
        'Lipitor and Crestor are both effective statins that lower levels of “bad” cholesterol and increase levels of “good” cholesterol. While Crestor is the more potent statin, both medications are effective and have slightly different side effects and drug interactions.',
        "About simvastatin Simvastatin belongs to a group of medicines called statins. It's used to lower cholesterol if you've been diagnosed with high blood cholesterol. It's also taken to prevent heart disease, including heart attacks and strokes.",
        'Zetia works to lower cholesterol in a new way different from the statins: it inhibits the absorption of cholesterol in the small intestine, whereas the statins work by blocking cholesterol production in the liver.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
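
In a retrieval pipeline, the scores are typically used to reorder first-stage candidates and keep only the best few. The sketch below shows that step in isolation, with placeholder scores so that it runs without downloading the model. `rerank_with_scores` is a hypothetical helper, not part of the library; in practice the scores would come from `model.predict` on (query, passage) pairs.

```python
def rerank_with_scores(passages: list[str], scores: list[float], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k passages ordered by descending score."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]

# Placeholder candidates and scores (assumed values, not real model output).
candidates = ["passage A", "passage B", "passage C", "passage D"]
fake_scores = [0.12, 0.87, 0.45, 0.03]

top = rerank_with_scores(candidates, fake_scores, top_k=2)
# With a loaded model, the scores would instead be obtained with:
# scores = model.predict([(query, passage) for passage in candidates])
```

Only the relative order of the scores matters here, so the same helper works whether the scores are raw logits or probabilities.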
Framework Versions
- Python: 3.11.12
- Sentence Transformers: 4.0.2
- Transformers: 4.51.2
- PyTorch: 2.6.0+cu126
- Accelerate: 1.6.0
- Datasets: 3.5.0
- Tokenizers: 0.21.1
Purpose
This fine-tuned reranker is intended for Spanish and English applications. It reorders retrieved results by scoring each query-passage pair for relevance, which makes it suited to question-answering and document retrieval tasks.
- Developed by: fjmgAI
- License: apache-2.0