---
license: apache-2.0
datasets:
- unicamp-dl/mmarco
language:
- vi
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- cross-encoder
- rerank
---

## Installation

- Install `sentence-transformers` (recommended):

    - `pip install sentence-transformers`

- Install `transformers` (optional):

    - `pip install transformers`

- Install `pyvi` for word segmentation:

    - `pip install pyvi`

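Because `pyvi` is listed for word segmentation, inputs are presumably expected to be word-segmented before they are scored. The following is a minimal sketch under that assumption, using `ViTokenizer.tokenize` from `pyvi`. The helper names `segment` and `build_pairs` are hypothetical and are not part of any library.

```python
def segment(text):
    """Word-segment Vietnamese text with pyvi (assumed to be the intended segmenter)."""
    # Imported lazily so build_pairs can also be used with a different
    # segmenter when pyvi is not installed.
    from pyvi import ViTokenizer
    return ViTokenizer.tokenize(text)


def build_pairs(query, passages, segmenter=segment):
    """Return (query, passage) pairs with both sides word-segmented."""
    segmented_query = segmenter(query)
    return [(segmented_query, segmenter(passage)) for passage in passages]
```

The resulting list of pairs has the shape that `CrossEncoder.predict` accepts, so it can be passed to the model directly.
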
## Usage with transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('itdainb/vietnamese-cross-encoder')
tokenizer = AutoTokenizer.from_pretrained('itdainb/vietnamese-cross-encoder')

# Pass queries and passages as two parallel lists so that each
# (query, passage) pair is encoded as a single input sequence.
queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
passages = ['Paragraph1', 'Paragraph2']  # placeholder passages
features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    # One row of logits per (query, passage) pair.
    scores = model(**features).logits
    print(scores)
```

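The snippet above prints raw logits. If the model emits a single logit per pair (an assumption, not confirmed here), a sigmoid maps each logit to a value in (0, 1) that is easier to compare across pairs. A minimal sketch in plain Python; `sigmoid` and `logits_to_scores` are hypothetical helper names:

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logits_to_scores(logits):
    """Map a flat list of raw logits (one per pair) to values in (0, 1)."""
    return [sigmoid(value) for value in logits]
```

With the tensor from the snippet above, and still assuming one logit per pair, this could be called as `logits_to_scores(scores.squeeze(-1).tolist())`.
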
## Usage with sentence-transformers

Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:

```
pip install -U sentence-transformers
```

Then you can use the model like this:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('itdainb/vietnamese-cross-encoder', max_length=256)

# Each tuple is a (query, paragraph) pair; predict returns one score per pair.
scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])
print(scores)
```
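
Since the model is tagged for reranking, the scores are typically used to reorder candidate paragraphs for a query. A minimal sketch, assuming higher scores mean higher relevance; `rerank` is a hypothetical helper, and its `predict` argument stands for any callable that returns one score per pair, such as `model.predict`:

```python
def rerank(query, passages, predict):
    """Return (passage, score) tuples sorted by descending score."""
    # Score every (query, passage) pair in a single call.
    scores = predict([(query, passage) for passage in passages])
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    # float() converts NumPy scalars to plain Python floats.
    return [(passage, float(score)) for passage, score in ranked]
```

For example, `rerank('Query', ['Paragraph1', 'Paragraph2', 'Paragraph3'], model.predict)` would return the three paragraphs ordered from most to least relevant under that assumption.
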