PhoRanker / README.md
metadata
license: apache-2.0
datasets:
  - unicamp-dl/mmarco
language:
  - vi
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - cross-encoder
  - rerank

Installation

  • Install sentence-transformers (recommended):

    • pip install sentence-transformers
  • Install transformers (optional):

    • pip install transformers
  • Install pyvi for Vietnamese word segmentation:

    • pip install pyvi
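
The pyvi dependency above is listed for word segmentation, but the usage sections do not show where it fits. The following is a minimal sketch, assuming the model expects word-segmented Vietnamese input (which the pyvi requirement suggests but this README does not state outright). The helper names `pyvi_segment` and `segment_pairs` are hypothetical and chosen for illustration; `ViTokenizer.tokenize` is pyvi's segmentation function.

```python
from typing import Callable, List, Sequence, Tuple


def pyvi_segment(text: str) -> str:
    """Word-segment Vietnamese text with pyvi (joins multi-syllable words with '_')."""
    # Imported lazily so segment_pairs() can be used with any segmenter,
    # even when pyvi is not installed.
    from pyvi import ViTokenizer
    return ViTokenizer.tokenize(text)


def segment_pairs(
    query: str,
    passages: Sequence[str],
    segment: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Segment the query once, segment each passage, and build (query, passage) pairs."""
    segmented_query = segment(query)
    return [(segmented_query, segment(passage)) for passage in passages]
```

With pyvi installed, `segment_pairs(query, passages, pyvi_segment)` should give a list of pairs in the shape that `CrossEncoder.predict` accepts.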

Usage with transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('itdainb/vietnamese-cross-encoder')
tokenizer = AutoTokenizer.from_pretrained('itdainb/vietnamese-cross-encoder')

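# A cross-encoder scores (query, passage) pairs: queries[i] is paired with passages[i].
# The passages below are illustrative.
queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
passages = ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.']
features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")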

model.eval()
with torch.no_grad():
    scores = model(**features).logits
    print(scores)

Usage with sentence-transformers

Using this model becomes easy when you have sentence-transformers installed:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import CrossEncoder
model = CrossEncoder('itdainb/vietnamese-cross-encoder', max_length=256)
scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])
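
Since the model is tagged for reranking, the scores are typically used to order candidate passages for a query. The following is a minimal sketch of that step, assuming a higher score means a more relevant passage. The `rerank` helper and its `score_fn` parameter are hypothetical names introduced here, not part of sentence-transformers.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted by score, best first."""
    pairs = [(query, passage) for passage in passages]
    scores = score_fn(pairs)
    # Convert to plain floats so NumPy scalars and Python numbers behave the same.
    scored = zip(passages, (float(score) for score in scores))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With the `CrossEncoder` instance from the example above, `rerank('Query', ['Paragraph1', 'Paragraph2', 'Paragraph3'], model.predict)` would return the paragraphs with their scores in descending order.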