---
license: apache-2.0
datasets:
- unicamp-dl/mmarco
language:
- vi
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- cross-encoder
- rerank
---

## Installation
- Install `sentence-transformers` (recommended):
    - `pip install sentence-transformers`
- Install `transformers` (optional):
    - `pip install transformers`
- Install `pyvi` for word segmentation:
    - `pip install pyvi`

## Usage with transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('itdainb/vietnamese-cross-encoder')
tokenizer = AutoTokenizer.from_pretrained('itdainb/vietnamese-cross-encoder')

# A cross-encoder scores (query, passage) pairs, so the tokenizer takes two
# parallel lists: the queries and the passages to score against them.
# 'Paragraph1' and 'Paragraph2' are placeholders for your own passages.
queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
passages = ['Paragraph1', 'Paragraph2']
features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    # One row of logits per (query, passage) pair
    scores = model(**features).logits
    print(scores)
```

## Usage with sentence-transformers

Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:

```
pip install -U sentence-transformers
```

Then you can use the model like this:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('itdainb/vietnamese-cross-encoder', max_length=256)

# Each tuple is one (query, passage) pair; predict returns one score per pair
scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])
```
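
The installation notes list `pyvi` for word segmentation, but the usage snippets do not show where it fits in. The following is a minimal sketch, assuming the model expects word-segmented Vietnamese input and that a higher score means a more relevant passage; neither is stated explicitly above. The helper names `build_pairs`, `rank_passages`, and `rerank` are hypothetical and not part of any library.

```python
from typing import Callable, List, Sequence, Tuple


def build_pairs(query: str, passages: Sequence[str],
                segment: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Word-segment the query and every passage, then pair them up."""
    seg_query = segment(query)
    return [(seg_query, segment(passage)) for passage in passages]


def rank_passages(passages: Sequence[str],
                  scores: Sequence[float]) -> List[Tuple[str, float]]:
    """Return (passage, score) tuples sorted by score, highest first.

    Assumption: a higher score means a more relevant passage.
    """
    scored = zip(passages, (float(s) for s in scores))
    return sorted(scored, key=lambda item: item[1], reverse=True)


def rerank(query: str, passages: Sequence[str],
           model_name: str = 'itdainb/vietnamese-cross-encoder') -> List[Tuple[str, float]]:
    """Segment with pyvi, score with the cross-encoder, and sort the passages."""
    # Imported lazily so the helpers above can be used without these packages.
    from pyvi import ViTokenizer
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name, max_length=256)
    # ViTokenizer.tokenize joins the syllables of a multi-syllable word with "_"
    pairs = build_pairs(query, passages, ViTokenizer.tokenize)
    scores = model.predict(pairs)
    # The original (unsegmented) passages are returned for readability
    return rank_passages(passages, scores)
```

Calling `rerank(query, passages)` with a Vietnamese query and a list of candidate passages returns the passages paired with their scores, best match first. Passing the segmenter as a parameter to `build_pairs` keeps the pairing logic independent of `pyvi`, so another word segmenter can be swapped in if needed.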