|
--- |
|
language: en |
|
license: apache-2.0 |
|
library_name: sentence-transformers |
|
tags: |
|
- sentence-transformers |
|
- feature-extraction |
|
- sentence-similarity |
|
- transformers |
|
pipeline_tag: sentence-similarity |
|
--- |
|
|
|
DistilBERT encoder model trained on the MovieLens-25M ratings dataset using the [DEXML](https://github.com/nilesh2797/DEXML) ([Dual Encoders for eXtreme Multi-Label classification, ICLR'24](https://arxiv.org/pdf/2310.10636v2.pdf)) method.
|
|
|
## Inference Usage (Sentence-Transformers) |
|
With `sentence-transformers` installed, you can use this model as follows:
|
```python |
|
from sentence_transformers import SentenceTransformer |
|
sentences = ["This is an example sentence", "Each sentence is converted"] |
|
model = SentenceTransformer('quicktensor/dexml_movielens-25m') |
|
embeddings = model.encode(sentences) |
|
print(embeddings) |
|
``` |
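Since the model is tagged for sentence similarity, the embeddings are typically compared by cosine similarity. The sketch below uses toy NumPy vectors in place of real `model.encode(...)` output (an assumption for illustration; it also assumes the embeddings are L2-normalized, matching the CLS-pooling-plus-normalization shown in the Transformers section below):

```python
import numpy as np

# Toy stand-ins for model.encode(...) output; real embeddings would be
# higher-dimensional. All rows here are unit-norm vectors.
query_emb = np.array([[0.6, 0.8],
                      [1.0, 0.0]])
label_emb = np.array([[0.6, 0.8],
                      [0.0, 1.0],
                      [1.0, 0.0]])

# For unit-norm vectors, cosine similarity reduces to a dot product.
scores = query_emb @ label_emb.T

# Best-matching label index for each query.
top = scores.argmax(axis=1)
print(top)
```

With the actual model, you would replace the toy arrays with `model.encode(queries)` and `model.encode(labels)`.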
|
|
|
## Usage (HuggingFace Transformers) |
|
With Hugging Face `transformers`, you only need to be careful about how you pool the transformer output to get the embedding. You can use this model as follows:
|
```python |
|
from transformers import AutoTokenizer, AutoModel |
|
import torch |
|
import torch.nn.functional as F |
|
|
|
pooler = lambda x: F.normalize(x[:, 0, :], dim=-1) # Choose CLS token and normalize |
|
|
|
sentences = ["This is an example sentence", "Each sentence is converted"] |
|
tokenizer = AutoTokenizer.from_pretrained('quicktensor/dexml_movielens-25m') |
|
model = AutoModel.from_pretrained('quicktensor/dexml_movielens-25m') |
|
|
|
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt') |
|
with torch.no_grad(): |
|
    embeddings = pooler(model(**encoded_input).last_hidden_state)
|
|
|
print(embeddings) |
|
``` |
|
|
|
## Cite |
|
If you find this model helpful, please cite our work as:
|
```bib |
|
@InProceedings{DEXML, |
|
author = "Gupta, N. and Khatri, D. and Rawat, A-S. and Bhojanapalli, S. and Jain, P. and Dhillon, I.", |
|
title = "Dual-encoders for Extreme Multi-label Classification", |
|
booktitle = "International Conference on Learning Representations", |
|
month = "May", |
|
year = "2024" |
|
} |
|
``` |