Trent committed · Commit 113ad6b · 1 parent: 6a49bc1
Contributions
app.py
CHANGED
@@ -18,10 +18,15 @@ Hi! This is the demo for the [flax sentence embeddings](https://huggingface.co/f
 We trained three general-purpose flax-sentence-embeddings models: a **distilroberta base**, a **mpnet base** and a **minilm-l6**.
 The models were trained on a dataset comprising of [1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
 
-In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search.
+In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search and achieved SOTA on multiple benchmarks.
 We also uploaded [8 datasets](https://huggingface.co/flax-sentence-embeddings) specialized for Question Answering, Sentence-Similiarity and Gender Evaluation.
 You can view our models and datasets [here](https://huggingface.co/flax-sentence-embeddings).
 
+| Model | [FullEvaluation](https://docs.google.com/spreadsheets/d/1vXJrIg38cEaKjOG5y4I4PQwAQFUmCkohbViJ9zj_Emg/edit#gid=1809754143) Average | 20Newsgroups Clustering | StackOverflow DupQuestions | Twitter SemEval2015 |
+|-----------|---------------------------------------|-------|-------|-------|
+| paraphrase-mpnet-base-v2 (previous SOTA) | 67.97 | 47.79 | 49.03 | 72.36 |
+| all_datasets_v3_roberta-large (400k steps) | **70.22** | 50.12 | 52.18 | 75.28 |
+
 ## Contributions
 
 - 20 performant Sentence Embedding models that can be utilized for Sentence Simliarity / Asymmetric QA / Search & Clustering.