Trent committed on
Commit
113ad6b
·
1 Parent(s): 6a49bc1

Contributions

Files changed (1)
  1. app.py +6 -1
app.py CHANGED
@@ -18,10 +18,15 @@ Hi! This is the demo for the [flax sentence embeddings](https://huggingface.co/f
  We trained three general-purpose flax-sentence-embeddings models: a **distilroberta base**, a **mpnet base** and a **minilm-l6**.
  The models were trained on a dataset comprising of [1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
 
- In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search.
+ In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search and achieved SOTA on multiple benchmarks.
  We also uploaded [8 datasets](https://huggingface.co/flax-sentence-embeddings) specialized for Question Answering, Sentence-Similiarity and Gender Evaluation.
  You can view our models and datasets [here](https://huggingface.co/flax-sentence-embeddings).
 
+ | Model | [FullEvaluation](https://docs.google.com/spreadsheets/d/1vXJrIg38cEaKjOG5y4I4PQwAQFUmCkohbViJ9zj_Emg/edit#gid=1809754143) Average | 20Newsgroups Clustering | StackOverflow DupQuestions | Twitter SemEval2015 |
+ |-----------|---------------------------------------|-------|-------|-------|
+ | paraphrase-mpnet-base-v2 (previous SOTA) | 67.97 | 47.79 | 49.03 | 72.36 |
+ | all_datasets_v3_roberta-large (400k steps) | **70.22** | 50.12 | 52.18 | 75.28 |
+
  ## Contributions
 
  - 20 performant Sentence Embedding models that can be utilized for Sentence Simliarity / Asymmetric QA / Search & Clustering.
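
Review note: the "achieved SOTA on multiple benchmarks" wording added in this commit rests on the four score columns of the new table. The per-benchmark margins can be checked with a minimal sketch like the one below. The scores are copied from the table; the names `COLUMNS`, `PREVIOUS`, `NEW`, and `score_gains` are hypothetical and not part of app.py.

```python
# Scores copied from the table added in this commit.
# All names below are illustrative and do not appear in app.py.
COLUMNS = [
    "FullEvaluation Average",
    "20Newsgroups Clustering",
    "StackOverflow DupQuestions",
    "Twitter SemEval2015",
]
PREVIOUS = [67.97, 47.79, 49.03, 72.36]  # paraphrase-mpnet-base-v2
NEW = [70.22, 50.12, 52.18, 75.28]       # all_datasets_v3_roberta-large (400k steps)


def score_gains(columns, baseline, candidate):
    """Return the per-column difference (candidate - baseline), rounded to 2 decimals."""
    return {
        name: round(new - old, 2)
        for name, old, new in zip(columns, baseline, candidate)
    }


gains = score_gains(COLUMNS, PREVIOUS, NEW)
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
```

Under these numbers the new model is ahead in every listed column. This only restates the table and says nothing about benchmarks that the table does not list.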