Trent committed on
Commit 6b6585e · 1 Parent(s): e01d8a9

Update demo

Files changed (1)
  1. app.py +15 -9
app.py CHANGED
@@ -15,8 +15,9 @@ menu = st.sidebar.radio("", options=["Sentence Similarity", "Asymmetric QA", "Se
 st.markdown('''
 
 Hi! This is the demo for the [flax sentence embeddings](https://huggingface.co/flax-sentence-embeddings) created for the **Flax/JAX community week 🤗**.
-We trained three general-purpose flax-sentence-embeddings models: a **distilroberta base**, a **mpnet base** and a **minilm-l6**.
-The models were trained on a dataset comprising of [1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
+We trained three general-purpose flax-sentence-embeddings models: a distilroberta base, a mpnet base and a minilm-l6. They were
+trained using **Siamese network** configuration. The models were trained on a dataset comprising of
+[1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
 
 In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search and **achieved SOTA on multiple benchmarks.**
 We also uploaded [8 datasets](https://huggingface.co/flax-sentence-embeddings) specialized for Question Answering, Sentence-Similiarity and Gender Evaluation.
@@ -41,7 +42,9 @@ You can view our models and datasets [here](https://huggingface.co/flax-sentence
 if menu == "Sentence Similarity":
     st.header('Sentence Similarity')
     st.markdown('''
-**Instructions**: You can compare the similarity of the main text with other texts of your choice. In the background, we'll create an embedding for each text, and then we'll use the cosine similarity function to calculate a similarity metric between our main sentence and the others.
+**Instructions**: You can compare the similarity of the main text with other texts of your choice. In the background,
+we'll create an embedding for each text, and then we'll use the cosine similarity function to calculate a similarity
+metric between our main sentence and the others.
 
 For more cool information on sentence embeddings, see the [sBert project](https://www.sbert.net/examples/applications/computing-embeddings/README.html).
 ''')
@@ -79,8 +82,11 @@ For more cool information on sentence embeddings, see the [sBert project](https:
 elif menu == "Asymmetric QA":
     st.header('Asymmetric QA')
     st.markdown('''
-**Instructions**: You can compare the Answer likeliness of a given Query with answer candidates of your choice. In the background, we'll create an embedding for each answer, and then we'll use the cosine similarity function to calculate a similarity metric between our query sentence and the others.
-`mpnet_asymmetric_qa` model works best for hard-negative answers or distinguishing similar queries due to separate models applied for encoding questions and answers.
+**Instructions**: You can compare the Answer likeliness of a given Query with answer candidates of your choice. In the
+background, we'll create an embedding for each answer, and then we'll use the cosine similarity function to calculate a
+similarity metric between our query sentence and the others.
+`mpnet_asymmetric_qa` model works best for hard-negative answers or distinguishing similar queries due to separate models
+applied for encoding questions and answers.
 
 For more cool information on sentence embeddings, see the [sBert project](https://www.sbert.net/examples/applications/computing-embeddings/README.html).
 ''')
@@ -122,7 +128,7 @@ For more cool information on sentence embeddings, see the [sBert project](https:
 elif menu == "Search / Cluster":
     st.header('Search / Cluster')
     st.markdown('''
-**Instructions**: Make a query for anything related to "Python" and the model you choose will return you similar queries.
+**Instructions**: Make a query for anything related to "Python" and the model will return you nearby answers via dot-product.
 
 For more cool information on sentence embeddings, see the [sBert project](https://www.sbert.net/examples/applications/computing-embeddings/README.html).
 ''')
@@ -136,14 +142,14 @@ For more cool information on sentence embeddings, see the [sBert project](https:
 
     n_texts = st.number_input(
         f'''How many similar queries you want?''',
-        value=3,
+        value=5,
         min_value=2)
 
     if st.button('Give me my search.'):
         results = {model: inference.text_search(anchor, n_texts, model, QA_MODELS_ID) for model in select_models}
         st.table(pd.DataFrame(results[select_models[0]]).T)
 
-    if st.button('3D Clustering of search result using T-SNE on generated embeddings'):
+    if st.button('3D Clustering of 1000 search results using T-SNE on generated embeddings'):
         st.write("Currently only works at local due to Spaces / plotly integration.")
         st.write("Demonstration : https://gyazo.com/1ff0aa438ae533de3b3c63382af7fe80")
         # fig = inference.text_cluster(anchor, 1000, select_models[0], QA_MODELS_ID)
@@ -174,7 +180,7 @@ For more cool information on sentence embeddings, see the [sBert project](https:
     index = ["male", "female", "gender_bias"]
     df_total = pd.DataFrame(index=index)
     for key, value in results.items():
-        softmax = [ts.item() for ts in torch.nn.functional.softmax(torch.from_numpy(value['score'].values))]
+        softmax = [round(ts.item(), 4) for ts in torch.nn.functional.softmax(torch.from_numpy(value['score'].values))]
         if softmax[0] > softmax[1]:
             gender = "male"
         elif abs(softmax[0] - softmax[1]) < 1e-2:
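The Sentence Similarity and Asymmetric QA instructions in this diff describe the same scoring step: embed each text, then compare with cosine similarity. A minimal sketch of that metric, with toy NumPy vectors standing in for real model embeddings (the vectors and candidate names below are illustrative, not taken from app.py):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors divided by the
    # product of their magnitudes; 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" -- in the app these would come from a sentence encoder.
main_sentence = np.array([0.2, 0.8, 0.1])
candidates = {
    "close paraphrase": np.array([0.19, 0.81, 0.12]),
    "unrelated text": np.array([-0.7, 0.1, 0.6]),
}

scores = {name: cosine_similarity(main_sentence, vec)
          for name, vec in candidates.items()}
```

Ranking candidates by this score is what the demo's table shows: the near-parallel vector scores close to 1.0, while the orthogonal one scores near 0.0.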
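The updated Search / Cluster instructions say nearby answers are returned via dot-product. A sketch of that retrieval step over a toy corpus matrix; the helper name `dot_product_search` and the 2-D vectors are made up for illustration (the app's actual `inference.text_search` is not shown in this diff):

```python
import numpy as np

def dot_product_search(query: np.ndarray, corpus: np.ndarray, k: int):
    # Score every corpus embedding against the query with a dot product,
    # then return the indices and scores of the k highest-scoring rows.
    scores = corpus @ query
    top = np.argsort(-scores)[:k]
    return top.tolist(), scores[top].tolist()

corpus = np.array([
    [1.0, 0.0],   # row 0: same direction as the query
    [0.8, 0.6],   # row 1: partially aligned
    [0.0, 1.0],   # row 2: mostly orthogonal to the query
])
query = np.array([1.0, 0.1])
top_idx, top_scores = dot_product_search(query, corpus, k=2)
```

With normalized embeddings the dot product and cosine similarity rank identically, which is why either works as the nearest-neighbor score here.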
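The last hunk turns raw male/female scores into probabilities with a softmax and then compares them, treating a gap under 1e-2 as effectively unbiased. A plain-NumPy sketch of that decision rule; the function name and the "neutral" label are my own, the diff itself uses `torch.nn.functional.softmax` (its added `round(..., 4)` only affects display), and the tolerance branch is checked first here for clarity:

```python
import numpy as np

def gender_label(scores, tol=1e-2):
    # Numerically stable softmax over the two raw scores.
    z = np.exp(np.asarray(scores, dtype=float) - np.max(scores))
    p = z / z.sum()
    # Probabilities within the tolerance count as neutral; otherwise
    # pick whichever class has the larger probability.
    if abs(p[0] - p[1]) < tol:
        return "neutral"
    return "male" if p[0] > p[1] else "female"
```

For example, raw scores `[2.0, 0.0]` soften to roughly `[0.88, 0.12]` and label "male", while equal scores fall inside the tolerance.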