Trent committed on
Commit 6b6585e · 1 Parent(s): e01d8a9

Update demo

Files changed (1)
  1. app.py +15 -9
app.py CHANGED
@@ -15,8 +15,9 @@ menu = st.sidebar.radio("", options=["Sentence Similarity", "Asymmetric QA", "Se
 st.markdown('''
 
 Hi! This is the demo for the [flax sentence embeddings](https://huggingface.co/flax-sentence-embeddings) created for the **Flax/JAX community week 🤗**.
-We trained three general-purpose flax-sentence-embeddings models: a **distilroberta base**, a **mpnet base** and a **minilm-l6**.
-The models were trained on a dataset comprising of [1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
+We trained three general-purpose flax-sentence-embeddings models: a distilroberta base, a mpnet base and a minilm-l6. They were
+trained using **Siamese network** configuration. The models were trained on a dataset comprising of
+[1 Billion+ training corpus](https://huggingface.co/flax-sentence-embeddings/all_datasets_v4_MiniLM-L6#training-data) with the v3 setup.
 
 In addition, we trained [20 models](https://huggingface.co/flax-sentence-embeddings) focused on general-purpose, QuestionAnswering and Code search and **achieved SOTA on multiple benchmarks.**
 We also uploaded [8 datasets](https://huggingface.co/flax-sentence-embeddings) specialized for Question Answering, Sentence-Similiarity and Gender Evaluation.
@@ -41,7 +42,9 @@ You can view our models and datasets [here](https://huggingface.co/flax-sentence
 if menu == "Sentence Similarity":
     st.header('Sentence Similarity')
     st.markdown('''
-**Instructions**: You can compare the similarity of the main text with other texts of your choice. In the background, we'll create an embedding for each text, and then we'll use the cosine similarity function to calculate a similarity metric between our main sentence and the others.
+**Instructions**: You can compare the similarity of the main text with other texts of your choice. In the background,
+we'll create an embedding for each text, and then we'll use the cosine similarity function to calculate a similarity
+metric between our main sentence and the others.
 
 For more cool information on sentence embeddings, see the [sBert project](https://www.sbert.net/examples/applications/computing-embeddings/README.html).
 ''')
@@ -79,8 +82,11 @@ For more cool information on sentence embeddings, see the [sBert project](https:
 elif menu == "Asymmetric QA":
     st.header('Asymmetric QA')
     st.markdown('''
-**Instructions**: You can compare the Answer likeliness of a given Query with answer candidates of your choice. In the background, we'll create an embedding for each answer, and then we'll use the cosine similarity function to calculate a similarity metric between our query sentence and the others.
-`mpnet_asymmetric_qa` model works best for hard-negative answers or distinguishing similar queries due to separate models applied for encoding questions and answers.
+**Instructions**: You can compare the Answer likeliness of a given Query with answer candidates of your choice. In the
+background, we'll create an embedding for each answer, and then we'll use the cosine similarity function to calculate a
+similarity metric between our query sentence and the others.
+`mpnet_asymmetric_qa` model works best for hard-negative answers or distinguishing similar queries due to separate models
+applied for encoding questions and answers.
 
 For more cool information on sentence embeddings, see the [sBert project](https://www.sbert.net/examples/applications/computing-embeddings/README.html).
 ''')
@@ -122,7 +128,7 @@ For more cool information on sentence embeddings, see the [sBert project](https:
 elif menu == "Search / Cluster":
     st.header('Search / Cluster')
     st.markdown('''
-**Instructions**: Make a query for anything related to "Python" and the model you choose will return you similar queries.
+**Instructions**: Make a query for anything related to "Python" and the model will return you nearby answers via dot-product.
 
 For more cool information on sentence embeddings, see the [sBert project](https://www.sbert.net/examples/applications/computing-embeddings/README.html).
 ''')
@@ -136,14 +142,14 @@ For more cool information on sentence embeddings, see the [sBert project](https:
 
     n_texts = st.number_input(
         f'''How many similar queries you want?''',
-        value=3,
+        value=5,
         min_value=2)
 
     if st.button('Give me my search.'):
         results = {model: inference.text_search(anchor, n_texts, model, QA_MODELS_ID) for model in select_models}
         st.table(pd.DataFrame(results[select_models[0]]).T)
 
-    if st.button('3D Clustering of search result using T-SNE on generated embeddings'):
+    if st.button('3D Clustering of 1000 search results using T-SNE on generated embeddings'):
         st.write("Currently only works at local due to Spaces / plotly integration.")
         st.write("Demonstration : https://gyazo.com/1ff0aa438ae533de3b3c63382af7fe80")
         # fig = inference.text_cluster(anchor, 1000, select_models[0], QA_MODELS_ID)
@@ -174,7 +180,7 @@ For more cool information on sentence embeddings, see the [sBert project](https:
     index = ["male", "female", "gender_bias"]
     df_total = pd.DataFrame(index=index)
     for key, value in results.items():
-        softmax = [ts.item() for ts in torch.nn.functional.softmax(torch.from_numpy(value['score'].values))]
+        softmax = [round(ts.item(), 4) for ts in torch.nn.functional.softmax(torch.from_numpy(value['score'].values))]
         if softmax[0] > softmax[1]:
             gender = "male"
         elif abs(softmax[0] - softmax[1]) < 1e-2:
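The Sentence Similarity and Asymmetric QA instructions in this diff describe the same scoring step: embed each text, then compare with cosine similarity. A minimal sketch of that metric, with toy NumPy vectors standing in for real model embeddings (the vectors and candidate names below are illustrative, not taken from app.py):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors divided by the
    # product of their magnitudes; 1.0 means identical direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" -- in the app these would come from a sentence encoder.
main_sentence = np.array([0.2, 0.8, 0.1])
candidates = {
    "close paraphrase": np.array([0.19, 0.81, 0.12]),
    "unrelated text": np.array([-0.7, 0.1, 0.6]),
}

scores = {name: cosine_similarity(main_sentence, vec)
          for name, vec in candidates.items()}
```

Ranking candidates by this score is what the demo's table shows: the near-parallel vector scores close to 1.0, while the orthogonal one scores near 0.0.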
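The updated Search / Cluster instructions say nearby answers are returned via dot-product. A sketch of that retrieval step over a toy corpus matrix; the helper name `dot_product_search` and the 2-D vectors are made up for illustration (the app's actual `inference.text_search` is not shown in this diff):

```python
import numpy as np

def dot_product_search(query: np.ndarray, corpus: np.ndarray, k: int):
    # Score every corpus embedding against the query with a dot product,
    # then return the indices and scores of the k highest-scoring rows.
    scores = corpus @ query
    top = np.argsort(-scores)[:k]
    return top.tolist(), scores[top].tolist()

corpus = np.array([
    [1.0, 0.0],   # row 0: same direction as the query
    [0.8, 0.6],   # row 1: partially aligned
    [0.0, 1.0],   # row 2: mostly orthogonal to the query
])
query = np.array([1.0, 0.1])
top_idx, top_scores = dot_product_search(query, corpus, k=2)
```

With normalized embeddings the dot product and cosine similarity rank identically, which is why either works as the nearest-neighbor score here.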
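The last hunk turns raw male/female scores into probabilities with a softmax and then compares them, treating a gap under 1e-2 as effectively unbiased. A plain-NumPy sketch of that decision rule; the function name and the "neutral" label are my own, the diff itself uses `torch.nn.functional.softmax` (its added `round(..., 4)` only affects display), and the tolerance branch is checked first here for clarity:

```python
import numpy as np

def gender_label(scores, tol=1e-2):
    # Numerically stable softmax over the two raw scores.
    z = np.exp(np.asarray(scores, dtype=float) - np.max(scores))
    p = z / z.sum()
    # Probabilities within the tolerance count as neutral; otherwise
    # pick whichever class has the larger probability.
    if abs(p[0] - p[1]) < tol:
        return "neutral"
    return "male" if p[0] > p[1] else "female"
```

For example, raw scores `[2.0, 0.0]` soften to roughly `[0.88, 0.12]` and label "male", while equal scores fall inside the tolerance.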