dptrsa committed · Commit 4b7d31c · verified · 1 Parent(s): 39c4bb0

Update README.md

Files changed (1): README.md +9 -26

README.md CHANGED

@@ -7,40 +7,17 @@ tags:
  license: apache-2.0
  ---
 
- # {MODEL_NAME}
+ # Sentence Transformer for Assurance & Risk Question-Answering (STAR-QA)
 
  Sentence Transformer for Assurance & Risk Question-Answering (STAR-QA) is a fine-tuned [sentence-transformers](https://www.SBERT.net) model based on ALL-MPNET-BASE-V2. It has been developed to produce **state-of-the-art embeddings for audit, risk-management, compliance and associated regulatory documents**. The model maps sentences to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search as part of retrieval-augmented generation pipelines.
 
- <!--- Describe your model here -->
-
- ## Usage (Sentence-Transformers)
-
- Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
-
- ```
- pip install -U sentence-transformers
- ```
-
- Then you can use the model like this:
-
- ```python
- from sentence_transformers import SentenceTransformer
- sentences = ["This is an example sentence", "Each sentence is converted"]
-
- model = SentenceTransformer('{MODEL_NAME}')
- embeddings = model.encode(sentences)
- print(embeddings)
- ```
-
  ## Evaluation Results
 
  The model was evaluated on a held-out sample from the STAR-QA dataset (see below) using `sentence_transformers.evaluation.InformationRetrievalEvaluator`. Reported metrics include precision and recall at 3 retrieved candidates (P/R@3), as well as MRR@10, MAP@10 and NDCG@100. This fine-tuned model was also benchmarked against its base model using the same methodology.
 
  ## Training Data
 
- The model was fine-tuned from a corpus of audit, risk-management, compliance and associated regulatory documents sourced from the public internet. Documents were cleaned and chunked into 2-sentence blocks. Each block was then sent to a state-of-the-art LLM with the following prompt:
-
- "Write a question about {document_topic} for which this is the answer: {block}"
+ The model was fine-tuned on a corpus of audit, risk-management, compliance and associated regulatory documents sourced from the public internet. Documents were cleaned and chunked into 2-sentence blocks. Each block was then sent to a state-of-the-art LLM with the following prompt: "Write a question about {document_topic} for which this is the answer: {block}"
 
  The resulting question and its associated ground-truth answer (collectively a "pair") constitute a single training example for the fine-tuning step.
 
@@ -90,4 +67,10 @@ SentenceTransformer(
 
  ## Citing & Authors
 
- @misc{Theron_2024, title={Sentence Transformer for Assurance & Risk Question-Answering (STAR-QA)}, url={https://huggingface.co/dptrsa/STAR-QA}, author={Theron, Daniel}, year={2024}, month={Feb} }
+ @misc{Theron_2024,
+   title={Sentence Transformer for Assurance & Risk Question-Answering (STAR-QA)},
+   url={https://huggingface.co/dptrsa/STAR-QA},
+   author={Theron, Daniel},
+   year={2024},
+   month={Feb}
+ }
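
The commit strips the boilerplate usage section, so for reference, here is a minimal usage sketch for the published model. The repo id `dptrsa/STAR-QA` is taken from the citation URL above; the query and passage strings are illustrative only.

```python
from sentence_transformers import SentenceTransformer, util

# Repo id taken from the citation URL above.
model = SentenceTransformer("dptrsa/STAR-QA")

# Illustrative query and candidate passages for a retrieval step.
query = "Which controls mitigate third-party vendor risk?"
passages = [
    "Vendor due diligence and periodic control attestations reduce third-party risk.",
    "The audit committee meets quarterly to review the internal audit plan.",
]

# Encode into the 768-dimensional embedding space described above.
query_emb = model.encode(query, convert_to_tensor=True)
passage_embs = model.encode(passages, convert_to_tensor=True)

# Rank passages by cosine similarity, as in the semantic-search /
# RAG use case the model card mentions.
scores = util.cos_sim(query_emb, passage_embs)[0]
best = int(scores.argmax())
print(passages[best], float(scores[best]))
```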
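
The evaluation setup described under "Evaluation Results" is not shown in the diff; the sketch below shows how `InformationRetrievalEvaluator` is typically wired up for the reported metrics. The toy ids, strings, and relevance judgments are stand-ins for the held-out STAR-QA sample, not the actual data.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("dptrsa/STAR-QA")

# Toy stand-ins for the held-out sample: qid -> query,
# cid -> candidate answer block, qid -> set of relevant cids.
queries = {"q1": "Which controls mitigate third-party vendor risk?"}
corpus = {
    "c1": "Vendor due diligence and periodic control attestations reduce third-party risk.",
    "c2": "The audit committee meets quarterly to review the internal audit plan.",
}
relevant_docs = {"q1": {"c1"}}

evaluator = InformationRetrievalEvaluator(
    queries,
    corpus,
    relevant_docs,
    precision_recall_at_k=[3],  # P/R @ 3 candidates
    mrr_at_k=[10],              # MRR @ 10
    map_at_k=[10],              # MAP @ 10
    ndcg_at_k=[100],            # NDCG @ 100
    name="star-qa-holdout",
)
metrics = evaluator(model)  # return format varies by library version
print(metrics)
```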
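
The question-generation step in "Training Data" can be pictured as follows. `generate_question` is a hypothetical stand-in for the state-of-the-art LLM call (the actual model and API are not named in the README), and `InputExample` is one common way to package such pairs for sentence-transformers fine-tuning; the README does not specify the loss used.

```python
from sentence_transformers import InputExample

# Prompt template quoted in the README.
PROMPT = "Write a question about {document_topic} for which this is the answer: {block}"

def generate_question(document_topic: str, block: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a canned question here."""
    prompt = PROMPT.format(document_topic=document_topic, block=block)
    # In the real pipeline, `prompt` would be sent to an LLM and its
    # reply returned; stubbed so the sketch runs standalone.
    return "Which controls mitigate third-party vendor risk?"

# Each cleaned 2-sentence block becomes one (question, answer) pair.
blocks = [
    ("third-party risk",
     "Vendor due diligence reduces third-party risk. "
     "Periodic control attestations keep assurance current."),
]
train_examples = [
    InputExample(texts=[generate_question(topic, block), block])
    for topic, block in blocks
]
print(train_examples[0].texts)
```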