---
license: cc-by-4.0
language:
- en
base_model:
- naver/trecdl22-crossencoder-debertav3
---

# Model Card for Provence-reranker

<img src="https://cdn-uploads.huggingface.co/production/uploads/6273df31c3b822dad2d1eef2/7oc346UILM6rqDJLVzMTC.png" alt="image/png" width="800">

Provence is a lightweight **context pruning model** for retrieval-augmented generation, particularly **optimized for question answering**. Given a user question and a retrieved passage, Provence **removes sentences from the passage that are not relevant to the user question**. This **speeds up generation** and **reduces context noise**, in a plug-and-play manner, **for any LLM**. More details about the model can be found in the [paper]() and in the [blogpost]().
* *Developed by*: Naver Labs Europe
* *License*: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)
* *Model*: `provence-reranker-debertav3-v1` (Provence for Pruning and Reranking Of retrieVEd relevaNt ContExt)
* *Backbone model*: [DeBERTav3-reranker](https://huggingface.co/naver/trecdl22-crossencoder-debertav3) (trained from [DeBERTa-v3-large](https://huggingface.co/microsoft/deberta-v3-large))
* *Model size*: 430 million parameters
* *Context length*: 512 tokens
* *Other model variants*: *TODO*

## Usage

```python
from transformers import AutoModel

# trust_remote_code is required because the model ships custom code
provence = AutoModel.from_pretrained("naver/provence-reranker-debertav3-v1", trust_remote_code=True)

context = [["Shepherd’s pie. History. In early cookery books, the dish was a means of using leftover roasted meat of any kind, and the pie dish was lined on the sides and bottom with mashed potato, as well as having a mashed potato crust on top. Variations and similar dishes. Other potato-topped pies include: The modern ”Cumberland pie” is a version with either beef or lamb and a layer of breadcrumbs and cheese on top. In medieval times, and modern-day Cumbria, the pastry crust had a filling of meat with fruits and spices. In Quebec, a variation on the cottage pie is called ”Pâté chinois”. It is made with ground beef on the bottom layer, canned corn in the middle, and mashed potato on top. The ”shepherdess pie” is a vegetarian version made without meat, or a vegan version made without meat and dairy. In the Netherlands, a very similar dish called ”philosopher’s stew” () often adds ingredients like beans, apples, prunes, or apple sauce. In Brazil, a dish called in refers to the fact that a manioc puree hides a layer of sun-dried meat."]]
query = ['what goes on the bottom of shepherd’s pie']

pruned_context = provence.process(context, query)
print(pruned_context)
# Pruned context: [['Shepherd’s pie. In early cookery books, the dish was a means of using leftover roasted meat of any kind, and the pie dish was lined on the sides and bottom with mashed potato, as well as having a mashed potato crust on top.']]
```

Training code, as well as RAG experiments with Provence, can be found in the [BERGEN](https://github.com/naver/bergen) library.

## Model interface

Interface of the `process` function:
* `questions`: `List[str]`: a list of input questions
* `contexts`: `List[List[str]]`: a list of retrieved contexts, provided as a list for each question. `len(contexts)` must be equal to `len(questions)`
* `titles`: `Optional[Union[List[List[str]], str]]`, _default: `"first_sentence"`_: an optional list of titles for the retrieved contexts, with the same shape as `contexts`. If set to `"first_sentence"`, the first sentence of each context is assumed to be the title. If `None`, no titles are assumed. Titles are only used if `always_select_title=True`.
* `threshold` _(float, $\in [0, 1]$, default: 0.1)_: the threshold used for context pruning. We recommend 0.1 for more conservative pruning (little to no performance drop) and 0.5 for higher compression, but this value can be tuned further to meet specific use-case requirements.
* `always_select_title` _(bool, default: True)_: if True, the first sentence (title) will always be selected. This is important, e.g., for Wikipedia passages, to provide proper context for the following sentences.
* `batch_size` _(int, default: 32)_
* `reorder` _(bool, default: False)_: if True, the provided contexts for each question will be reordered according to the computed question-passage relevance scores. If False, the original user-provided order of contexts will be preserved.
* `top_k` _(int, default: 5)_: if `reorder=True`, specifies the number of top-ranked passages to keep for each question.
* `enable_warnings` _(bool, default: True)_: whether to print warnings about model usage, e.g., for overly long contexts or questions.

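The list conventions above can be illustrated without downloading the model. The helpers below (`validate_inputs`, `split_title`) are hypothetical and **not** part of the Provence API; they only sketch the expected input shapes and the `titles="first_sentence"` convention:

```python
# Hypothetical helpers (NOT part of the Provence API) illustrating
# the input conventions of `process`.

def validate_inputs(questions, contexts):
    # One list of passages per question: len(contexts) == len(questions).
    assert isinstance(questions, list) and isinstance(contexts, list)
    assert len(contexts) == len(questions)
    assert all(isinstance(passages, list) for passages in contexts)

def split_title(passage):
    # Mimics titles="first_sentence": the first sentence acts as the title.
    title, _, body = passage.partition(". ")
    return title, body

questions = ["what goes on the bottom of shepherd's pie"]
contexts = [["Shepherd's pie. In early cookery books, the dish used leftover roasted meat."]]
validate_inputs(questions, contexts)
title, body = split_title(contexts[0][0])
print(title)  # Shepherd's pie
```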
## Model features

* **Provence encodes all sentences in the passage together**: this enables capturing coreferences between sentences and provides more accurate context pruning.
* **Provence automatically detects the number of sentences to keep**, based on a threshold. We found that the default threshold works well across various domains, but it can be adjusted further to better meet particular use-case needs.
* **Provence is robust across various domains**, being trained on a combination of diverse MS MARCO and NQ data.
* **Provence works out of the box with any LLM**.
* **Provence is fast**: we release a standalone DeBERTa-based model [here]() and a unified reranking+context-pruning model, which incorporates context pruning into reranking, an already existing stage of modern RAG pipelines. The latter makes context pruning essentially zero-cost in the RAG pipeline!

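The threshold-based selection can be sketched in isolation. This is a toy illustration of the idea described above (keep sentences whose relevance score clears the threshold, optionally always keeping the title), with made-up scores; it is **not** the model's actual implementation:

```python
# Toy sketch of threshold-based sentence pruning (NOT the actual Provence code).
# Given per-sentence relevance scores, keep sentences scoring above the
# threshold; always keep the first sentence (title) if requested.

def prune(sentences, scores, threshold=0.1, always_select_title=True):
    kept = [s for s, sc in zip(sentences, scores) if sc >= threshold]
    if always_select_title and sentences and sentences[0] not in kept:
        kept.insert(0, sentences[0])
    return kept

sentences = ["Shepherd's pie.", "The dish uses mashed potato.", "Unrelated trivia."]
scores = [0.05, 0.9, 0.02]  # made-up relevance scores for illustration

# A higher threshold prunes more aggressively; here both keep the same sentences.
print(prune(sentences, scores, threshold=0.1))
print(prune(sentences, scores, threshold=0.5))
```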
More details are available in the [blogpost]().

## Model Details

* Input: user question (e.g., a sentence) + retrieved context passage (e.g., a paragraph)
* Output: pruned context passage, i.e., with irrelevant sentences removed, + a relevance score (which can be used for reranking)
* Model architecture: the model was initialized from [DeBERTav3-reranker](https://huggingface.co/naver/trecdl22-crossencoder-debertav3) and finetuned with two objectives: (1) output a binary mask that can be used to prune irrelevant sentences; and (2) preserve the initial reranking capabilities.
* Training data: MS MARCO (document) + NQ training sets, with synthetic silver labeling of which sentences to keep, produced using [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B).
* Languages covered: English
* Context length: 512 tokens (same as the pretrained DeBERTa model)
* Evaluation: we evaluate Provence on 7 datasets from various domains: Wikipedia, biomedical data, course syllabi, and news. We find that Provence is able to prune irrelevant sentences with little to no drop in performance, in all domains, and outperforms existing baselines on the Pareto front (top right corners of the plots).

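Since each passage also receives a relevance score, the reranking side of the model reduces to sorting by score. A minimal sketch of what the `reorder`/`top_k` behavior amounts to, with made-up scores (this is **not** the model's actual code):

```python
# Toy sketch of score-based reranking (NOT the actual Provence code):
# sort passages by relevance score, descending, and keep the top_k best.

def rerank(passages, scores, top_k=5):
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in order[:top_k]]

passages = ["passage A", "passage B", "passage C"]
scores = [0.2, 0.9, 0.5]  # made-up relevance scores

print(rerank(passages, scores, top_k=2))  # ['passage B', 'passage C']
```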
Check out more analysis in the [paper]()!

<img src="https://cdn-uploads.huggingface.co/production/uploads/6273df31c3b822dad2d1eef2/WMmfsNG48O830paaBAaQF.png" width="600">

## License

This work is licensed under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).

## Cite

```bibtex
@misc{chirkova2024provence,
  title={Provence: efficient and robust context pruning for retrieval-augmented generation},
  author={Nadezhda Chirkova and Thibault Formal and Vassilina Nikoulina and Stéphane Clinchant},
  year={2024},
  eprint={?},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  copyright={Creative Commons Attribution Non Commercial Share Alike 4.0 International}
}
```

## Acknowledgements

Model trained at [Naver Labs Europe](https://europe.naverlabs.com/).

Team:
* [Nadia Chirkova](https://nadiinchi.github.io/)
* [Thibault Formal](https://europe.naverlabs.com/people_user_naverlabs/thibault-formal/)
* [Vassilina Nikoulina](https://europe.naverlabs.com/people_user_naverlabs/vassilina-nikoulina/)
* [Stéphane Clinchant](https://europe.naverlabs.com/people_user_naverlabs/st%C3%A9phane-clinchant/)