---
library_name: transformers
license: apache-2.0
datasets:
- Tevatron/msmarco-passage
---

# Qwen2.5-based Setwise reranker fine-tuned on the MS MARCO dataset

GitHub repo: https://github.com/ielab/llm-rankers/tree/main/Rank-R1

# Python code examples

Using the [llm-rankers](https://github.com/ielab/llm-rankers) library:

```Python
from llmrankers.setwise import RankR1SetwiseLlmRanker
from llmrankers.rankers import SearchResult

docs = [SearchResult(docid=i, text=f'this is passage {i}', score=None) for i in range(20)]
query = 'Give me passage 6'

ranker = RankR1SetwiseLlmRanker(
    model_name_or_path='Qwen/Qwen2.5-3B-Instruct',
    lora_name_or_path='ielabgroup/Setwise-SFT-3B-v0.1',
    prompt_file='prompt_setwise.toml',
    num_child=19,
    k=1,
    verbose=True
)

print(ranker.rerank(query, docs)[0])
```

`prompt_setwise.toml` is a TOML file with the following fields:

```toml
prompt_system = "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The assistant provides the user with the answer enclosed within <answer> </answer> tags, i.e., <answer> answer here </answer>."

prompt_user = '''Given the query: "{query}", which of the following documents is most relevant?
{docs}
Please provide only the label of the most relevant document to the query, enclosed in square brackets, within the answer tags. For example, if the third document is the most relevant, the answer should be: <answer>[3]</answer>.'''

pattern = '<answer>(.*?)</answer>'
```

Internally, the above code is equivalent to the following transformers code:

```Python
import re

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig


def get_model(peft_model_name):
    # Load the LoRA adapter on top of its base model and merge the weights
    config = PeftConfig.from_pretrained(peft_model_name)
    base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
    model = PeftModel.from_pretrained(base_model, peft_model_name)
    model = model.merge_and_unload()
    return model


# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('Qwen/Qwen2.5-3B-Instruct')
model = get_model('ielabgroup/Setwise-SFT-3B-v0.1').to('cuda:0').eval()

prompt_system = "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The assistant provides the user with the answer enclosed within <answer> </answer> tags, i.e., <answer> answer here </answer>."

prompt_user = '''Given the query: "{query}", which of the following documents is most relevant?
{docs}
Please provide only the label of the most relevant document to the query, enclosed in square brackets, within the answer tags. For example, if the third document is the most relevant, the answer should be: <answer>[3]</answer>.'''

query = 'Give me passage 6'
docs = [f'[{i+1}] this is passage {i+1}' for i in range(20)]
docs = '\n'.join(docs)

messages = [
    {'role': 'system', 'content': prompt_system},
    {'role': 'user', 'content': prompt_user.format(query=query, docs=docs)}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to('cuda:0')

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=2048,
    do_sample=False,
)
# Keep only the newly generated tokens
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'''
<answer>[6]</answer>
'''

# Extract the answer
pattern = '<answer>(.*?)</answer>'
answer = re.search(pattern, response, re.DOTALL).group(1)  # answer = '[6]'
```
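Rather than hard-coding the prompt strings as above, the same fields can be read from `prompt_setwise.toml` directly. The following is a minimal sketch, not part of the llm-rankers API: it assumes Python 3.11+ for the standard-library `tomllib` module (older versions can use the third-party `toml` package) and reuses `query`, `docs`, and `response` from the block above.

```Python
import re
import tomllib  # standard library since Python 3.11

# Read the same three fields that are hard-coded in the block above
with open('prompt_setwise.toml', 'rb') as f:
    prompt = tomllib.load(f)

messages = [
    {'role': 'system', 'content': prompt['prompt_system']},
    {'role': 'user', 'content': prompt['prompt_user'].format(query=query, docs=docs)}
]

# ... generate `response` as above, then extract the answer with the stored pattern
answer = re.search(prompt['pattern'], response, re.DOTALL).group(1)
```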
> Note that these Setwise rerankers are trained with the prompt format shown above, which includes 20 documents. Other numbers of documents should also work, but this would represent a "zero-shot" setting for the model; a sketch of such a configuration follows below.
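For instance, to rerank 10 candidates in this zero-shot setting, one might lower `num_child` so that all candidates still fit in a single setwise prompt. This is an untested sketch resting on the assumption, suggested by the 20-document example above (`num_child=19`), that `num_child + 1` documents are shown per prompt, i.e. `num_child = len(docs) - 1`:

```Python
from llmrankers.setwise import RankR1SetwiseLlmRanker
from llmrankers.rankers import SearchResult

# 10 candidate documents instead of the 20 used during training
docs = [SearchResult(docid=i, text=f'this is passage {i}', score=None) for i in range(10)]

ranker = RankR1SetwiseLlmRanker(
    model_name_or_path='Qwen/Qwen2.5-3B-Instruct',
    lora_name_or_path='ielabgroup/Setwise-SFT-3B-v0.1',
    prompt_file='prompt_setwise.toml',
    num_child=9,  # assumed: 9 comparison documents per call, so all 10 appear in one prompt
    k=1,
    verbose=True
)

print(ranker.rerank('Give me passage 6', docs)[0])
```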