Rank-R1
Setwise rankers with reasoning ability
GitHub repo: https://github.com/ielab/llm-rankers/tree/main/Rank-R1
Using the llm-rankers library:
from llmrankers.setwise import RankR1SetwiseLlmRanker
from llmrankers.rankers import SearchResult
docs = [SearchResult(docid=i, text=f'this is passage {i}', score=None) for i in range(20)]
query = 'Give me passage 6'
ranker = RankR1SetwiseLlmRanker(
    model_name_or_path='Qwen/Qwen2.5-3B-Instruct',
    lora_name_or_path='ielabgroup/Setwise-SFT-3B-v0.1',
    prompt_file='prompt_setwise.toml',
    num_child=19,
    k=1,
    verbose=True
)
print(ranker.rerank(query, docs)[0])
The prompt_setwise.toml is a .toml file with the following fields:
prompt_system = "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The assistant provides the user with the answer enclosed within <answer> </answer> tags, i.e., <answer> answer here </answer>."
prompt_user = '''Given the query: "{query}", which of the following documents is most relevant?
{docs}
Please provide only the label of the most relevant document to the query, enclosed in square brackets, within the answer tags. For example, if the third document is the most relevant, the answer should be: <answer>[3]</answer>.'''
pattern = '<answer>(.*?)</answer>'
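For illustration, here is a minimal sketch of how a prompt file like this could be loaded and applied with Python's standard tomllib. The field names match the file above, but the loading code itself is only an assumption for demonstration, not the library's internal implementation:

import re
import tomllib  # standard library in Python 3.11+

# Load the prompt template fields from the TOML file
with open('prompt_setwise.toml', 'rb') as f:
    prompt_cfg = tomllib.load(f)

# Fill the {query} and {docs} placeholders in the user prompt
docs_text = '\n'.join(f'[{i+1}] this is passage {i+1}' for i in range(20))
user_prompt = prompt_cfg['prompt_user'].format(query='Give me passage 6', docs=docs_text)

# The pattern field is used to extract the model's answer from its output
match = re.search(prompt_cfg['pattern'], '<answer>[6]</answer>', re.DOTALL)
print(match.group(1))  # [6]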
Internally, the llm-rankers code above is equivalent to the following transformers code:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig
def get_model(peft_model_name):
    config = PeftConfig.from_pretrained(peft_model_name)
    base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
    model = PeftModel.from_pretrained(base_model, peft_model_name)
    model = model.merge_and_unload()
    return model
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('Qwen/Qwen2.5-3B-Instruct')
model = get_model('ielabgroup/Setwise-SFT-3B-v0.1').to('cuda:0').eval()
prompt_system = "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The assistant provides the user with the answer enclosed within <answer> </answer> tags, i.e., <answer> answer here </answer>."
prompt_user = '''Given the query: "{query}", which of the following documents is most relevant?
{docs}
Please provide only the label of the most relevant document to the query, enclosed in square brackets, within the answer tags. For example, if the third document is the most relevant, the answer should be: <answer>[3]</answer>.'''
query = 'Give me passage 6'
docs = [f'[{i+1}] this is passage {i+1}' for i in range(20)]
docs = '\n'.join(docs)
messages = [
    {'role': "system", 'content': prompt_system},
    {'role': "user", 'content': prompt_user.format(query=query, docs=docs)}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to('cuda:0')
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=2048,
    do_sample=False,
)
# Keep only the newly generated tokens, stripping the prompt tokens
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
'''
<answer>[6]</answer>
'''
# extract the answer
import re
pattern = '<answer>(.*?)</answer>'
answer = re.search(pattern, response, re.DOTALL).group(1) # answer = '[6]'
Note that these Setwise rerankers are trained with the prompt format shown above, which includes 20 documents. Other numbers of documents should also work, but this represents a "zero-shot" setting for the model.
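As an example, here is a sketch of reranking a shorter candidate list with the same ranker. The num_child=9 value is an assumed adjustment so that all 10 candidates are compared in a single prompt, mirroring the num_child=19 setting used for 20 documents above; it is not a documented requirement:

from llmrankers.setwise import RankR1SetwiseLlmRanker
from llmrankers.rankers import SearchResult

# A shorter candidate list than the 20-document training prompt ("zero-shot" for the model)
docs = [SearchResult(docid=i, text=f'this is passage {i}', score=None) for i in range(10)]
query = 'Give me passage 3'

ranker = RankR1SetwiseLlmRanker(
    model_name_or_path='Qwen/Qwen2.5-3B-Instruct',
    lora_name_or_path='ielabgroup/Setwise-SFT-3B-v0.1',
    prompt_file='prompt_setwise.toml',
    num_child=9,
    k=1,
    verbose=True
)

print(ranker.rerank(query, docs)[0])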