LB Reranker v1.0
This is a reversed version of the original LB Reranker, [lightblue/lb-reranker-0.5B-v1.0](https://huggingface.co/lightblue/lb-reranker-0.5B-v1.0). With this version, you input the text first and then the query, which allows the text, rather than the query, to be cached.
The LB Reranker has been trained to determine the relatedness of a given query to a piece of text, therefore allowing it to be used as a ranker or reranker in various retrieval-based tasks.
This model is fine-tuned from a Qwen/Qwen2.5-0.5B-Instruct model checkpoint and was trained for roughly 5.5 hours on an 8 x L20 instance (ecs.gn8is-8x.32xlarge) on Alibaba Cloud.
The training data for this model can be found at lightblue/reranker_continuous_filt_max7_train, and the code for generating this data and training the model can be found on our GitHub repo.
Trained on data in over 95 languages, this model is applicable to a broad range of use cases.
This model has three main benefits over comparable rerankers.
- It has shown slightly higher performance on evaluation benchmarks.
- It has been trained on more languages than any previous model.
- It is a simple causal language model trained to output a string between "1" and "7".
This last point means that this model can be used natively with many widely available inference packages, including vLLM and LMDeploy. This in turn allows our reranker to benefit from improvements to inference as and when these packages release them.
Update: We have also found that this model works pretty well as a code snippet reranker too (P@1 of 96%)! See our Colab for more details.
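For readers unfamiliar with the metric, P@1 (precision at 1) is the fraction of queries for which the top-ranked candidate is a relevant one. The sketch below is only an illustration of that definition: the `precision_at_1` helper, the scores, and the relevance labels are hypothetical and are not taken from the evaluation mentioned above.

```python
def precision_at_1(scores_per_query, relevant_per_query):
    """Fraction of queries whose highest-scored candidate is relevant.

    scores_per_query: list of lists of reranker scores, one list per query.
    relevant_per_query: list of sets holding the indices of relevant candidates.
    """
    hits = 0
    for scores, relevant in zip(scores_per_query, relevant_per_query):
        # Index of the candidate with the highest score for this query.
        top_idx = max(range(len(scores)), key=lambda i: scores[i])
        hits += top_idx in relevant
    return hits / len(scores_per_query)


# Hypothetical scores for 2 queries with 3 candidate snippets each.
example_scores = [[6.7, 1.9, 1.0], [2.1, 3.4, 5.9]]
example_relevant = [{0}, {1}]
p_at_1 = precision_at_1(example_scores, example_relevant)
print(p_at_1)  # 0.5
```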
How to use
The model was trained to expect an input such as:
<<<Context>>>
{your_context_here}
<<<Query>>>
{your_query_here}
It was trained to output a single number from 1 to 7 as a string.
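As a minimal sketch of the template, the helper below builds the user message the same way the scripts in this README do, with a blank line between the context and the query marker. It then shows that prompts for the same text share a common prefix covering the whole context, which is what makes caching the text possible. The example text and queries are shortened illustrations, and whether an inference engine actually reuses the shared prefix depends on its prefix-caching support, which is an assumption here.

```python
import os


def make_reranker_input(text, query):
    # Same layout as the scripts in this README: context first, then query.
    return f"<<<Context>>>\n{text}\n\n<<<Query>>>\n{query}"


text = "An apple is a round, edible fruit produced by an apple tree."
queries = [
    "What is the scientific name of apples?",
    "What is the square root of 999?",
]
prompts = [make_reranker_input(text, q) for q in queries]

# The longest common prefix covers at least the whole context and the query marker.
shared_prefix = os.path.commonprefix(prompts)
print(repr(shared_prefix))
```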
In order to make a continuous score that can be used for reranking query-context pairs (i.e. a method with few ties), we calculate the expectation value of the scores.
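Concretely, if p_k is the probability the model assigns to the token "k", the score is 1*p_1 + 2*p_2 + ... + 7*p_7. The sketch below illustrates this with hypothetical log-probabilities (made up for the example, not model output). Like the scripts in this README, it treats a label that is missing from the returned log-probabilities as having probability 0 and does not renormalise.

```python
import math


def expected_score(logprobs_by_label):
    """Expectation of the 1-7 label under the given token log-probabilities.

    logprobs_by_label maps a label string ("1".."7") to its log-probability;
    labels missing from the mapping contribute probability 0.
    """
    return sum(
        k * math.exp(logprobs_by_label[str(k)])
        for k in range(1, 8)
        if str(k) in logprobs_by_label
    )


# Hypothetical log-probabilities, chosen so the probabilities are 0.25, 0.5, 0.25.
example_logprobs = {"5": math.log(0.25), "6": math.log(0.5), "7": math.log(0.25)}
score = expected_score(example_logprobs)
print(round(score, 6))  # 6.0
```

Because the score is a weighted average rather than the single most likely label, two pairs that would both be labelled "6" can still be ordered by how much probability leans towards "5" or "7".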
We include scripts to do this with vLLM, LMDeploy, and the OpenAI client (hosted for free on Hugging Face):
- vLLM
Install vLLM using:
pip install vllm
vLLM code:
from vllm import LLM, SamplingParams
import numpy as np

def make_reranker_input(t, q):
    return f"<<<Context>>>\n{t}\n\n<<<Query>>>\n{q}"

def make_reranker_inference_conversation(context, question):
    system_message = "Given a piece of text and a query, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": make_reranker_input(context, question)},
    ]

def get_prob(logprob_dict, tok_id):
    return np.exp(logprob_dict[tok_id].logprob) if tok_id in logprob_dict.keys() else 0

llm = LLM("lightblue/lb-reranker-0.5B-v1.0-rev")
sampling_params = SamplingParams(temperature=0.0, logprobs=14, max_tokens=1)
tok = llm.llm_engine.tokenizer.tokenizer
idx_tokens = [tok.encode(str(i))[0] for i in range(1, 8)]

query_texts = [
    ("What is the scientific name of apples?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
    ("What is the Chinese word for 'apple'?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
    ("What is the square root of 999?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
]

chats = [make_reranker_inference_conversation(c, q) for q, c in query_texts]
responses = llm.chat(chats, sampling_params)
probs = np.array([[get_prob(r.outputs[0].logprobs[0], y) for y in idx_tokens] for r in responses])

N = probs.shape[1]
M = probs.shape[0]
idxs = np.tile(np.arange(1, N + 1), M).reshape(M, N)

expected_vals = (probs * idxs).sum(axis=1)
print(expected_vals)
# [6.66570732 1.86686378 1.01102923]
- LMDeploy
Install LMDeploy using:
pip install lmdeploy
LMDeploy code:
# Un-comment this if running in a Jupyter notebook, Colab etc.
# import nest_asyncio
# nest_asyncio.apply()

from lmdeploy import GenerationConfig, ChatTemplateConfig, pipeline
import numpy as np

def make_reranker_input(t, q):
    return f"<<<Context>>>\n{t}\n\n<<<Query>>>\n{q}"

def make_reranker_inference_conversation(context, question):
    system_message = "Given a piece of text and a query, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": make_reranker_input(context, question)},
    ]

def get_prob(logprob_dict, tok_id):
    return np.exp(logprob_dict[tok_id]) if tok_id in logprob_dict.keys() else 0

pipe = pipeline(
    "lightblue/lb-reranker-0.5B-v1.0-rev",
    chat_template_config=ChatTemplateConfig(
        model_name='qwen2d5',
        capability='chat'
    )
)
tok = pipe.tokenizer.model
idx_tokens = [tok.encode(str(i))[0] for i in range(1, 8)]

query_texts = [
    ("What is the scientific name of apples?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
    ("What is the Chinese word for 'apple'?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
    ("What is the square root of 999?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
]

chats = [make_reranker_inference_conversation(c, q) for q, c in query_texts]
responses = pipe(
    chats,
    gen_config=GenerationConfig(temperature=1.0, logprobs=14, max_new_tokens=1, do_sample=True)
)
probs = np.array([[get_prob(r.logprobs[0], y) for y in idx_tokens] for r in responses])

N = probs.shape[1]
M = probs.shape[0]
idxs = np.tile(np.arange(1, N + 1), M).reshape(M, N)

expected_vals = (probs * idxs).sum(axis=1)
print(expected_vals)
# [6.66415229 1.84342025 1.01133205]
- OpenAI (hosted on Hugging Face)
Install openai using:
pip install openai
OpenAI + Hugging Face Inference code:
from openai import OpenAI
import numpy as np
from multiprocessing import Pool
from tqdm.auto import tqdm

client = OpenAI(
    base_url="https://api-inference.huggingface.co/v1/",
    api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"  # Change this to an access token from https://huggingface.co/settings/tokens
)

# The only tokens that count towards the score
VALID_LABELS = {str(i) for i in range(1, 8)}

def make_reranker_input(t, q):
    return f"<<<Context>>>\n{t}\n\n<<<Query>>>\n{q}"

def make_reranker_inference_conversation(context, question):
    system_message = "Given a piece of text and a query, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": make_reranker_input(context, question)},
    ]

def get_reranker_score(question_context_tuple):
    question, context = question_context_tuple
    messages = make_reranker_inference_conversation(context, question)
    completion = client.chat.completions.create(
        model="lightblue/lb-reranker-0.5B-v1.0-rev",
        messages=messages,
        max_tokens=1,
        temperature=0.0,
        logprobs=True,
        top_logprobs=5,  # Max allowed by the openai API as top_n_tokens must be >= 0 and <= 5. If this gets changed, fix to > 7.
    )
    logprobs = completion.choices[0].logprobs.content[0].top_logprobs
    # Skip any returned token that is not a label "1"-"7", since int() would raise on it
    calculated_score = sum(
        int(x.token) * np.exp(x.logprob)
        for x in logprobs
        if x.token.strip() in VALID_LABELS
    )
    return calculated_score

query_texts = [
    ("What is the scientific name of apples?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
    ("What is the Chinese word for 'apple'?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
    ("What is the square root of 999?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
]

# The guard keeps worker processes from re-running this block on platforms that spawn rather than fork
if __name__ == "__main__":
    with Pool(processes=16) as p:  # Allows for parallel processing
        expected_vals = list(tqdm(p.imap(get_reranker_score, query_texts), total=len(query_texts)))

    print(expected_vals)
    # [6.64866580, 1.85144404, 1.010719508]
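Once every candidate text has a score for a given query, reranking is a sort by descending score. The sketch below assumes the scores have already been computed by one of the scripts above; the `rerank` helper, the candidate texts, and the score values are hypothetical and only illustrate the final step.

```python
def rerank(texts, scores, top_k=None):
    """Return (text, score) pairs sorted from most to least related."""
    ranked = sorted(zip(texts, scores), key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical candidate texts and scores for a single query.
candidate_texts = ["text about oranges", "text about apples", "text about cars"]
candidate_scores = [3.2, 6.7, 1.0]

top_two = rerank(candidate_texts, candidate_scores, top_k=2)
print(top_two)  # [('text about apples', 6.7), ('text about oranges', 3.2)]
```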
License
We share this model under an Apache 2.0 license.
Developed by
![Lightblue technology logo](https://www.lightblue-tech.com/wp-content/uploads/2023/08/color_%E6%A8%AA%E5%9E%8B-1536x469.png)
This model was trained by Peter Devine (ptrdvn) for Lightblue.