InternLM-XComposer-2.5-Reward
Introduction
InternLM-XComposer2.5-Reward is a multi-modal reward model built on internlm/internlm-xcomposer2d5-7b. It is trained on preference samples across the text, image, and video domains and assigns reward scores that align with human preferences.
Performance Evaluation
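Results on VLRewardBench
| Models | General | Hallucination | Reasoning | Overall | Macro |
| --- | --- | --- | --- | --- | --- |
| InternLM-XComposer2.5-7B-Reward | 84.7 | 62.5 | 62.9 | 65.8 | 70.0 |

Results on RewardBench
| Models | Score | Chat | Chat Hard | Safety | Reasoning |
| --- | --- | --- | --- | --- | --- |
| InternLM-XComposer2.5-7B-Reward | 88.6 | 90.8 | 83.8 | 87.8 | 90.0 |

Results on RM-Bench
| Models | Chat | Math | Code | Safety | Easy | Normal | Hard | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| InternLM-XComposer2.5-7B-Reward | 65.5 | 55.9 | 51.7 | 93.8 | 87.5 | 71.3 | 47.4 | 68.8 |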
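As a quick sanity check on the VLRewardBench row, the Macro value appears to equal the unweighted mean of the General, Hallucination, and Reasoning scores. This is an observation from the reported numbers, not something the table itself states, and the variable names below are illustrative.

```python
# Category scores reported for InternLM-XComposer2.5-7B-Reward on VLRewardBench.
vl_rewardbench = {"General": 84.7, "Hallucination": 62.5, "Reasoning": 62.9}

# Assumption: "Macro" is the unweighted mean over the three categories.
macro = sum(vl_rewardbench.values()) / len(vl_rewardbench)
print(f"macro average: {macro:.1f}")  # macro average: 70.0
```

The Overall column (65.8) is lower than this mean, which would be consistent with a per-sample average over categories of unequal size, though the table does not say how it is computed.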
Basic Usage
Here is an example of how to use the model to get the reward score of a chat, compare two chats, or rank multiple chats.
import torch
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2d5-7b-reward",
    device_map="cuda",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-xcomposer2d5-7b-reward", trust_remote_code=True)
model.tokenizer = tokenizer
chat_1 = [
{"role": "user", "content": 'I want to buy a car from the input image, analyze the advantages and weaknesses.'},
{"role": "assistant", "content": "The car in the image is a Mercedes-Benz G-Class. This is a luxury SUV known for its ruggedness, off-road capability, and iconic boxy design. Here's an analysis of its advantages and weaknesses: Advantages: Off-road capability: The G-Class is built on a robust ladder frame chassis and has a powerful engine, making it highly capable in off-road conditions. It features a low-range transfer case, locking differentials, and a high ground clearance. Luxury and comfort: Despite its rugged exterior, the G-Class offers a luxurious interior with high-quality materials, comfortable seats, and advanced technology features. Iconic design: The distinctive boxy shape of the G-Class has made it a cult classic. It's instantly recognizable and has a timeless appeal. Strong resale value: Due to its exclusivity and desirability, the G-Class tends to hold its value well over time. Weaknesses: Fuel efficiency: The G-Class is not known for its fuel efficiency, especially with the larger engine options. Ride quality: The firm suspension is designed for off-road performance, which can make the ride a bit bumpy on paved roads. Price: The G-Class is a very expensive vehicle, even for a luxury SUV. Size and maneuverability: The large size of the G-Class can make it challenging to park and maneuver in tight spaces."}
]
chat_2 = [
{"role": "user", "content": 'I want to buy a car from the input image, analyze the advantages and weaknesses.'},
{"role": "assistant", "content": "Based on the image, it appears to be a Ferrari F8 Tributo. Advantages: Performance: Ferraris are renowned for their exceptional performance. The F8 Tributo is no exception, boasting a powerful twin-turbocharged V8 engine that delivers exhilarating acceleration and top speed. Handling: Ferraris are known for their precise handling and agility. The F8 Tributo continues this tradition, with advanced aerodynamics and sophisticated suspension systems ensuring a thrilling driving experience. Design: Ferrari is synonymous with iconic and head-turning designs. The F8 Tributo is a testament to this, featuring sleek lines, aggressive styling, and a captivating presence. Exclusivity: Owning a Ferrari is a status symbol. It represents luxury, exclusivity, and a passion for high-performance automobiles. Weaknesses: Price: Ferraris come with a hefty price tag. The F8 Tributo is no exception, making it an investment for those with significant financial resources. Fuel Efficiency: High-performance sports cars like the F8 Tributo are not known for their fuel efficiency. You can expect lower miles per gallon compared to everyday vehicles. Maintenance Costs: Owning a Ferrari comes with associated maintenance costs, which can be higher than those for regular cars. Practicality: The F8 Tributo is primarily a two-seater sports car, making it less practical for everyday use or carrying passengers. Ride Comfort: While the F8 Tributo offers a thrilling driving experience, its stiff suspension might not be ideal for long-distance comfort."}
]
image = ['./examples/cars1.jpg']
hd_num = 9
# get reward score for a single chat
with torch.autocast(device_type='cuda', dtype=torch.float16):
    score1 = model.get_score(chat_1, image, hd_num=hd_num)
    score2 = model.get_score(chat_2, image, hd_num=hd_num)
print("score1: ", score1)
print("score2: ", score2)
# >>> score1: 5.76
# >>> score2: -2.84375
# batch inference, get multiple scores at once
with torch.autocast(device_type='cuda', dtype=torch.float16):
    scores = model.get_scores([chat_1, chat_2], [image, image], hd_num=hd_num)
print("scores: ", scores)
# >>> scores: [5.76171875, -2.845703125]
# compare whether chat_1 is better than chat_2
with torch.autocast(device_type='cuda', dtype=torch.float16):
    compare_res = model.compare(chat_1, image, chat_2, image, hd_num=hd_num)
print("compare_res: ", compare_res)
# >>> compare_res: True
# rank multiple chats, it will return the ranking index of each chat
# the chat with the highest score will have ranking index as 0
with torch.autocast(device_type='cuda', dtype=torch.float16):
    rank_res = model.rank([chat_1, chat_2], [image, image], hd_num=hd_num)
print("rank_res: ", rank_res) # lower index means higher score
# >>> rank_res: [0, 1]
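The relationship between the raw scores and the results of compare and rank above can be reproduced in plain Python. This is a minimal sketch, not the model's implementation: `compare_from_scores`, `rank_from_scores`, and `best_of_n` are hypothetical helpers, the strict comparison is an assumption, and the scores are the ones printed in the batch example.

```python
from typing import List, Sequence


def compare_from_scores(score_a: float, score_b: float) -> bool:
    """Return True when the first response has the strictly higher reward."""
    return score_a > score_b


def rank_from_scores(scores: Sequence[float]) -> List[int]:
    """Return each item's ranking index; the highest score gets index 0."""
    # Positions sorted from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, position in enumerate(order):
        ranks[position] = rank
    return ranks


def best_of_n(scores: Sequence[float]) -> int:
    """Return the position of the highest-scoring response."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Scores copied from the batch-inference output above.
scores = [5.76171875, -2.845703125]

print(compare_from_scores(scores[0], scores[1]))  # True
print(rank_from_scores(scores))                   # [0, 1]
print(best_of_n(scores))                          # 0
```

A helper like `best_of_n` illustrates one typical use of a reward model: generate several candidate responses, score them in one batch, and keep the one with the highest reward.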
Open Source License
The code is licensed under Apache-2.0. The model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English) / application form (Chinese). For other questions or collaborations, please contact [email protected].