Rui Yang

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

new activity about 7 hours ago

microsoft/Magma-8B:generation_args in the example

upvoted a paper 1 day ago

R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

upvoted a paper 1 day ago

Self-rewarding correction for mathematical reasoning

Ray2333's activity

New activity in microsoft/Magma-8B about 7 hours ago

generation_args in the example

#10 opened about 8 hours ago by

Ray2333

New activity in EmbodiedBench/EB-Manipulation 7 days ago

Add dataset card

#1 opened 7 days ago by

nielsr

New activity in Ray2333/Gemma-2B-rewardmodel-baseline 10 days ago

trained dataset and fine-tuned method

#1 opened 10 days ago by

glgjss960

commented a paper 11 days ago

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Paper • 2502.13131 • Published 11 days ago • 34

commented a paper 15 days ago

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Paper • 2502.09560 • Published 16 days ago • 32

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 24 days ago

Update default tokenization behavior to "longest" in README

#2 opened 24 days ago by

MichaelR207

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 4 months ago

Model Size

#1 opened 4 months ago by

szhang120

commented a paper 4 months ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published Oct 29, 2024 • 15

New activity in Ray2333/GRM-llama3-8B-sftreg 4 months ago

Adding `safetensors` variant of this model

#3 opened 4 months ago by

SFconvertbot

New activity in Ray2333/GRM-llama3-8B-sftreg 6 months ago

Abnormally Large Memory Footprint?

#2 opened 6 months ago by

RylanSchaeffer

Some weights of the model checkpoint at Ray2333/GRM-llama3-8B-sftreg were not used when initializing

#1 opened 6 months ago by

RylanSchaeffer

New activity in Ray2333/gpt2-large-harmless-reward_model 7 months ago

Load failed:There is no "pytorch_model.bin", how to load the model?

#3 opened 7 months ago by

Hanlard

a bug when loading model

#2 opened 8 months ago by

ssmmzz

New activity in Ray2333/RiC_harmless_helpful 8 months ago

Librarian Bot: Add language metadata for dataset

#2 opened 8 months ago by

librarian-bot

New activity in Ray2333/gpt2-large-harmless-reward_model 11 months ago

How to train the model

#1 opened 11 months ago by

mike2000