- LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs (arXiv:2406.15319, published Jun 21, 2024)
- Unifying Multimodal Retrieval via Document Screenshot Embedding (arXiv:2406.11251, published Jun 17, 2024)
- GenAI Arena: An Open Evaluation Platform for Generative Models (arXiv:2406.04485, published Jun 6, 2024)
- T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback (arXiv:2405.18750, published May 29, 2024)
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark (arXiv:2406.01574, published Jun 3, 2024)
- CodeEditorBench: Evaluating Code Editing Capability of Large Language Models (arXiv:2404.03543, published Apr 4, 2024)
- AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks (arXiv:2403.14468, published Mar 21, 2024)
- ChatMusician: Understanding and Generating Music Intrinsically with LLM (arXiv:2402.16153, published Feb 25, 2024)
- StructLM: Towards Building Generalist Models for Structured Knowledge Grounding (arXiv:2402.16671, published Feb 26, 2024)
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (arXiv:2402.04324, published Feb 6, 2024)
- CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark (arXiv:2401.11944, published Jan 22, 2024)
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models (arXiv:2401.06951, published Jan 13, 2024)
- Instruct-Imagen: Image Generation with Multi-modal Instruction (arXiv:2401.01952, published Jan 3, 2024)
- UniIR: Training and Benchmarking Universal Multimodal Information Retrievers (arXiv:2311.17136, published Nov 28, 2023)
- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI (arXiv:2311.16502, published Nov 27, 2023)
- ImagenHub: Standardizing the Evaluation of Conditional Image Generation Models (arXiv:2310.01596, published Oct 2, 2023)
- MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning (arXiv:2309.05653, published Sep 11, 2023)