Wujian Peng

wjpoom

wjpoom's activity

upvoted a paper 3 months ago

RelBench: A Benchmark for Deep Learning on Relational Databases

Paper • 2407.20060 • Published Jul 29 • 7

upvoted 4 papers 4 months ago

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Paper • 2407.16982 • Published Jul 24 • 40

Understanding Reference Policies in Direct Preference Optimization

Paper • 2407.13709 • Published Jul 18 • 16

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Paper • 2407.07053 • Published Jul 9 • 41

Video Diffusion Alignment via Reward Gradients

Paper • 2407.08737 • Published Jul 11 • 47

upvoted an article 4 months ago

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Article • Apr 15 • 165

upvoted 4 papers 4 months ago

We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

Paper • 2407.01284 • Published Jul 1 • 75

Multi-Object Hallucination in Vision-Language Models

Paper • 2407.06192 • Published Jul 8 • 9

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Paper • 2407.05282 • Published Jul 7 • 12

Compositional Video Generation as Flow Equalization

Paper • 2407.06182 • Published Jun 10 • 12

upvoted 10 papers 5 months ago

Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning

Paper • 2406.12742 • Published Jun 18 • 14

Improving Visual Commonsense in Language Models via Multiple Image Generation

Paper • 2406.13621 • Published Jun 19 • 13

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

Paper • 2406.13542 • Published Jun 19 • 16

Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs

Paper • 2406.14544 • Published Jun 20 • 34

MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

Paper • 2406.15252 • Published Jun 21 • 14

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

Paper • 2406.16377 • Published Jun 24 • 11

Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

Paper • 2406.09170 • Published Jun 13 • 24

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

Paper • 2406.09411 • Published Jun 13 • 18

Depth Anything V2

Paper • 2406.09414 • Published Jun 13 • 92

Searching Priors Makes Text-to-Video Synthesis Better

Paper • 2406.03215 • Published Jun 5 • 11