- Analyzing The Language of Visual Tokens
  Paper • 2411.05001 • Published • 22
- Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
  Paper • 2411.14982 • Published • 15
- Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
  Paper • 2411.17686 • Published • 18
Jaehyun Jun
btjhjeon
AI & ML interests
Multimodal
Recent Activity
updated a collection about 16 hours ago
Multimodal Dataset
upvoted a paper about 16 hours ago
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding
upvoted a paper about 16 hours ago
LearnLM: Improving Gemini for Learning
Organizations
Collections (8)
- MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
  Paper • 2410.17637 • Published • 34
- Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
  Paper • 2411.10442 • Published • 67
- Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
  Paper • 2411.18203 • Published • 31
- Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
  Paper • 2411.14432 • Published • 20
Models
None public yet
Datasets
None public yet