Kyusong Lee
kyusonglee
2 followers · 4 following
AI & ML interests
None yet
Recent Activity
liked a Space 5 days ago: omlab/VLM-R1-Referral-Expression
reacted to tianchez's post 9 days ago:
Introducing VLM-R1! GRPO has helped DeepSeek R1 to learn reasoning. Can it also help VLMs perform stronger for general computer vision tasks? The answer is YES and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and eval on RefCOCO Val and RefGTA (an OOD task). https://github.com/om-ai-lab/VLM-R1
authored a paper about 1 month ago: OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer
Organizations
Papers
1
arxiv:2406.16620
models
None public yet
datasets
None public yet