Ivy Zhang

Ivy1997

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

new activity 3 days ago

BAAI/Infinity-MM: ChartQA, DocVQA, InfoVQA, etc. are noticeably lower than the reported results

liked a model 7 days ago

Qwen/Qwen2.5-VL-3B-Instruct

Ivy1997's activity

liked a model 3 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Text Generation • Updated 3 days ago • 354k • 647

New activity in BAAI/Infinity-MM 3 days ago

ChartQA, DocVQA, InfoVQA, etc. are noticeably lower than the reported results

#10 opened 13 days ago by Ivy1997

liked 3 models 7 days ago

upvoted a collection 8 days ago

Qwen2-VL

Collection

Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 200

liked 2 models 8 days ago

Qwen/Qwen2.5-14B-Instruct-1M

Text Generation • Updated 6 days ago • 9.13k • 211

Qwen/Qwen2.5-7B-Instruct-1M

Text Generation • Updated 6 days ago • 22.5k • 169

upvoted 3 papers 10 days ago

EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion

Paper • 2501.13452 • Published 12 days ago • 7

Temporal Preference Optimization for Long-Form Video Understanding

Paper • 2501.13919 • Published 11 days ago • 21

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published 11 days ago • 22

liked 2 Spaces 11 days ago

Open VLM Leaderboard

VLMEvalKit Evaluation Results Collection • Running on CPU Upgrade • 590

MEGA-Bench

A leaderboard for multimodal models • Running

liked a model 25 days ago

OpenGVLab/InternVL2_5-1B

Image-Text-to-Text • Updated Dec 18, 2024 • 19.6k • 43

liked a model 26 days ago

lmms-lab/llava-onevision-qwen2-7b-si

Text Generation • Updated Sep 2, 2024 • 15.5k • 12

liked a dataset 26 days ago

FreedomIntelligence/medical-o1-reasoning-SFT

Viewer • Updated 22 days ago • 50.1k • 2.62k • 107

liked a model 26 days ago

microsoft/phi-4

Text Generation • Updated 26 days ago • 375k • 1.65k

upvoted a collection 27 days ago

AIMv2

Collection

A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 71

liked a dataset 29 days ago

HuggingFaceFV/finevideo

Viewer • Updated Dec 16, 2024 • 39.5k • 2.38k • 291

liked a dataset about 1 month ago

nkp37/OpenVid-1M

Viewer • Updated Aug 23, 2024 • 1.45M • 16.3k • 176