KW

kevineen

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

ByteDance/Sa2VA-8B

liked a dataset 2 days ago

hpprc/kaken-trans-ja-en

liked a model 2 days ago

llamaindex/vdr-2b-multi-v1

kevineen's activity

upvoted a collection 3 days ago

TACO Models

This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. • 3 items • Updated 23 days ago • 8

upvoted 2 papers 4 days ago

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published 5 days ago • 66

TransPixar: Advancing Text-to-Video Generation with Transparency

Paper • 2501.03006 • Published 6 days ago • 19

upvoted a paper 7 days ago

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published 9 days ago • 34

upvoted a paper 8 days ago

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Paper • 2412.04424 • Published Dec 5, 2024 • 59

upvoted a paper 10 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published about 1 month ago • 137

upvoted a paper 12 days ago

1.58-bit FLUX

Paper • 2412.18653 • Published 19 days ago • 69

upvoted a collection 13 days ago

YuLan-Mini

A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. • 5 items • Updated 14 days ago • 10

upvoted a paper 18 days ago

Large Motion Video Autoencoding with Cross-modal Video VAE

Paper • 2412.17805 • Published 20 days ago • 24

upvoted a paper 20 days ago

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published Dec 5, 2024 • 59

upvoted 2 papers 24 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 24 days ago • 339

AniDoc: Animation Creation Made Easier

Paper • 2412.14173 • Published 25 days ago • 49

upvoted a collection about 1 month ago

InternVL2.5

Better than InternVL 2.0 • 18 items • Updated 3 days ago • 79

upvoted 2 collections about 2 months ago

LLM-jp-3 Pre-trained Models

Pre-trained models in the LLM-jp-3 model series • 4 items • Updated 20 days ago • 5

AIMv2

A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 69

upvoted a paper 2 months ago

xLSTM: Extended Long Short-Term Memory

Paper • 2405.04517 • Published May 7, 2024 • 12

upvoted an article 3 months ago

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Article • Oct 22, 2024 • 49

upvoted a collection 3 months ago

Gemma-APS Release

Gemma models for text-to-propositions segmentation. The models are distilled from fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated about 1 month ago • 20

upvoted 2 papers 3 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 169

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3, 2024 • 36