hjkim

hojie11

AI & ML interests

Computer Vision, 3D Vision, Anomaly Detection

Organizations

None yet

hojie11's activity

upvoted 2 papers 2 days ago

An Empirical Study of Autoregressive Pre-training from Videos

Paper • 2501.05453 • Published 3 days ago • 28

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published 3 days ago • 51

upvoted 3 papers 3 days ago

Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control

Paper • 2501.03847 • Published 5 days ago • 17

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published 5 days ago • 54

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Paper • 2501.02955 • Published 6 days ago • 39

upvoted a paper 4 days ago

MagicFace: High-Fidelity Facial Expression Editing with Action-Unit Control

Paper • 2501.02260 • Published 8 days ago • 4

upvoted 3 papers 5 days ago

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

Paper • 2501.03059 • Published 6 days ago • 18

GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking

Paper • 2501.02690 • Published 7 days ago • 15

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Paper • 2501.02976 • Published 6 days ago • 46

upvoted 3 papers 6 days ago

Nested Attention: Semantic-aware Attention Values for Concept Personalization

Paper • 2501.01407 • Published 10 days ago • 10

VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control

Paper • 2501.01427 • Published 10 days ago • 46

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published 12 days ago • 40

upvoted a paper 10 days ago

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 16 days ago • 78

upvoted 3 papers 11 days ago

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Paper • 2412.21206 • Published 13 days ago • 15

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published 19 days ago • 65

Bringing Objects to Life: 4D generation from 3D objects

Paper • 2412.20422 • Published 14 days ago • 33

upvoted 4 papers 13 days ago

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage

Paper • 2412.15484 • Published 23 days ago • 14

Revisiting In-Context Learning with Long Context Language Models

Paper • 2412.16926 • Published 21 days ago • 28

Large Motion Video Autoencoding with Cross-modal Video VAE

Paper • 2412.17805 • Published 20 days ago • 24

Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning

Paper • 2412.15797 • Published 23 days ago • 17