Jonathan Korstad PRO

jkorstad

AI & ML interests

Deep Reinforcement Learning, Generative 3D, Accessibility, Multimodal Models, Agents, Computer Vision. Staying curious.

Recent Activity

liked a Space about 22 hours ago

HuggingFaceTB/SmolVLM

updated a collection about 22 hours ago

WebGPU Models

liked a Space about 22 hours ago

HuggingFaceTB/SmolLM2-1.7B-Instruct-WebGPU

jkorstad's activity

upvoted an article 2 days ago

Article

We now support VLMs in smolagents!

3 days ago

• 31

upvoted an article 10 days ago

Article

Run ComfyUI workflows for free on Spaces

Jan 14, 2024

• 42

upvoted a paper about 1 month ago

Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Paper • 2412.15322 • Published Dec 19, 2024 • 18

upvoted a paper 3 months ago

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 28

upvoted a collection 3 months ago

OpenCoder

Collection

OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23, 2024 • 80

upvoted 2 papers 3 months ago

PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance

Paper • 2411.02327 • Published Nov 4, 2024 • 11

GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published Nov 4, 2024 • 20

upvoted a collection 3 months ago

MIT Talk 31/10 Papers

Collection

14 items • Updated Oct 28, 2024 • 31

upvoted a paper 3 months ago

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14, 2024 • 54

upvoted 2 papers 4 months ago

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published Sep 27, 2024 • 26

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136

upvoted 2 papers 5 months ago

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4, 2024 • 92

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28, 2024 • 86

upvoted an article 6 months ago

Article

License to Call: Introducing Transformers Agents 2.0

May 13, 2024

• 126