Dokyoon

leeloolee

Eruly

Recent Activity

upvoted a paper 2 days ago

DiffuEraser: A Diffusion Model for Video Inpainting

published a model 6 days ago

leeloolee/gkd-model

reacted to mitkox's post with 👀 10 days ago

Training a model to reason in the continuous latent space based on Meta's Coconut. If it all works will apply it on the MiniCPM-o SVD-LR. Endgame is a multimodal, adaptive, and efficient foundational on device AI model.

leeloolee's activity

upvoted a paper 2 days ago

DiffuEraser: A Diffusion Model for Video Inpainting

Paper • 2501.10018 • Published 9 days ago • 10

upvoted an article 18 days ago

Context Parallelism

Article • Aug 13, 2024 • 13

upvoted a collection 18 days ago

🔍 Interpretability & Analysis of LMs

Collection

Outstanding research in LM interpretability and evaluation, summarized • 95 items • Updated 11 days ago • 96

upvoted a paper 19 days ago

PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

Paper • 2501.03124 • Published 20 days ago • 14

upvoted 3 papers about 1 month ago

upvoted 2 collections about 1 month ago

Multimodal-SAE

Collection

A collection of SAEs hooked onto LLaVA • 4 items • Updated 21 days ago • 5

GUI agents

Collection

A collection of papers on GUI agents • 3 items • Updated Dec 14, 2024 • 5

upvoted 2 papers about 2 months ago

Granite Guardian

Paper • 2412.07724 • Published Dec 10, 2024 • 18

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 124

upvoted an article about 2 months ago

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Article • Nov 19, 2024 • 11

upvoted a paper 2 months ago

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 58

upvoted an article 2 months ago

Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

Article • Nov 21, 2024 • 35

upvoted 2 papers 2 months ago

M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework

Paper • 2411.06176 • Published Nov 9, 2024 • 45

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Paper • 2411.07461 • Published Nov 12, 2024 • 22

upvoted 4 papers 3 months ago

Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

Paper • 2411.05005 • Published Nov 7, 2024 • 13

GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation

Paper • 2410.20474 • Published Oct 27, 2024 • 14

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 19

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22, 2024 • 45