David Kasakaitis

dkasa

https://dkasa.dev

ddkasa

AI & ML interests

Reinforcement Learning & Autonomous Agents

Organizations

None yet

dkasa's activity

upvoted a paper 16 days ago

A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents

Paper • 2410.22476 • Published 19 days ago • 24

upvoted a paper 20 days ago

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published 25 days ago • 49

upvoted a paper 22 days ago

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published 26 days ago • 88

upvoted a paper 23 days ago

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published 26 days ago • 43

upvoted a paper 29 days ago

Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published Oct 17 • 29

upvoted 9 papers about 1 month ago

Revealing the Barriers of Language Agents in Planning

Paper • 2410.12409 • Published Oct 16 • 23

Large Language Model Evaluation via Matrix Nuclear-Norm

Paper • 2410.10672 • Published Oct 14 • 18

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published Oct 13 • 54

Mechanistic Permutability: Match Features Across Layers

Paper • 2410.07656 • Published Oct 10 • 16

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published Oct 11 • 83

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Paper • 2410.03450 • Published Oct 4 • 35

Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization

Paper • 2410.04717 • Published Oct 7 • 17

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

Paper • 2410.01912 • Published Oct 2 • 13

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 165

upvoted 6 papers about 2 months ago

Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published Sep 20 • 67

YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models

Paper • 2409.13592 • Published Sep 20 • 48

Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13598 • Published Sep 20 • 37

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization

Paper • 2409.12903 • Published Sep 19 • 21

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19 • 134

LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18 • 30