OS-ATLAS: A Foundation Action Model for Generalist GUI Agents Paper • 2410.23218 • Published 10 days ago • 43 • 3
Language Models can Self-Lengthen to Generate Long Texts Paper • 2410.23933 • Published 10 days ago • 15 • 3
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents Paper • 2410.22476 • Published 11 days ago • 24 • 3
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published 10 days ago • 57 • 4
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published 12 days ago • 71 • 3
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale Paper • 2406.19280 • Published Jun 27 • 60 • 9
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22 • 45 • 8
Evaluating D-MERIT of Partial-annotation on Information Retrieval Paper • 2406.16048 • Published Jun 23 • 34 • 2
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Paper • 2406.16855 • Published Jun 24 • 54 • 3
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model Paper • 2406.04333 • Published Jun 6 • 36 • 3
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step Paper • 2406.04314 • Published Jun 6 • 26 • 2
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published Jun 6 • 71 • 4
Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning Paper • 2406.00392 • Published Jun 1 • 12 • 1
ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation Paper • 2406.00908 • Published Jun 3 • 11 • 1
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2 • 30 • 1
Learning Temporally Consistent Video Depth from Video Diffusion Priors Paper • 2406.01493 • Published Jun 3 • 17 • 2
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published Jun 3 • 42 • 3