Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion Paper • 2501.09019 • Published 4 days ago • 10
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding Paper • 2501.07888 • Published 5 days ago • 13
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations Paper • 2412.13171 • Published Dec 17, 2024 • 31
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements Paper • 2410.08968 • Published Oct 11, 2024 • 12
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 13 days ago • 293
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models Paper • 2409.11136 • Published Sep 17, 2024 • 22 • 2
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling Paper • 2408.03695 • Published Aug 7, 2024 • 13
Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation Paper • 2308.07316 • Published Aug 14, 2023 • 6