RoSteALS: Robust Steganography using Autoencoder Latent Space Paper • 2304.03400 • Published Apr 6, 2023
On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts Paper • 2310.16613 • Published Oct 25, 2023
X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning Paper • 2311.18799 • Published Nov 30, 2023 • 1
C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models Paper • 2402.03181 • Published Feb 5, 2024
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content Paper • 2403.13031 • Published Mar 19, 2024 • 1
Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand Paper • 2208.03382 • Published Aug 5, 2022
Membership Inference Attacks Against Text-to-image Generation Models Paper • 2210.00968 • Published Oct 3, 2022
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16, 2024 • 98
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer Paper • 2212.09877 • Published Dec 19, 2022
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise Paper • 2501.08331 • Published Jan 2025 • 20
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild Paper • 2305.11147 • Published May 18, 2023 • 3
ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Paper • 2305.08275 • Published May 14, 2023 • 2