-
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
Paper • 2403.12943 • Published • 14 -
Masked Audio Generation using a Single Non-Autoregressive Transformer
Paper • 2401.04577 • Published • 42 -
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
Paper • 2404.02747 • Published • 11 -
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Paper • 2404.02733 • Published • 20
Collections
Discover the best community collections!
Collections including paper arxiv:2404.14239
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 25 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 12 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 38 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 19
-
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
Paper • 2404.13686 • Published • 27 -
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Paper • 2404.14239 • Published • 8 -
Stylus: Automatic Adapter Selection for Diffusion Models
Paper • 2404.18928 • Published • 14
-
Condition-Aware Neural Network for Controlled Image Generation
Paper • 2404.01143 • Published • 11 -
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes
Paper • 2404.00987 • Published • 21 -
Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published • 44 -
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Paper • 2404.02893 • Published • 20
-
LIMA: Less Is More for Alignment
Paper • 2305.11206 • Published • 21 -
Garment3DGen: 3D Garment Stylization and Texture Generation
Paper • 2403.18816 • Published • 21 -
EgoLifter: Open-world 3D Segmentation for Egocentric Perception
Paper • 2403.18118 • Published • 10 -
The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published • 78