Papers - a dynamicjerry Collection

dynamicjerry 's Collections

Papers

Papers

updated Dec 23, 2024

Gemini: A Family of Highly Capable Multimodal Models

Paper • 2312.11805 • Published Dec 19, 2023 • 45
Unlocking Pre-trained Image Backbones for Semantic Image Synthesis

Paper • 2312.13314 • Published Dec 20, 2023 • 8
LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 258
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Paper • 2312.09911 • Published Dec 15, 2023 • 54
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

Paper • 2312.09608 • Published Dec 15, 2023 • 14
VecFusion: Vector Font Generation with Diffusion

Paper • 2312.10540 • Published Dec 16, 2023 • 21
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

Paper • 2312.11392 • Published Dec 18, 2023 • 19
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

Paper • 2312.14091 • Published Dec 21, 2023 • 16
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Paper • 2410.02416 • Published Oct 3, 2024 • 26
FashionComposer: Compositional Fashion Image Generation

Paper • 2412.14168 • Published Dec 18, 2024 • 16
No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 41
Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Paper • 2412.09626 • Published Dec 12, 2024 • 20
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs

Paper • 2412.08347 • Published Dec 11, 2024 • 4
FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 13