EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models Paper • 2310.03270 • Published Oct 5, 2023
Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM Paper • 2310.04836 • Published Oct 7, 2023 • 1
Data-Free Quantization with Accurate Activation Clipping and Adaptive Batch Normalization Paper • 2204.04215 • Published Apr 8, 2022 • 1
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models Paper • 2405.14366 • Published May 23, 2024 • 1
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression Paper • 2410.08584 • Published Oct 11, 2024 • 12
InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation Paper • 2407.10061 • Published Jul 14, 2024
KMM: Key Frame Mask Mamba for Extended Motion Generation Paper • 2411.06481 • Published Nov 10, 2024 • 4
ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality Paper • 2412.04062 • Published Dec 5, 2024 • 7
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation Paper • 2411.18499 • Published Nov 27, 2024 • 18
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation Paper • 2411.07975 • Published Nov 12, 2024 • 27
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation Paper • 2410.13848 • Published Oct 17, 2024 • 32
LongVLM: Efficient Long Video Understanding via Large Language Models Paper • 2404.03384 • Published Apr 4, 2024
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI Paper • 2408.03361 • Published Aug 6, 2024 • 86
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models Paper • 2310.08041 • Published Oct 12, 2023 • 1
Mesa: A Memory-saving Training Framework for Transformers Paper • 2111.11124 • Published Nov 22, 2021 • 1
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models Paper • 2311.16503 • Published Nov 27, 2023