Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding Paper • 2401.07851 • Published Jan 15, 2024
AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction Paper • 2410.19743 • Published Oct 10, 2024
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration Paper • 2410.06916 • Published Oct 9, 2024
Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation Paper • 2203.16487 • Published Mar 30, 2022
Can Large Multimodal Models Uncover Deep Semantics Behind Images? Paper • 2402.11281 • Published Feb 17, 2024
Enhancing Tool Retrieval with Iterative Feedback from Large Language Models Paper • 2406.17465 • Published Jun 25, 2024