MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool • Paper • 2406.17565 • Published Jun 25
Octo-planner: On-device Language Model for Planner-Action Agents • Paper • 2406.18082 • Published Jun 26
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation • Paper • 2406.19215 • Published Jun 27