- "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization — arXiv 2411.02355 • Published Nov 4, 2024 • 47 upvotes
- O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? — arXiv 2411.16489 • Published Nov 25, 2024 • 42 upvotes
- Learning to Move Like Professional Counter-Strike Players — arXiv 2408.13934 • Published Aug 25, 2024 • 23 upvotes
- Building and better understanding vision-language models: insights and future directions — arXiv 2408.12637 • Published Aug 22, 2024 • 124 upvotes
- LLM Pruning and Distillation in Practice: The Minitron Approach — arXiv 2408.11796 • Published Aug 21, 2024 • 58 upvotes
- Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model — arXiv 2408.00754 • Published Aug 1, 2024 • 22 upvotes
- Compact Language Models via Pruning and Knowledge Distillation — arXiv 2407.14679 • Published Jul 19, 2024 • 39 upvotes
- Knowledge Mechanisms in Large Language Models: A Survey and Perspective — arXiv 2407.15017 • Published Jul 22, 2024 • 34 upvotes
- OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person — arXiv 2407.16224 • Published Jul 23, 2024 • 27 upvotes
- DDK: Distilling Domain Knowledge for Efficient Large Language Models — arXiv 2407.16154 • Published Jul 23, 2024 • 22 upvotes
- E5-V: Universal Embeddings with Multimodal Large Language Models — arXiv 2407.12580 • Published Jul 17, 2024 • 40 upvotes
- Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning — arXiv 2407.10718 • Published Jul 15, 2024 • 18 upvotes
- Human-like Episodic Memory for Infinite Context LLMs — arXiv 2407.09450 • Published Jul 12, 2024 • 60 upvotes
- Inference Performance Optimization for Large Language Models on CPUs — arXiv 2407.07304 • Published Jul 10, 2024 • 52 upvotes