- Uni-SMART: Universal Science Multimodal Analysis and Research Transformer
  Paper • 2403.10301 • Published • 51
- Recurrent Drafter for Fast Speculative Decoding in Large Language Models
  Paper • 2403.09919 • Published • 20
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 67
- Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
  Paper • 2403.09704 • Published • 31
Collections including paper arxiv:2403.11901
- Larimar: Large Language Models with Episodic Memory Control
  Paper • 2403.11901 • Published • 32
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
  Paper • 2408.06292 • Published • 115
- Imagine yourself: Tuning-Free Personalized Image Generation
  Paper • 2409.13346 • Published • 67

- Larimar: Large Language Models with Episodic Memory Control
  Paper • 2403.11901 • Published • 32
- Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
  Paper • 2212.05055 • Published • 5
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- Multi-Head Mixture-of-Experts
  Paper • 2404.15045 • Published • 59

- Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
  Paper • 2403.09029 • Published • 54
- LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
  Paper • 2403.12968 • Published • 24
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 67
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 72

- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
  Paper • 2403.03853 • Published • 62
- SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
  Paper • 2402.09025 • Published • 6
- Shortened LLaMA: A Simple Depth Pruning for Large Language Models
  Paper • 2402.02834 • Published • 14
- Algorithmic progress in language models
  Paper • 2403.05812 • Published • 18

- AtP*: An efficient and scalable method for localizing LLM behaviour to components
  Paper • 2403.00745 • Published • 11
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 602
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
  Paper • 2402.16840 • Published • 23
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 111

- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 111
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 72
- Larimar: Large Language Models with Episodic Memory Control
  Paper • 2403.11901 • Published • 32
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 50

- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 74
- REST: Retrieval-Based Speculative Decoding
  Paper • 2311.08252 • Published
- Active Retrieval Augmented Generation
  Paper • 2305.06983 • Published • 3
- Retrieval-Augmented Generation for Large Language Models: A Survey
  Paper • 2312.10997 • Published • 10