-
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations
Paper • 2412.13171 • Published • 30 -
o1-Coder: an o1 Replication for Coding
Paper • 2412.00154 • Published • 40 -
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability
Paper • 2411.19943 • Published • 55 -
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 39
Collections including paper arxiv:2412.01928
-
Video Creation by Demonstration
Paper • 2412.09551 • Published • 8 -
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper • 2412.07589 • Published • 45 -
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Paper • 2412.06531 • Published • 71 -
APOLLO: SGD-like Memory, AdamW-level Performance
Paper • 2412.05270 • Published • 38
-
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 61 -
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding
Paper • 2411.04282 • Published • 30 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 29 -
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 39
-
Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Paper • 2410.09335 • Published • 16 -
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning
Paper • 2410.06456 • Published • 35 -
Emergent properties with repeated examples
Paper • 2410.07041 • Published • 8 -
Personalized Visual Instruction Tuning
Paper • 2410.07113 • Published • 69
-
LLMs + Persona-Plug = Personalized LLMs
Paper • 2409.11901 • Published • 31 -
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Paper • 2409.12183 • Published • 36 -
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
Paper • 2402.12875 • Published • 13 -
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
Paper • 2410.00531 • Published • 29
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 14 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 24 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 29 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 6
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 21 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 82 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 145 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Paper • 2305.14325 • Published • 1 -
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
Paper • 2308.10848 • Published • 1 -
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper • 2412.01928 • Published • 39