Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2412.01928

inference time compute

about 14 hours ago

Compressed Chain of Thought: Efficient Reasoning Through Dense Representations

Paper • 2412.13171 • Published 7 days ago • 30
o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published 26 days ago • 40
Critical Tokens Matter: Token-Level Contrastive Estimation Enhence LLM's Reasoning Capability

Paper • 2411.19943 • Published 25 days ago • 55
MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published 22 days ago • 39

Pending Classification

Video Creation by Demonstration

Paper • 2412.09551 • Published 12 days ago • 8
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 15 days ago • 45
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published 16 days ago • 71
APOLLO: SGD-like Memory, AdamW-level Performance

Paper • 2412.05270 • Published 18 days ago • 38

RL Fine-tuning Reasoning

A Collection of Papers on Using Reinforcement Learning to Enhance Reasoning

about 14 hours ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 15 days ago • 61
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6 • 30
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published 4 days ago • 29
MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published 22 days ago • 39

Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Paper • 2410.09335 • Published Oct 12 • 16
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9 • 35
Emergent properties with repeated examples

Paper • 2410.07041 • Published Oct 9 • 8
Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9 • 69

about 15 hours ago

LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18 • 31
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published Sep 18 • 36
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Paper • 2402.12875 • Published Feb 20 • 13
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices

Paper • 2410.00531 • Published Oct 1 • 29

about 5 hours ago

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23 • 14
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4 • 24
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30 • 29
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30 • 6

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1 • 21
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 82
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 145
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 25

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Paper • 2305.14325 • Published May 23, 2023 • 1
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Paper • 2308.10848 • Published Aug 21, 2023 • 1
MALT: Improving Reasoning with Multi-Agent LLM Training

Paper • 2412.01928 • Published 22 days ago • 39

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs