youngko's Collections
Efficient Tool Use with Chain-of-Abstraction Reasoning • Paper 2401.17464 • 18 upvotes
Transforming and Combining Rewards for Aligning Large Language Models • Paper 2402.00742 • 12 upvotes
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models • Paper 2402.03300 • 80 upvotes
Specialized Language Models with Cheap Inference from Limited Domain Data • Paper 2402.01093 • 46 upvotes
Learning Universal Predictors • Paper 2401.14953 • 20 upvotes
Self-Discover: Large Language Models Self-Compose Reasoning Structures • Paper 2402.03620 • 115 upvotes
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks • Paper 2402.04248 • 31 upvotes
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains • Paper 2402.05140 • 22 upvotes
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning • Paper 2402.06332 • 19 upvotes
Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning • Paper 2402.06102 • 5 upvotes
A Tale of Tails: Model Collapse as a Change of Scaling Laws • Paper 2402.07043 • 14 upvotes
Premise Order Matters in Reasoning with Large Language Models • Paper 2402.08939 • 28 upvotes
Chain-of-Thought Reasoning Without Prompting • Paper 2402.10200 • 105 upvotes
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers • Paper 2305.07185 • 9 upvotes
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens • Paper 2402.13753 • 115 upvotes
Coercing LLMs to do and reveal (almost) anything • Paper 2402.14020 • 13 upvotes
Watermarking Makes Language Models Radioactive • Paper 2402.14904 • 23 upvotes
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • Paper 2402.17764 • 608 upvotes
AtP*: An efficient and scalable method for localizing LLM behaviour to components • Paper 2403.00745 • 13 upvotes
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding • Paper 2403.01487 • 15 upvotes
The Unreasonable Ineffectiveness of the Deeper Layers • Paper 2403.17887 • 79 upvotes
Can large language models explore in-context? • Paper 2403.15371 • 32 upvotes
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement • Paper 2403.15042 • 26 upvotes
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? • Paper 2403.14624 • 52 upvotes
PERL: Parameter Efficient Reinforcement Learning from Human Feedback • Paper 2403.10704 • 58 upvotes
DiPaCo: Distributed Path Composition • Paper 2403.10616 • 13 upvotes
RAFT: Adapting Language Model to Domain Specific RAG • Paper 2403.10131 • 69 upvotes