Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2312.06585

Synthetic Data Generation

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Paper • 2305.07759 • Published May 12, 2023 • 33
Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28 • 94

Synthetic Data Generation

A curated list of papers focusing on synthetic data generation

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20 • 46
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 29
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 143
ReFT: Reasoning with Reinforced Fine-Tuning

Paper • 2401.08967 • Published Jan 17 • 27
Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16 • 20
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10 • 64

LLM Self-training

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28
TinyGSM: achieving >80% on GSM8k with small language models

Paper • 2312.09241 • Published Dec 14, 2023 • 37
SciPhi/AgentSearch-V1

Viewer • Updated Jan 14 • 70k • 2.48k • 85
Data Filtering Networks

Paper • 2309.17425 • Published Sep 29, 2023 • 6

LLMs SELF-IMPROVEMENT

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28
Enable Language Models to Implicitly Learn Self-Improvement From Data

Paper • 2310.00898 • Published Oct 2, 2023 • 23
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 35
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 64

cool foundation ml

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28

Papers of interest

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 28

Candidate papers to read in the H4 journal club

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs

Paper • 2210.14986 • Published Oct 26, 2022 • 5
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2

Paper • 2311.10702 • Published Nov 17, 2023 • 18
Large Language Models as Optimizers

Paper • 2309.03409 • Published Sep 7, 2023 • 75
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting

Paper • 2309.04269 • Published Sep 8, 2023 • 32

Research Papers

A collection of papers focused on LLM

Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 70
ToolTalk: Evaluating Tool-Usage in a Conversational Setting

Paper • 2311.10775 • Published Nov 15, 2023 • 7
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning

Paper • 2311.11077 • Published Nov 18, 2023 • 24
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

Paper • 2311.11501 • Published Nov 20, 2023 • 33

Previous
1
2
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs