Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
Abstract
Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they rely on very large language models (around 175B parameters) that are also proprietary and non-public. Here we explore the application of such techniques to language models that are much smaller (around 10B--40B parameters) and have permissive licenses. We find the Self-Instruct approach to be less effective at these sizes and propose new ICL methods that draw on two main ideas: (a) categorization and simplification of the ICL templates to make prompt learning easier for the LM, and (b) ensembling over multiple LM outputs to help select high-quality synthetic examples. Our algorithm leverages the 175 Self-Instruct seed tasks and employs separate pipelines for instructions that require an input and instructions that do not. Empirical investigations with different LMs show that: (1) our proposed method yields higher-quality instruction-tuning data than Self-Instruct, (2) it improves the performance of both vanilla and instruction-tuned LMs by significant margins, and (3) smaller instruction-tuned LMs generate more useful outputs than their larger un-tuned counterparts. Our codebase is available at https://github.com/IBM/ensemble-instruct.
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Text Data Augmentation in Low-Resource Settings via Fine-Tuning of Large Language Models (2023)
- ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer (2023)
- Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models (2023)
- PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation (2023)
- Tuna: Instruction Tuning using Feedback from Large Language Models (2023)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.