--- title: README emoji: 🐦 colorFrom: pink colorTo: indigo sdk: static pinned: false --- Hi, I am a magpie 🐦! πŸ•ΈοΈ **Project Website**: [https://magpie-align.github.io/](https://magpie-align.github.io/) πŸ“„ **Technical Report**: [https://arxiv.org/abs/2406.08464](https://arxiv.org/abs/2406.08464) πŸ€— **HF Paper Page**: [https://huggingface.co/papers/2406.08464](https://huggingface.co/papers/2406.08464) 😬 **Codes**: [https://github.com/magpie-align/magpie](https://github.com/magpie-align/magpie) You can try the Magpie demo [πŸ€— here](https://huggingface.co/spaces/davanstrien/magpie) to generate instruction-response pairs. Thanks a lot for the quick implementation from @davanstrien! ## Dataset Navigation 🧭 ### [**Meta Llama 3**](https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6) |Model Name | Dataset | Type | Description | |-------------|:-------|:-------|:-------| | [Llama 3 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | [Magpie-Pro-1M](https://huggingface.co/datasets/Magpie-Align/Llama-3-Magpie-Pro-1M-v0.1) | SFT | 1M Raw conversations built with Meta Llama 3 70B. | [Llama 3 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | [Magpie-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-300K-Filtered) | SFT | Apply a filter and select 300K high quality conversations. | [Llama 3 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | [Magpie-Pro-MT-300K](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-MT-300K-v0.1) | SFT | Select 300K difficult questions and extend to multi-turn conversations. | [Llama 3 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | [Magpie-Air-3M](https://huggingface.co/datasets/Magpie-Align/Llama-3-Magpie-Air-3M-v0.1) | SFT | 3M Raw conversations built with Meta Llama 3 8B. | [Llama 3 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | [Magpie-Air-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Air-300K-Filtered) | SFT | Apply a filter and select 300K high quality data. | [Llama 3 8B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | [Magpie-Air-MT-300K](https://huggingface.co/datasets/Magpie-Align/Magpie-Air-MT-300K-v0.1) | SFT | Select 300K difficult questions and extend to multi-turn conversations. ### [**Qwen2**](https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f) |Model Name | Dataset | Type | Description | |-------------|:-------|:-------|:-------| | [Qwen2 72B Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct) | [Magpie-Qwen2-Pro-1M](https://huggingface.co/datasets/Magpie-Align/Magpie-Qwen2-Pro-1M-v0.1) | SFT | 1M Raw conversations built with Qwen2 72B Instruct. | [Qwen2 72B Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct) | [Magpie-Qwen2-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered) | SFT | Apply a filter and select 300K high quality conversations. | [Qwen2 72B Instruct](https://huggingface.co/Qwen/Qwen2-72B-Instruct) | [Magpie-Qwen2-Pro-200K-Chinese](https://huggingface.co/datasets/Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese) | SFT | Apply a filter and select 200K high quality Chinese conversations. | [Qwen2 7B Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) | [Magpie-Qwen2-Air-3M](https://huggingface.co/datasets/Magpie-Align/Magpie-Qwen2-Air-3M-v0.1) | SFT | 3M Raw conversations built with Qwen2 7B Instruct. | [Qwen2 7B Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) | [Magpie-Qwen2-Air-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Qwen-Air-300K-Filtered) | SFT | Apply a filter and select 300K high quality conversations. ### [**Phi-3**](https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3) |Model Name | Dataset | Type | Description | |-------------|:-------|:-------|:-------| | [Phi-3 Medium Instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) | [Magpie-Phi3-Pro-1M](https://huggingface.co/datasets/Magpie-Align/Magpie-Phi3-Pro-1M-v0.1) | SFT | 1M Raw conversations built with Phi-3 Medium Instruct. | [Phi-3 Medium Instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) | [Magpie-Phi3-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Phi3-Pro-300K-Filtered) | SFT | Apply a filter and select 300K high quality conversations.