Muennighoff committed on
Commit • f6558ef • 1 Parent(s): 3746e8e
Update README.md

README.md CHANGED
@@ -22,50 +22,6 @@ base_model: allenai/OLMoE-1B-7B-0924-SFT
 - Paper:
 - Logs: https://github.com/allenai/OLMoE/blob/main/logs/olmoe-dpo-logs.txt
 
-### Evaluation Summary
-
-| Task (→) | MMLU | GSM8k | BBH | Human-Eval | Alpaca-Eval 1.0 | XSTest | IFEval | Avg |
-|---------------|------|-------|------|------------|-----------------|--------|--------|------|
-| **Setup (→)** | 0-shot | 8-shot CoT | 3-shot | 0-shot | 0-shot | 0-shot | 0-shot | |
-| **Metric (→)** | EM | EM | EM | Pass@10 | %win | F1 | Loose Acc | |
-| | | | | | | | | |
-| OLMo-1B (0724) | 25.0 | 7.0 | 22.5 | 16.0 | - | 67.6 | 20.5 | - |
-| +SFT | 36.0 | 12.5 | 27.2 | 21.2 | 41.5 | 81.9 | 26.1 | 35.9 |
-| +DPO | 36.7 | 12.5 | 30.6 | 22.0 | 50.9 | 79.8 | 24.2 | 37.4 |
-| OLMo-7B (0724) | 50.8 | 32.5 | 36.9 | 32.3 | - | 80.8 | 19.6 | - |
-| +SFT | 54.2 | 25.0 | 35.7 | 38.5 | 70.9 | 86.1 | 39.7 | 49.3 |
-| +DPO | 52.8 | 9.0 | 16.6 | 35.0 | 83.5 | **87.5** | 37.9 | 49.1 |
-| JetMoE-2B-9B | 45.6 | 43.0 | 37.2 | 54.6 | - | 68.2 | 20.0 | - |
-| +SFT | 46.1 | 53.5 | 35.6 | 64.8 | 69.3 | 55.6 | 30.5 | 50.4 |
-| DeepSeek-3B-16B | 37.7 | 18.5 | 39.4 | 48.3 | - | 65.9 | 13.5 | - |
-| +Chat | 48.5 | 46.5 | **40.8** | **70.1** | 74.8 | 85.6 | 32.3 | 57.0 |
-| Qwen1.5-3B-14B | **60.4** | 13.5 | 27.2 | 60.2 | - | 73.4 | 20.9 | - |
-| +Chat | 58.9 | **55.5** | 21.3 | 59.7 | 83.9 | 85.6 | 36.2 | 57.3 |
-| **OLMoE (This Model)** | 49.8 | 3.0 | 33.6 | 22.4 | - | 59.7 | 16.6 | - |
-| **+SFT** | 51.4 | 40.5 | 38.0 | 51.6 | 69.2 | 84.1 | 43.3 | 54.0 |
-| **+DPO** | 51.9 | 45.5 | 37.0 | 54.8 | **84.0** | 82.6 | **48.1** | **57.7** |
-
-### Artifacts
-
-- **Pretraining**
-  - [Checkpoints](https://hf.co/allenai/OLMoE-1B-7B-0924)
-  - [Code](https://github.com/allenai/OLMo/tree/Muennighoff/MoE): Built on top of OLMo models.
-  - [Data](https://huggingface.co/datasets/allenai/OLMoE-mix-0924): Mix of DCLM Baseline with some components of Dolma.
-  - Logs: *coming soon*
-
-- **SFT (Supervised Fine-Tuning)**
-  - [Checkpoints](https://huggingface.co/allenai/OLMoE-1B-7B-0924-SFT): With and without load balancing.
-  - [Code](https://github.com/allenai/open-instruct/tree/olmoe-sft)
-  - [Data](https://hf.co/datasets/allenai/tulu-v3.1-mix-preview-4096-OLMoE): Preview of Tulu 3 post-training recipe.
-  - [Logs](https://github.com/allenai/OLMoE/blob/main/logs/olmoe-sft-logs.txt)
-
-- **DPO/KTO (Direct Preference Optimization/Kahneman-Tversky Optimization)**
-  - [Checkpoints](https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct)
-  - [Preference Data](https://hf.co/datasets/allenai/ultrafeedback_binarized_cleaned)
-  - [DPO code](https://github.com/allenai/open-instruct/tree/olmoe-sft), [KTO code](https://github.com/Muennighoff/kto/blob/master/kto.py)
-  - [Logs](https://github.com/allenai/OLMoE/blob/main/logs/olmoe-dpo-logs.txt)
-
-
 # Use
 
 Install `transformers` **from source** until a release after [this PR](https://github.com/huggingface/transformers/pull/32406) & `torch` and run:
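The `# Use` context kept above leads into a runnable snippet further down the README that this commit does not touch. For orientation, here is a minimal sketch of what loading and prompting the instruct model typically looks like with the standard `transformers` chat API; the prompt, dtype, and generation settings are illustrative assumptions, not the README's exact snippet.

```python
# Minimal sketch (illustrative, not the README's exact snippet): load the
# instruct model and run one chat-formatted generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allenai/OLMoE-1B-7B-0924-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16).to(device)

# The DPO/instruct model expects chat-formatted prompts.
messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

output = model.generate(input_ids, max_new_tokens=64, do_sample=False)
# Decode only the newly generated continuation.
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```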
@@ -99,6 +55,29 @@ Branches:
 - `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/allenai/OLMoE-1B-7B-0924-SFT which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/allenai/OLMoE-1B-7B-0924)
 - `kto`: Ablation using KTO instead of DPO. This branch is the checkpoint after 5,000 steps with the RMS optimizer. The other `kto*` branches correspond to the other checkpoints mentioned in the paper.
 
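The branches listed just above are Git revisions of this model repo, so an ablation checkpoint such as `kto` or `non-annealed` can be pulled by passing the branch name as the standard `revision` argument of `from_pretrained`. A minimal sketch under that assumption (not a snippet from the README):

```python
# Minimal sketch (illustrative, not from the README): load an ablation branch
# of this repo by passing its name as `revision`.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allenai/OLMoE-1B-7B-0924-Instruct"
BRANCH = "kto"  # or "non-annealed", or another `kto*` branch listed above

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=BRANCH)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=BRANCH)
```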
+# Evaluation Snapshot
+
+| Task (→) | MMLU | GSM8k | BBH | Human-Eval | Alpaca-Eval 1.0 | XSTest | IFEval | Avg |
+|---------------|------|-------|------|------------|-----------------|--------|--------|------|
+| **Setup (→)** | 0-shot | 8-shot CoT | 3-shot | 0-shot | 0-shot | 0-shot | 0-shot | |
+| **Metric (→)** | EM | EM | EM | Pass@10 | %win | F1 | Loose Acc | |
+| | | | | | | | | |
+| OLMo-1B (0724) | 25.0 | 7.0 | 22.5 | 16.0 | - | 67.6 | 20.5 | - |
+| +SFT | 36.0 | 12.5 | 27.2 | 21.2 | 41.5 | 81.9 | 26.1 | 35.9 |
+| +DPO | 36.7 | 12.5 | 30.6 | 22.0 | 50.9 | 79.8 | 24.2 | 37.4 |
+| OLMo-7B (0724) | 50.8 | 32.5 | 36.9 | 32.3 | - | 80.8 | 19.6 | - |
+| +SFT | 54.2 | 25.0 | 35.7 | 38.5 | 70.9 | 86.1 | 39.7 | 49.3 |
+| +DPO | 52.8 | 9.0 | 16.6 | 35.0 | 83.5 | **87.5** | 37.9 | 49.1 |
+| JetMoE-2B-9B | 45.6 | 43.0 | 37.2 | 54.6 | - | 68.2 | 20.0 | - |
+| +SFT | 46.1 | 53.5 | 35.6 | 64.8 | 69.3 | 55.6 | 30.5 | 50.4 |
+| DeepSeek-3B-16B | 37.7 | 18.5 | 39.4 | 48.3 | - | 65.9 | 13.5 | - |
+| +Chat | 48.5 | 46.5 | **40.8** | **70.1** | 74.8 | 85.6 | 32.3 | 57.0 |
+| Qwen1.5-3B-14B | **60.4** | 13.5 | 27.2 | 60.2 | - | 73.4 | 20.9 | - |
+| +Chat | 58.9 | **55.5** | 21.3 | 59.7 | 83.9 | 85.6 | 36.2 | 57.3 |
+| **OLMoE (This Model)** | 49.8 | 3.0 | 33.6 | 22.4 | - | 59.7 | 16.6 | - |
+| **+SFT** | 51.4 | 40.5 | 38.0 | 51.6 | 69.2 | 84.1 | 43.3 | 54.0 |
+| **+DPO** | 51.9 | 45.5 | 37.0 | 54.8 | **84.0** | 82.6 | **48.1** | **57.7** |
+
 # Citation
 
 ```bibtex