Files changed (1)
  1. README.md +11 -15

README.md CHANGED
@@ -13,15 +13,17 @@ metrics:
 
 # 🥳 Platypus-30B has arrived!
 
-Platypus-30B is an instruction fine-tuned model based on the LLaMA-30B transformer architecture and takes advantage of LoRA.
+Platypus-30B is an instruction fine-tuned model based on the LLaMA-30B transformer architecture.
 
 | Metric | Value |
 |-----------------------|-------|
-| MMLU (5-shot) | 65.4 |
+| MMLU (5-shot) | 64.2 |
 | ARC (25-shot) | 64.6 |
 | HellaSwag (10-shot) | 84.3 |
 | TruthfulQA (0-shot) | 45.8 |
-| Avg. | 65 |
+| Avg. | 64.7 |
+
+We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above.
 
 ## Model Details
 
@@ -58,17 +60,11 @@ The base LLaMA model is trained on various data, some of which may contain offen
 journal={arXiv preprint arXiv:2302.13971},
 year={2023}
 }
-@article{DBLP:journals/corr/abs-2106-09685,
-  author = {Edward J. Hu and
-            Yelong Shen and
-            Phillip Wallis and
-            Zeyuan Allen{-}Zhu and
-            Yuanzhi Li and
-            Shean Wang and
-            Weizhu Chen},
-  title = {LoRA: Low-Rank Adaptation of Large Language Models},
-  journal = {CoRR},
-  year = {2021},
-  url = {https://arxiv.org/abs/2106.09685},
+
+@article{hu2021lora,
+  title={LoRA: Low-Rank Adaptation of Large Language Models},
+  author={Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Chen, Weizhu},
+  journal={CoRR},
+  year={2021}
 }
 ```
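As a quick sanity check on the edited table: the new Avg. value is the arithmetic mean of the four benchmark scores (64.2, 64.6, 84.3, 45.8), rounded to one decimal. A minimal sketch:

```python
# Recompute the Avg. row of the updated benchmark table.
scores = {
    "MMLU (5-shot)": 64.2,
    "ARC (25-shot)": 64.6,
    "HellaSwag (10-shot)": 84.3,
    "TruthfulQA (0-shot)": 45.8,
}
average = round(sum(scores.values()) / len(scores), 1)
print(average)  # 64.7
```

This matches the `| Avg. | 64.7 |` row introduced by the diff.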
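The diff also adds a pointer to EleutherAI's lm-evaluation-harness. For readers who want to reproduce numbers like these, a hypothetical invocation might look like the sketch below; the model path is a placeholder, and the exact CLI and flag names vary between harness versions, so treat this as illustrative rather than the command the authors ran:

```shell
# Illustrative only: evaluate a local HF checkpoint on MMLU (5-shot).
# Flag names depend on the installed lm-evaluation-harness version.
pip install lm-eval
lm_eval --model hf \
  --model_args pretrained=/path/to/Platypus-30B \
  --tasks mmlu \
  --num_fewshot 5 \
  --batch_size 4
```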