license: llama2
datasets:
  - garage-bAInd/Open-Platypus
language:
  - en


Nous-Hermes-Platypus2-13B-QLoRA-0.80-epoch

Nous-Hermes-Platypus2-13B-QLoRA-0.80-epoch is a merge of NousResearch/Nous-Hermes-Llama2-13b and Platypus2-13B-QLoRA-0.80-epoch.

Evaluation Results (Open LLM Leaderboard)

| Metric | Value |
|---|---|
| Avg. | 62.74 |
| ARC (25-shot) | 59.9 |
| HellaSwag (10-shot) | 83.29 |
| MMLU (5-shot) | 56.69 |
| TruthfulQA (0-shot) | 51.08 |

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 52.89 |
| ARC (25-shot) | 59.9 |
| HellaSwag (10-shot) | 83.29 |
| MMLU (5-shot) | 56.69 |
| TruthfulQA (0-shot) | 51.08 |
| Winogrande (5-shot) | 75.22 |
| GSM8K (5-shot) | 1.44 |
| DROP (3-shot) | 42.65 |
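
The two "Avg." rows differ because they appear to average over different benchmark sets: four benchmarks in the first table, seven in the second. The sketch below recomputes both from the scores listed above. The variable and function names are illustrative only, and the assumption that "Avg." is an unweighted mean is mine, not something the card states.

```python
from statistics import mean

# Scores copied from the tables in this card.
scores = {
    "ARC (25-shot)": 59.9,
    "HellaSwag (10-shot)": 83.29,
    "MMLU (5-shot)": 56.69,
    "TruthfulQA (0-shot)": 51.08,
    "Winogrande (5-shot)": 75.22,
    "GSM8K (5-shot)": 1.44,
    "DROP (3-shot)": 42.65,
}

# The first table lists only these four benchmarks.
FOUR_BENCHMARKS = [
    "ARC (25-shot)",
    "HellaSwag (10-shot)",
    "MMLU (5-shot)",
    "TruthfulQA (0-shot)",
]


def average(names):
    """Unweighted mean of the listed benchmark scores (assumed averaging rule)."""
    return mean(scores[name] for name in names)


avg_four = average(FOUR_BENCHMARKS)
avg_seven = average(list(scores))

print(f"4-benchmark average: {avg_four:.4f}")
print(f"7-benchmark average: {avg_seven:.4f}")
```

Under that assumption the four-benchmark mean matches the reported 62.74. The seven-benchmark mean of the rounded scores comes out just under 52.90, slightly above the reported 52.89; one plausible explanation is that the leaderboard averages unrounded scores, but the card does not confirm this.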