Marx-3B-V2 / README.md
license: apache-2.0
datasets:
  - totally-not-an-llm/EverythingLM-data-V2-sharegpt
language:
  - en
library_name: transformers


This is OpenLLaMA 3B V2 fine-tuned on EverythingLM Data V2 (ShareGPT format) for 2 epochs.

Prompt template:

### HUMAN:
{prompt}

### RESPONSE:
<leave a newline for the model to answer>
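As a minimal sketch, the template above can be filled in programmatically and passed to `transformers`. The repository id `acrastt/Marx-3B-V2`, the helper names `build_prompt` and `generate`, and the generation settings are assumptions made for illustration, not something this card specifies.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the prompt template from this card.

    The string ends with a newline after "### RESPONSE:" so the model
    starts its answer on a fresh line.
    """
    return f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n"


def generate(
    user_message: str,
    model_id: str = "acrastt/Marx-3B-V2",  # assumed repository id
    max_new_tokens: int = 128,             # illustrative setting
) -> str:
    """Generate a reply with transformers (downloads the model on first use)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # use_fast=False follows the OpenLLaMA base model's tokenizer advice;
    # treat it as an assumption for this fine-tune (needs sentencepiece).
    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate("Explain what a quantized model is.")` should return only the text produced after `### RESPONSE:`; the exact output depends on the model weights and decoding settings.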

q4_1 GGML quant available here.
q4_1 GGUF quant available here.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 36.89 |
| ARC (25-shot) | 44.03 |
| HellaSwag (10-shot) | 72.92 |
| MMLU (5-shot) | 27.84 |
| TruthfulQA (0-shot) | 39.92 |
| Winogrande (5-shot) | 66.54 |
| GSM8K (5-shot) | 1.21 |
| DROP (3-shot) | 5.8 |
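The Avg. value is consistent with an unweighted mean of the seven benchmark scores. The short check below assumes that is how it was computed; the variable names are illustrative.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 44.03,
    "HellaSwag (10-shot)": 72.92,
    "MMLU (5-shot)": 27.84,
    "TruthfulQA (0-shot)": 39.92,
    "Winogrande (5-shot)": 66.54,
    "GSM8K (5-shot)": 1.21,
    "DROP (3-shot)": 5.8,
}

# Unweighted mean across all benchmarks (assumed averaging method).
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```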