laineyyy committed on
Commit
c9757cd
1 Parent(s): 8b4fabc

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -54,10 +54,10 @@ We relied on the popular MTBench benchmark to evaluate multi-turn performance.

  Since MTBench is an English only benchmark, we also release this fork of [MTBench Finnish](https://github.com/LumiOpen/FastChat/tree/main/fastchat/llm_judge) with multilingual support and machine translated Finnish prompts. Our scores for both benchmarks follow.

- | Eval | Score |
- | :-------------- | :----: |
- | MTBench | 5.93 |
- | MTBench Finnish | 5.90 |
+ | Eval | Overall | Coding | Extraction | Humanities | Math | Reasoning | Roleplay | STEM | Writing |
+ | :---- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | ----: |
+ | MTBench English | 6.16 | 3.65 | 6.55 | 9.6 | 2.25 | 4.25 | 7.25 | 7.42 | 8.37 |
+ | MTBench Finnish | 5.73 | 3.05 | 6.05 | 9.6 | 1.25 | 3.65 | 7.0 | 7.65 | 7.6 |


  ## License
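
Review note: the added table rows can be sanity-checked with a little arithmetic. The sketch below assumes that the Overall column is the unweighted mean of the eight category columns; the commit does not say how Overall was computed, so this is a guess, not the authors' method. The helper name `category_mean` is hypothetical, and the numbers are copied from the `+` rows of the diff.

```python
# Category scores copied from the added table rows, in column order:
# Coding, Extraction, Humanities, Math, Reasoning, Roleplay, STEM, Writing.
scores = {
    "MTBench English": [3.65, 6.55, 9.6, 2.25, 4.25, 7.25, 7.42, 8.37],
    "MTBench Finnish": [3.05, 6.05, 9.6, 1.25, 3.65, 7.0, 7.65, 7.6],
}

# Overall values as reported in the same rows.
reported_overall = {"MTBench English": 6.16, "MTBench Finnish": 5.73}


def category_mean(values):
    """Unweighted mean of per-category scores (assumed aggregation)."""
    return sum(values) / len(values)


for name, values in scores.items():
    mean = category_mean(values)
    gap = abs(mean - reported_overall[name])
    print(f"{name}: category mean {mean:.4f}, reported {reported_overall[name]}, gap {gap:.4f}")
```

Under that assumption, both reported Overall values sit within 0.01 of the category means, so the new rows look internally consistent. The small remaining gap may come from rounding or truncation of the per-category values.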