openthaigpt/openthaigpt1.5-14b-instruct

@@ -32,27 +32,26 @@ https://github.com/OpenThaiGPT/openthaigpt1.5_api_examples
 ## Benchmark on [OpenThaiGPT Eval](https://huggingface.co/datasets/openthaigpt/openthaigpt_eval)
 ** Please take a look at ``openthaigpt/openthaigpt1.5-14b-instruct`` for this model's evaluation result.
-| **Exam names**                 | **openthaigpt/openthaigpt1.5-7b** | **openthaigpt/openthaigpt1.5-14b** | **openthaigpt/openthaigpt1.5-72b** |
-|--------------------------------|-----------------------------------|------------------------------------|------------------------------------|
-| **01_a_level**                 | 60.00%                            | 65.00%                             | 76.67%                             |
-| **02_tgat**                    | 36.00%                            | 50.00%                             | 46.00%                             |
-| **03_tpat1**                   | 57.50%                            | 52.50%                             | 55.00%                             |
-| **04_investment_consult**      | 76.00%                            | 72.00%                             | 72.00%                             |
-| **05_facebook_beleble_th_200** | 81.00%                            | 87.00%                             | 90.00%                             |
-| **06_xcopa_th_200**            | 81.00%                            | 86.50%                             | 90.50%                             |
-| **07_xnli2.0_th_200**          | 54.50%                            | 64.50%                             | 70.50%                             |
-| **08_onet_m3_thai**            | 64.00%                            | 84.00%                             | 84.00%                             |
-| **09_onet_m3_social**          | 80.00%                            | 90.00%                             | 95.00%                             |
-| **10_onet_m3_math**            | 31.25%                            | 12.50%                             | 37.50%                             |
-| **11_onet_m3_science**         | 46.15%                            | 53.85%                             | 73.08%                             |
-| **12_onet_m3_english**         | 83.33%                            | 93.33%                             | 96.67%                             |
-| **13_onet_m6_thai**            | 53.85%                            | 56.92%                             | 56.92%                             |
-| **14_onet_m6_math**            | 29.41%                            | 41.18%                             | 41.18%                             |
-| **15_onet_m6_social**          | 58.18%                            | 61.82%                             | 65.45%                             |
-| **16_onet_m6_science**         | 57.14%                            | 57.14%                             | 67.86%                             |
-| **17_onet_m6_english**         | 80.77%                            | 78.85%                             | 90.38%                             |
-| **Micro Average**              | 65.78%                            | <b style="color:blue">71.51%</b>                             | 76.73%                             |
 Thai language multiple choice exams, Test on unseen test set, Zero-shot learning. Benchmark source code and exams information: https://github.com/OpenThaiGPT/openthaigpt_eval

 ## Benchmark on [OpenThaiGPT Eval](https://huggingface.co/datasets/openthaigpt/openthaigpt_eval)
 ** Please take a look at ``openthaigpt/openthaigpt1.5-14b-instruct`` for this model's evaluation result.
+| **Exam names**                 | **scb10x/llama-3-typhoon-v1.5x-70b-instruct** | **Qwen/Qwen2.5-14B-Instruct** | **openthaigpt/openthaigpt1.5-14b** | **openthaigpt/openthaigpt1.5-72b** |
+|--------------------------------|-----------------------------------------------|-------------------------------|------------------------------------|------------------------------------|
+| **01_a_level**                 | 59.17%                                        | 61.67%                        | 65.00%                             | 76.67%                             |
+| **02_tgat**                    | 46.00%                                        | 44.00%                        | 50.00%                             | 46.00%                             |
+| **03_tpat1**                   | 52.50%                                        | 60.00%                        | 52.50%                             | 55.00%                             |
+| **04_investment_consult**      | 60.00%                                        | 76.00%                        | 72.00%                             | 72.00%                             |
+| **05_facebook_beleble_th_200** | 87.50%                                        | 84.50%                        | 87.00%                             | 90.00%                             |
+| **06_xcopa_th_200**            | 84.50%                                        | 85.00%                        | 86.50%                             | 90.50%                             |
+| **07_xnli2.0_th_200**          | 62.50%                                        | 69.50%                        | 64.50%                             | 70.50%                             |
+| **08_onet_m3_thai**            | 76.00%                                        | 76.00%                        | 84.00%                             | 84.00%                             |
+| **09_onet_m3_social**          | 95.00%                                        | 90.00%                        | 90.00%                             | 95.00%                             |
+| **10_onet_m3_math**            | 43.75%                                        | 43.75%                        | 12.50%                             | 37.50%                             |
+| **11_onet_m3_science**         | 53.85%                                        | 50.00%                        | 53.85%                             | 73.08%                             |
+| **12_onet_m3_english**         | 93.33%                                        | 93.33%                        | 93.33%                             | 96.67%                             |
+| **13_onet_m6_thai**            | 55.38%                                        | 52.31%                        | 56.92%                             | 56.92%                             |
+| **14_onet_m6_math**            | 41.18%                                        | 23.53%                        | 41.18%                             | 41.18%                             |
+| **15_onet_m6_social**          | 67.27%                                        | 60.00%                        | 61.82%                             | 65.45%                             |
+| **16_onet_m6_science**         | 50.00%                                        | 50.00%                        | 57.14%                             | 67.86%                             |
+| **17_onet_m6_english**         | 73.08%                                        | 82.69%                        | 78.85%                             | 90.38%                             |
+| **Micro Average**              | 69.97%                                        | 71.00%                        | <b style="color:blue">71.51%</b>                             | 76.73%                             |
 Thai language multiple choice exams, Test on unseen test set, Zero-shot learning. Benchmark source code and exams information: https://github.com/OpenThaiGPT/openthaigpt_eval
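
The "Micro Average" row is presumably computed by pooling all questions across exams rather than averaging the per-exam percentages, which would explain why small exams such as `10_onet_m3_math` barely move it. The sketch below illustrates that difference. The function names and the `(correct, total)` counts are illustrative assumptions, not values taken from the benchmark repository.

```python
from typing import List, Tuple

# Each entry is (correct_answers, total_questions) for one exam.
ExamResult = Tuple[int, int]


def micro_average(results: List[ExamResult]) -> float:
    """Pool every question: total correct divided by total questions."""
    total_correct = sum(correct for correct, _ in results)
    total_questions = sum(total for _, total in results)
    return total_correct / total_questions


def macro_average(results: List[ExamResult]) -> float:
    """Average the per-exam accuracies, weighting every exam equally."""
    return sum(correct / total for correct, total in results) / len(results)


# Hypothetical counts chosen only for illustration: one large exam
# with a high score and one small exam with a low score.
exams = [(174, 200), (2, 16)]

print(f"micro: {micro_average(exams):.2%}")
print(f"macro: {macro_average(exams):.2%}")
```

With these assumed counts the micro average stays close to the large exam's score, while the macro average is pulled down sharply by the small one.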