nectec/Pathumma-llm-audio-1.0.0


PATTARA TIPAKSORN committed 24 days ago

Commit 4b56629 (1 parent: 0656ce9)

Update README.md

Files changed (1)

README.md +6 -1


@@ -63,7 +63,12 @@ with torch.no_grad():
 print(response[0])
 ```
 ## Evaluation Performance
-Additional details are required.
+| Model                        |  ASR-th CV18 th (WER↓)   | ASR-en CV18 En (WER↓)    |   ASR-en Librispeech En (WER↓) | ThaiSER Emotion (Acc↑, F1↑)|  ThaiSER Gender (Acc↑, F1↑)  |
+|:----------------------------:|:------------------------:|:------------------------:|:------------------------------:|:------------------:|:--------------------:|
+| Typhoon-Audio-Preview        | 13.26                    | 13.34 (partial result)   | 5.07 (partial result)          |    41.50, 33.48    |       96.20, 96.69   |
+| DIVA                         | 69.15 (partial result)   | 37.40                    | 49.06                          |    18.64, 8.16     |       47.50, 35.90   |
+| Gemini-1.5-Pro               | 16.49                    | 12.94                    | 25.83                          |    26.00, 18.26    |       79.66, 77.32   |
+| Pathumma-llm-audio-1.0.0     | 12.03                    | 12.20                    | 11.36                          |    42.30, 36.88    |       90.30, 92.07   |
 ## Limitations and Future Work
 At present, our model remains in the experimental research phase and is not yet fully suitable for practical use as an assistant. Future work will focus on upgrading the language model to a newer version, [Pathumma-llm-text-1.0.0](https://huggingface.co/nectec/Pathumma-llm-text-1.0.0), and on curating more refined and robust datasets to improve performance. We also aim to prioritize the safety and reliability of the model's outputs.
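
For readers unfamiliar with the WER↓ columns in the evaluation table, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. It is not the authors' evaluation script. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; Thai text in particular would need a word segmenter before scoring, and the README does not say which normalization was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: both strings are already normalized and can be split on
    whitespace. This is an illustrative helper, not part of the model repo.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"{100 * wer:.2f}")  # prints 33.33
```

The table reports WER as a percentage, so a value such as 12.03 corresponds to roughly 12 word errors per 100 reference words under this definition.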