---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{}
---
# Model Card for Model ID
<!-- Provide a quick summary of what the model is/does. -->
2024.4.4 Update
This model performs sentiment analysis: it classifies sentences from corporate news as positive, neutral, or negative.
It is trained to score "important news", as defined in the paper cited below, so its results may be less accurate for less important news.
To judge the importance of a news title, use https://huggingface.co/kwoncho/ko-sroberta-multitask-informative.
It can be used as a Korean-language sentiment analysis model for the finance, management, and accounting fields.
Examples:
"삼성전자의 부채가 증가하고 있습니다." ("Samsung Electronics' debt is increasing.") --> Neutral. An increase in debt is not negative in itself.
"경영전략의 실패로 삼성전자의 부채가 증가하고 있습니다." ("Due to the failure of its management strategy, Samsung Electronics' debt is increasing.") --> Negative. An increase in debt caused by a failure is considered negative.
Hyun Ji-won, Lee Jun-il, and Cho Hyun-kwon. "A Study on Sentiment Classification of Corporate-related News Articles Using KoBERT." Accounting Research 47.4 (2022): 33-54.
We have further developed the model proposed in the above paper and released it through Hugging Face. If you use it for research, please cite the paper.
This model was fine-tuned from https://huggingface.co/jhgan/ko-sroberta-multitask.
For usage code, see the link below:
Google Colab: https://colab.research.google.com/drive/1ORzKUr94cPyc5jaRCAngbclm4Qb4DtdG
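
The Colab notebook above is the authoritative usage reference. As a rough local sketch, the snippet below assumes the model exposes a sequence-classification head that the `transformers` `pipeline` API can load. The repository ID in the commented example is a placeholder (this card does not state it), the helper names `load_classifier` and `classify` are illustrative, and the label strings returned depend on the model's own configuration.

```python
from typing import Callable, Dict, List, Tuple


def load_classifier(model_id: str) -> Callable[[List[str]], List[Dict]]:
    """Build a text-classification pipeline for the given Hub repository.

    Assumption: the checkpoint has a sequence-classification head.
    Requires `pip install transformers torch` and network access.
    """
    from transformers import pipeline  # imported lazily so the module loads without it

    return pipeline("text-classification", model=model_id)


def classify(
    sentences: List[str],
    classifier: Callable[[List[str]], List[Dict]],
) -> List[Tuple[str, str, float]]:
    """Return (sentence, label, score) for each input sentence."""
    results = classifier(sentences)
    return [(s, r["label"], float(r["score"])) for s, r in zip(sentences, results)]


# Example (replace the placeholder with the real repository ID):
# clf = load_classifier("<namespace>/<model-name>")
# for row in classify(["삼성전자의 부채가 증가하고 있습니다."], clf):
#     print(row)
```

Because the model targets important news, it may make sense to filter titles with the importance model first and pass only the retained sentences to `classify`.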
The current evaluation results of the model are as follows:
{'eval_loss': 0.7330707907676697, 'eval_f1': 0.8689251403360293, 'eval_runtime': 0.464, 'eval_samples_per_second': 2047.32, 'eval_steps_per_second': 17.241, 'epoch': 33.33}
Accuracy is higher than the 85.7% reported in the paper, but the improvement is modest.
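
To put the numbers above in context, the size of the evaluation set can be estimated from the reported runtime and throughput. This is a back-of-the-envelope sketch: the runtime is rounded to three decimals in the log, so the derived counts are approximate, and the variable names are illustrative.

```python
# Metrics copied from the evaluation output reported in this card.
eval_metrics = {
    "eval_loss": 0.7330707907676697,
    "eval_f1": 0.8689251403360293,
    "eval_runtime": 0.464,
    "eval_samples_per_second": 2047.32,
    "eval_steps_per_second": 17.241,
    "epoch": 33.33,
}

# samples = runtime (s) * samples per second; steps = runtime (s) * steps per second
n_samples = eval_metrics["eval_runtime"] * eval_metrics["eval_samples_per_second"]
n_steps = eval_metrics["eval_runtime"] * eval_metrics["eval_steps_per_second"]

print(f"~{round(n_samples)} samples in ~{round(n_steps)} steps")
# prints "~950 samples in ~8 steps"
```

With an evaluation set of roughly 950 sentences, a gain of about one percentage point corresponds to only a handful of sentences, which is consistent with describing the improvement as modest.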
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->