---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{}
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->
2024.4.4 Update

This is a sentiment analysis model that classifies sentences from corporate news as positive, neutral, or negative.

The model was trained to provide sentiment for "important news", as described in the paper cited below, so its results may be inaccurate for less important news.

To judge the importance of a news title, use the companion model at https://huggingface.co/kwoncho/ko-sroberta-multitask-informative
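The two-step use described above (screen a title for importance first, then score sentiment only if it passes) can be sketched as follows. This is a minimal sketch, not the authors' code: `sentiment_if_important`, `is_important`, and `sentiment` are hypothetical names, and both models are passed in as plain callables because this card does not document either model's label set.

```python
from typing import Callable, List, Optional, Tuple


def sentiment_if_important(
    titles: List[str],
    is_important: Callable[[str], bool],
    sentiment: Callable[[str], str],
) -> List[Tuple[str, Optional[str]]]:
    """Score sentiment only for titles the importance check accepts.

    `is_important` and `sentiment` are hypothetical wrappers around the
    importance model and this sentiment model; how each maps its raw
    output to a bool or a label is an assumption left to the caller.
    """
    results: List[Tuple[str, Optional[str]]] = []
    for title in titles:
        if is_important(title):
            results.append((title, sentiment(title)))
        else:
            # Less important news is skipped rather than given an unreliable score.
            results.append((title, None))
    return results
```

Returning `None` for skipped titles keeps the output aligned with the input, so a caller can tell "not scored" apart from "neutral".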

It can be used as a Korean-language sentiment analysis model for finance, management, and accounting.

Examples:

"Samsung Electronics' debt is increasing." --> Neutral. An increase in debt is not negative in itself.

"Due to a failed management strategy, Samsung Electronics' debt is increasing." --> Negative. A debt increase caused by failure is considered negative.

Hyun Ji-won, Lee Jun-il, and Cho Hyun-kwon. "A Study on Sentiment Classification of Corporate-related News Articles Using KoBERT." Accounting Research 47.4 (2022): 33-54.

We have further developed the model proposed in the above paper and released it on Hugging Face. If you use it for research, please cite the paper.

This model was fine-tuned from https://huggingface.co/jhgan/ko-sroberta-multitask.

For the usage code, refer to the link below:

Google Colab: https://colab.research.google.com/drive/1ORzKUr94cPyc5jaRCAngbclm4Qb4DtdG
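The notebook above is the reference for usage. As a rough sketch only, assuming the checkpoint loads as a standard sequence-classification model through the `transformers` `pipeline` API, inference might look like the code below. `SENTIMENT_MODEL_ID` is a hypothetical placeholder (this card does not state the repository id), `build_classifier` and `label_sentences` are illustrative names, and the label strings the model actually emits are not documented here. The sample inputs are the Korean originals of the two example sentences on this card.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical placeholder: replace with this model's actual repository id.
SENTIMENT_MODEL_ID = "<this-model-repo-id>"

# Korean originals of the two example sentences given on this card.
EXAMPLES = [
    "삼성전자의 부채가 증가하고 있습니다.",
    "경영전략의 실패로 삼성전자의 부채가 증가하고 있습니다.",
]

Classifier = Callable[[List[str]], List[Dict[str, Any]]]


def build_classifier(model_id: str) -> Classifier:
    """Create a text-classification pipeline (imports `transformers` lazily)."""
    from transformers import pipeline

    pipe = pipeline("text-classification", model=model_id)
    return lambda sentences: pipe(sentences)


def label_sentences(
    sentences: List[str], classifier: Classifier
) -> List[Tuple[str, str, float]]:
    """Pair each sentence with the top label and score from the classifier."""
    results = classifier(sentences)
    return [
        (sentence, result["label"], float(result["score"]))
        for sentence, result in zip(sentences, results)
    ]
```

With `transformers` installed and the real repository id substituted, `label_sentences(EXAMPLES, build_classifier(SENTIMENT_MODEL_ID))` would return one `(sentence, label, score)` tuple per input. The raw labels may be generic names such as `LABEL_0`, depending on the checkpoint's configuration.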

The current evaluation results of the model are as follows:

{'eval_loss': 0.7330707907676697, 'eval_f1': 0.8689251403360293, 'eval_runtime': 0.464, 'eval_samples_per_second': 2047.32, 'eval_steps_per_second': 17.241, 'epoch': 33.33}

Accuracy has improved over the 85.7% reported in the paper, but the gain is modest.
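The evaluation figures above also imply the approximate size of the evaluation set and the size of the gain. The short calculation below is a sketch under two assumptions: that the reported `eval_f1` is comparable to the paper's 85.7% figure, and that the rounded `eval_runtime` is precise enough for an estimate. The name `PAPER_SCORE_PCT` is illustrative.

```python
# Metrics copied from the evaluation output reported on this card.
metrics = {
    "eval_loss": 0.7330707907676697,
    "eval_f1": 0.8689251403360293,
    "eval_runtime": 0.464,
    "eval_samples_per_second": 2047.32,
    "eval_steps_per_second": 17.241,
    "epoch": 33.33,
}

# Score reported in the paper, in percent (illustrative constant name).
PAPER_SCORE_PCT = 85.7

# Throughput times runtime recovers the approximate sample and step counts.
eval_samples = metrics["eval_runtime"] * metrics["eval_samples_per_second"]
eval_steps = metrics["eval_runtime"] * metrics["eval_steps_per_second"]

# Gain in percentage points, treating eval_f1 as comparable to the paper's figure.
gain_pp = metrics["eval_f1"] * 100 - PAPER_SCORE_PCT

print(f"approx. evaluation samples: {eval_samples:.0f}")
print(f"approx. evaluation steps:   {eval_steps:.0f}")
print(f"gain over the paper:        {gain_pp:.2f} percentage points")
```

Under those assumptions, the evaluation covered roughly 950 samples in 8 batches, and the gain is about 1.2 percentage points.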

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->