Update README.md
README.md
CHANGED
@@ -11,6 +11,9 @@ metrics:
 model-index:
 - name: democracy-sentiment-analysis-turkish-roberta
 results: []
+license: mit
+language:
+- tr
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -28,16 +31,53 @@ It achieves the following results on the evaluation set:

 ## Model description

-
+This model is fine-tuned from the base model cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual for sentiment analysis in Turkish, specifically focusing on democracy-related text. The model classifies texts into three sentiment categories:
+
+Positive
+Neutral
+Negative

 ## Intended uses & limitations

-
+This model is well-suited for analyzing sentiments in Turkish texts that discuss democracy, governance, and related political discourse.

 ## Training and evaluation data

-
+The training dataset consists of 30,000 rows gathered from various sources, including: Kaggle, Hugging Face, Ekşi Sözlük, and synthetic data generated using state-of-the-art LLMs.
+The dataset is multilingual in origin, with texts in English, Russian, and Turkish. All non-Turkish texts were translated into Turkish. The data represents a broad spectrum of democratic discourse from 30 different sources.
+
+## How to Use
+
+To use this model for sentiment analysis, you can leverage the Hugging Face `pipeline` for text classification as shown below:
+
+```python
+from transformers import pipeline
+
+# Load the model from Hugging Face
+sentiment_model = pipeline(model="yeniguno/democracy-sentiment-analysis-turkish-roberta", task='text-classification')
+
+# Example text input
+response = sentiment_model("En iyisi devletin tüm gücünü tek bir lidere verelim")
+
+# Print the result
+print(response)
+# [{'label': 'negative', 'score': 0.9617443084716797}]
+
+# Example text input
+response = sentiment_model("Birçok farklı sesin çıkması zaman alıcı ve karmaşık görünebilir, ancak demokrasinin getirdiği özgürlük ve çeşitlilik, toplumun gerçek gücüdür.")
+
+# Print the result
+print(response)
+# [{'label': 'positive', 'score': 0.958978533744812}]
+
+# Example text input
+response = sentiment_model("Bugün hava yağmurlu.")
+
+# Print the result
+print(response)
+# [{'label': 'neutral', 'score': 0.9915837049484253}]

+```
 ## Training procedure

 ### Training hyperparameters
@@ -67,4 +107,4 @@ The following hyperparameters were used during training:
 - Transformers 4.44.2
 - Pytorch 2.4.0+cu121
 - Datasets 2.21.0
-- Tokenizers 0.19.1
+- Tokenizers 0.19.1
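
The README's "How to Use" example prints the raw `pipeline` response, a list of dicts with `label` and `score` keys. A minimal sketch of turning that response into a single decision might look like the following. The helper `pick_label` and the `MIN_SCORE` cut-off are hypothetical and are not part of the model card or the `transformers` API; the sample response is copied from the README's first example, so the snippet runs without downloading the model.

```python
from typing import Dict, List, Union

# Hypothetical confidence cut-off; the model card does not specify one.
MIN_SCORE = 0.6

Response = List[Dict[str, Union[str, float]]]


def pick_label(response: Response, min_score: float = MIN_SCORE) -> str:
    """Return the highest-scoring label, or 'uncertain' if its score is below min_score."""
    if not response:
        raise ValueError("empty pipeline response")
    # Select the entry with the largest score (the response may hold one or several labels).
    best = max(response, key=lambda item: item["score"])
    return str(best["label"]) if best["score"] >= min_score else "uncertain"


# Response shape copied from the README's first example.
sample = [{"label": "negative", "score": 0.9617443084716797}]
print(pick_label(sample))  # negative
```

A threshold like this is one possible way to avoid acting on low-confidence predictions; a suitable value would need to be chosen against held-out data.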