Migrate model card from transformers-repo
Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/neuraly/bert-base-italian-cased-sentiment/README.md
README.md
ADDED
@@ -0,0 +1,95 @@
---
language: it
thumbnail: https://neuraly.ai/static/assets/images/huggingface/thumbnail.png
tags:
- sentiment
- Italian
license: MIT
widget:
- text: "Huggingface è un team fantastico!"
---

# 🤗 + neuraly - Italian BERT Sentiment model

## Model description

This model performs sentiment analysis on Italian sentences. It was initialized from [bert-base-italian-cased](https://huggingface.co/dbmdz/bert-base-italian-cased) and fine-tuned on an Italian dataset of tweets, reaching 82% accuracy on that dataset.

## Intended uses & limitations

#### How to use

```python
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("neuraly/bert-base-italian-cased-sentiment")
# Load the model, use .cuda() to load it on the GPU
model = AutoModelForSequenceClassification.from_pretrained("neuraly/bert-base-italian-cased-sentiment")

sentence = 'Huggingface è un team fantastico!'
input_ids = tokenizer.encode(sentence, add_special_tokens=True)

# Create tensor, use .cuda() to transfer the tensor to GPU
tensor = torch.tensor(input_ids).long()
# Fake batch dimension
tensor = tensor.unsqueeze(0)

# Call the model and get the logits; indexing with [0] works whether the
# model returns a plain tuple or a ModelOutput object
with torch.no_grad():
    logits = model(tensor)[0]

# Remove the fake batch dimension
logits = logits.squeeze(0)

# The model was trained with a combined log-likelihood + softmax loss, hence to extract probabilities we need a softmax on top of the logits tensor
proba = nn.functional.softmax(logits, dim=0)

# Unpack the tensor to obtain negative, neutral and positive probabilities
negative, neutral, positive = proba
```

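To make the last two steps of the snippet above concrete, here is a minimal framework-free sketch of the softmax and of picking the most likely class. The `LABELS` tuple, the `predict_label` helper and the example logits are illustrative assumptions, not part of the released model; only the class order (negative, neutral, positive) comes from the snippet.

```python
import math

# Class order as used in the snippet above: negative, neutral, positive.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Numerically stable softmax over a flat sequence of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    proba = softmax(logits)
    best = max(range(len(proba)), key=proba.__getitem__)
    return LABELS[best], proba[best]


# Made-up logits for illustration only, not real model output.
example_logits = [-1.2, 0.3, 2.5]
label, confidence = predict_label(example_logits)
```

With real model output, the list passed to `predict_label` would be `logits.tolist()` from the snippet above.
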
#### Limitations and bias

A possible drawback (or bias) of this model is that it was trained on a dataset of tweets, with all the limitations that entails. The training domain is heavily skewed towards football players and teams, although the model works surprisingly well on other topics too.

## Training data

We trained the model on the combination of the two tweet datasets from [Sentipolc EVALITA 2016](http://www.di.unito.it/~tutreeb/sentipolc-evalita16/data.html). Overall, the combined dataset consists of 45K pre-processed tweets.

The initial model weights come from a pre-trained instance of [bert-base-italian-cased](https://huggingface.co/dbmdz/bert-base-italian-cased). A huge "thank you" goes to that team for their brilliant work!

## Training procedure

#### Preprocessing

We tried to preserve as much information as possible, since BERT captures the semantics of complex text sequences extremely well. Overall, we removed only **@mentions**, **urls** and **emails** from every tweet and kept pretty much everything else.

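The exact cleaning rules are not published in this card, so the following is only a rough sketch of what such a preprocessing step could look like. The regular expressions and the `clean_tweet` helper are assumptions made for illustration.

```python
import re

# Hypothetical patterns: the card does not state the exact rules that were used.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Remove emails, URLs and @mentions, keeping everything else."""
    # Emails go first so that MENTION_RE never sees the "@domain" part.
    text = EMAIL_RE.sub("", text)
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    # Collapse the gaps left behind by the removals.
    return WHITESPACE_RE.sub(" ", text).strip()
```

Hashtags, emoji and punctuation are left untouched, in line with the goal of keeping as much of the original tweet as possible.
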
#### Hardware

- **GPU**: Nvidia GTX1080ti
- **CPU**: AMD Ryzen7 3700x 8c/16t
- **RAM**: 64GB DDR4

#### Hyperparameters

- Optimizer: **AdamW** with learning rate of **2e-5**, epsilon of **1e-8**
- Max epochs: **5**
- Batch size: **32**
- Early Stopping: **enabled** with patience = 1

Early stopping was triggered after 3 epochs.

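As a rough illustration of how patience-based early stopping interacts with the epoch limit, here is a small pure-Python sketch. The `CONFIG` values are copied from the list above; the `EarlyStopping` class, the `run_training` helper, the choice of validation loss as the monitored metric, and the loss values are all assumptions, since the card does not say which metric was monitored. In PyTorch, the optimizer itself would typically be built with `torch.optim.AdamW(model.parameters(), lr=2e-5, eps=1e-8)`.

```python
from dataclasses import dataclass

# Values taken from the hyperparameter list above.
CONFIG = {
    "optimizer": "AdamW",
    "learning_rate": 2e-5,
    "epsilon": 1e-8,
    "max_epochs": 5,
    "batch_size": 32,
    "patience": 1,
}


@dataclass
class EarlyStopping:
    """Stop once the validation loss fails to improve `patience` epochs in a row."""
    patience: int
    best_loss: float = float("inf")
    bad_epochs: int = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_training(val_losses, max_epochs, patience):
    """Return the number of epochs run before stopping.

    `val_losses` stands in for the per-epoch validation loss that a real
    training loop would compute.
    """
    stopper = EarlyStopping(patience=patience)
    epochs_run = 0
    for epoch in range(max_epochs):
        epochs_run = epoch + 1
        if stopper.should_stop(val_losses[epoch]):
            break
    return epochs_run


# Hypothetical loss curve, chosen only so that the run stops at epoch 3.
hypothetical_losses = [0.62, 0.51, 0.53, 0.50, 0.49]
epochs = run_training(hypothetical_losses, CONFIG["max_epochs"], CONFIG["patience"])
```

With a patience of 1, a single epoch without improvement ends training, which is why a run capped at 5 epochs can stop after 3.
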
## Eval results

The model achieves an overall accuracy of 82% on the test set.
The test set is a 20% split of the whole dataset.

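For reference, an 80/20 split and the accuracy metric can be sketched as follows. The card does not say whether the split was random or stratified, so the shuffled split, the seed, and both helper functions here are assumptions made for illustration.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and split off `test_fraction` as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Stand-in IDs for the roughly 45K tweets mentioned in the card.
train_ids, test_ids = train_test_split(range(45_000))
```

On a dataset of about 45K tweets, a 20% split leaves about 9K tweets for testing.
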
## About us
[Neuraly](https://neuraly.ai) is a young and dynamic startup committed to designing AI-driven solutions and services through the most advanced Machine Learning and Data Science technologies. You can find out more about who we are and what we do on our [website](https://neuraly.ai).

## Acknowledgments

Thanks to the generous support of the [Hugging Face](https://huggingface.co/) team,
it is possible to download the model from their S3 storage and test it live through their inference API 🤗.