julien-c HF staff committed on
Commit 12df143 · 1 Parent(s): a6f2777

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/neuraly/bert-base-italian-cased-sentiment/README.md

Files changed (1)
  1. README.md +95 -0
README.md ADDED
@@ -0,0 +1,95 @@
1
+ ---
2
+ language: it
3
+ thumbnail: https://neuraly.ai/static/assets/images/huggingface/thumbnail.png
4
+ tags:
5
+ - sentiment
6
+ - Italian
7
+ license: MIT
8
+ widget:
9
+ - text: "Huggingface è un team fantastico!"
10
+ ---
11
+
12
+ # 🤗 + neuraly - Italian BERT Sentiment model
13
+
14
+ ## Model description
15
+
16
+ This model performs sentiment analysis on Italian sentences. It was initialized from [bert-base-italian-cased](https://huggingface.co/dbmdz/bert-base-italian-cased) and fine-tuned on an Italian dataset of tweets, reaching 82% accuracy on the held-out test split of that dataset.
17
+
18
+ ## Intended uses & limitations
19
+
20
+ #### How to use
21
+
22
+ ```python
23
+ import torch
24
+ from torch import nn
25
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
26
+
27
+ # Load the tokenizer
28
+ tokenizer = AutoTokenizer.from_pretrained("neuraly/bert-base-italian-cased-sentiment")
29
+ # Load the model, use .cuda() to load it on the GPU
30
+ model = AutoModelForSequenceClassification.from_pretrained("neuraly/bert-base-italian-cased-sentiment")
31
+
32
+ sentence = 'Huggingface è un team fantastico!'
33
+ input_ids = tokenizer.encode(sentence, add_special_tokens=True)
34
+
35
+ # Create tensor, use .cuda() to transfer the tensor to GPU
36
+ tensor = torch.tensor(input_ids).long()
37
+ # Fake batch dimension
38
+ tensor = tensor.unsqueeze(0)
39
+
40
+ # Call the model and take the logits (index 0 works for both tuple and ModelOutput return types)
41
+ logits = model(tensor)[0]
42
+
43
+ # Remove the fake batch dimension
44
+ logits = logits.squeeze(0)
45
+
46
+ # The model was trained with a combined log-likelihood + softmax loss, so we apply a softmax to the logits to obtain probabilities
47
+ proba = nn.functional.softmax(logits, dim=0)
48
+
49
+ # Unpack the tensor to obtain negative, neutral and positive probabilities
50
+ negative, neutral, positive = proba
51
+ ```
52
+
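The snippet above ends with three probabilities. As a minimal sketch of the last two steps (softmax, then choosing a label), here is a plain-Python version that runs without PyTorch. The logit values are made up for illustration, `pick_label` is a hypothetical helper, and the negative/neutral/positive order is taken from the unpacking shown in the card.

```python
import math

# Class order follows the unpacking in the model card: negative, neutral, positive.
LABELS = ("negative", "neutral", "positive")

def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    proba = softmax(logits)
    best = max(range(len(proba)), key=proba.__getitem__)
    return LABELS[best], proba[best]

# Made-up logits, for illustration only (not real model output).
example_logits = [-1.2, 0.3, 2.5]
label, confidence = pick_label(example_logits)
```

With real model output, the same idea applies to `logits.tolist()` after the batch dimension has been removed.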
53
+ #### Limitations and bias
54
+
55
+ A possible drawback (or bias) of this model is that it was trained only on tweets, with all the limitations that come with that kind of data. The training domain is strongly tied to football players and teams, although the model works surprisingly well on other topics too.
56
+
57
+ ## Training data
58
+
59
+ We trained the model on a combination of the two tweet datasets from [Sentipolc EVALITA 2016](http://www.di.unito.it/~tutreeb/sentipolc-evalita16/data.html). Overall, the combined dataset consists of 45K pre-processed tweets.
60
+
61
+ The model weights come from a pre-trained instance of [bert-base-italian-cased](https://huggingface.co/dbmdz/bert-base-italian-cased). A huge "thank you" goes to that team, brilliant work!
62
+
63
+ ## Training procedure
64
+
65
+ #### Preprocessing
66
+
67
+ We tried to preserve as much information as possible, since BERT captures the semantics of complex text sequences extremely well. We removed only **@mentions**, **URLs** and **emails** from each tweet and kept pretty much everything else.
68
+
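The card does not publish the exact cleaning rules, so the following is only a sketch of what removing @mentions, URLs and emails could look like. The regular expressions and the `clean_tweet` helper are assumptions, not the authors' code. Emails are stripped before mentions so that the `@domain` part of an address is not mistaken for a mention.

```python
import re

# Hypothetical patterns; the original preprocessing rules are not published.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
MENTION_RE = re.compile(r"@\w+")
SPACES_RE = re.compile(r"\s+")

def clean_tweet(text):
    """Drop URLs, emails and @mentions; keep hashtags, emoji and punctuation."""
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)    # before mentions, see note above
    text = MENTION_RE.sub(" ", text)
    return SPACES_RE.sub(" ", text).strip()

cleaned = clean_tweet(
    "@mario Forza Juve!!! 😍 scrivimi a tifoso@example.com "
    "https://t.co/abc123 #finoallafine"
)
```

Hashtags, emoji and repeated punctuation are left untouched, in line with the goal of keeping pretty much everything else.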
69
+ #### Hardware
70
+
71
+ - **GPU**: NVIDIA GTX 1080 Ti
72
+ - **CPU**: AMD Ryzen 7 3700X 8c/16t
73
+ - **RAM**: 64GB DDR4
74
+
75
+ #### Hyperparameters
76
+
77
+ - Optimizer: **AdamW** with a learning rate of **2e-5** and an epsilon of **1e-8**
78
+ - Max epochs: **5**
79
+ - Batch size: **32**
80
+ - Early Stopping: **enabled** with patience = 1
81
+
82
+ Early stopping was triggered after 3 epochs.
83
+
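To illustrate how a patience of 1 interacts with the 5-epoch cap, here is a minimal sketch of the stopping rule. It assumes the monitored quantity is a validation loss (the card does not say which metric was used), and the loss values are invented so that training stops after 3 epochs, as reported above.

```python
def run_with_early_stopping(val_losses, max_epochs=5, patience=1):
    """Return (epochs_run, best_epoch) given per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        epochs_run = epoch
        if loss < best_loss:
            # Improvement: remember it and reset the patience counter.
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # no improvement for `patience` epochs
    return epochs_run, best_epoch

# Invented losses: epoch 3 is worse than epoch 2, so training stops there.
hypothetical_losses = [0.62, 0.51, 0.55, 0.50, 0.49]
epochs_run, best_epoch = run_with_early_stopping(hypothetical_losses)
```

With patience = 1 a single non-improving epoch ends training, so the weights from the previous epoch (epoch 2 in this made-up run) are the ones worth keeping.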
84
+ ## Eval results
85
+
86
+ The model achieves an overall accuracy of 82% on the test set.
87
+ The test set is a 20% split of the whole dataset.
88
+
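A rough sketch of the evaluation setup described above: a shuffled 80/20 split and a plain accuracy computation. The card does not state how the split was drawn (random, stratified, or otherwise), so the shuffling, the seed and both helper functions are assumptions, and the data is a toy stand-in.

```python
import random

def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and hold out `test_fraction` of it for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Toy stand-ins for the tweets and their labels.
train, test = train_test_split(range(100))
toy_accuracy = accuracy(
    ["pos", "neg", "neu", "pos", "neg"],
    ["pos", "neg", "neu", "neg", "neg"],
)
```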
89
+ ## About us
90
+ [Neuraly](https://neuraly.ai) is a young and dynamic startup committed to designing AI-driven solutions and services through the most advanced Machine Learning and Data Science technologies. You can find out more about who we are and what we do on our [website](https://neuraly.ai).
91
+
92
+ ## Acknowledgments
93
+
94
+ Thanks to the generous support from the [Hugging Face](https://huggingface.co/) team,
95
+ it is possible to download the model from their S3 storage and test it live through their inference API 🤗.