---
library_name: transformers
tags:
- nlp
- classification
license: apache-2.0
datasets:
- zeyadusf/daigt
language:
- en
base_model:
- FacebookAI/roberta-base
pipeline_tag: text-classification
---
# Model Card for Model ID
<!-- Provide a quick summary of what the model is/does. -->
## Model Details
- eval_loss: 0.02619364485144615
- eval_accuracy: 0.9941391941391942
- eval_f1-score: 0.9941391909936754
- epoch: 2.0
```
Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.99      0.99      1365
           1       0.99      1.00      0.99      1365

    accuracy                           0.99      2730
   macro avg       0.99      0.99      0.99      2730
weighted avg       0.99      0.99      0.99      2730
```
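As a quick sanity check on the figures above, the reported accuracy and the total support of 2730 evaluation samples imply how many predictions were correct. A minimal sketch, with illustrative variable names and both constants copied from this card:

```python
eval_accuracy = 0.9941391941391942  # reported eval_accuracy
n_samples = 2730                    # total support in the classification report

# Accuracy is correct / total, so multiplying back recovers the count.
n_correct = round(eval_accuracy * n_samples)
n_errors = n_samples - n_correct

print(f"correct: {n_correct}, misclassified: {n_errors}")
```

With these numbers, that works out to 16 misclassified samples out of 2730.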
![image/png](https://cdn-uploads.huggingface.co/production/uploads/65cf6fed0954f06e47d97e56/iTsf9imsdWtxtRaiMa5On.png)
#### Clean Function
- I used this function when testing the model manually; cleaning the input text with it gave good results.
```python
import html
import re


def clean_text(text):
    # Remove HTML tags
    clean = re.compile(r'<.*?>')
    text = re.sub(clean, '', text)
    # Replace HTML entities with their corresponding characters
    text = html.unescape(text)
    # Collapse runs of whitespace into single spaces and trim the ends
    text = re.sub(r'\s+', ' ', text).strip()
    # Drop everything except ASCII letters, digits, and whitespace
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    # Collapse double spaces left behind by the removed characters
    return re.sub(r'\s\s+', ' ', text)
```
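
One way to combine the cleaning step with inference is sketched below. It is a rough illustration, not the author's test script. `load_classifier` and `classify` are hypothetical helper names, the repository id is a placeholder to replace with this model's actual Hub id, and `clean_text` is repeated only so the snippet runs on its own. The label names returned depend on the model's config; this card only identifies the classes as 0 and 1.

```python
import html
import re


def clean_text(text):
    """Same cleaning steps as the function above, repeated so this sketch is self-contained."""
    text = re.sub(r"<.*?>", "", text)
    text = html.unescape(text)
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"[^a-zA-Z0-9\s]", "", text)
    return re.sub(r"\s\s+", " ", text)


def load_classifier(model_id):
    """Build a text-classification pipeline (needs `transformers` and a backend such as PyTorch)."""
    from transformers import pipeline  # imported lazily so the other helpers work without it

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier):
    """Clean each text, then pass the cleaned batch to the classifier callable."""
    cleaned = [clean_text(t) for t in texts]
    # truncation=True keeps long inputs within the model's maximum sequence length
    return classifier(cleaned, truncation=True)


# Hypothetical usage; replace the placeholder with the real repository id:
# classifier = load_classifier("<user>/<model-name>")
# print(classify(["<p>Some essay text &amp; more.</p>"], classifier))
```

Keeping the classifier as a plain argument makes it easy to swap in a different pipeline, or a stub when checking the cleaning step alone.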