---
library_name: transformers
tags:
- nlp
- classification
license: apache-2.0
datasets:
- zeyadusf/daigt
language:
- en
base_model:
- FacebookAI/roberta-base
pipeline_tag: text-classification
---

# Model Card for Model ID

## Model Details

- eval_loss: 0.02619364485144615
- eval_accuracy: 0.9941391941391942
- eval_f1-score: 0.9941391909936754
- epoch: 2.0

```
Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.99      0.99      1365
           1       0.99      1.00      0.99      1365

    accuracy                           0.99      2730
   macro avg       0.99      0.99      0.99      2730
weighted avg       0.99      0.99      0.99      2730
```

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65cf6fed0954f06e47d97e56/iTsf9imsdWtxtRaiMa5On.png)

#### Clean Function

- I used this function when testing the model manually; cleaning the input text this way gave good results.

```python
import re
import html

def clean_text(text):
    # Remove HTML tags
    clean = re.compile(r'<.*?>')
    text = re.sub(clean, '', text)

    # Replace HTML entities with their corresponding characters
    text = html.unescape(text)

    # Remove extra whitespace and normalize spaces
    text = re.sub(r'\s+', ' ', text).strip()

    # Drop every character that is not an ASCII letter, a digit, or whitespace
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)

    # Collapse whitespace runs left behind by the removal above
    # (raw string avoids Python's invalid-escape warning for "\s")
    return re.sub(r"\s\s+", " ", text)
```
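
The snippet below is a minimal sketch of how the clean function above could be applied before inference. It is not a documented usage recipe for this model. The helper names `classify` and `load_classifier` are hypothetical, and `model_id` is a placeholder for this model's Hub repository id, which this card does not state. The card also does not say what labels `0` and `1` mean, so the sketch returns the raw predictions unchanged. `clean_text` is repeated here only so that the snippet runs on its own.

```python
import html
import re


def clean_text(text):
    # Same steps as the clean function above, repeated so this snippet is self-contained.
    text = re.sub(r'<.*?>', '', text)            # strip HTML tags
    text = html.unescape(text)                   # decode HTML entities
    text = re.sub(r'\s+', ' ', text).strip()     # normalize whitespace
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)   # keep ASCII letters, digits, whitespace
    return re.sub(r'\s\s+', ' ', text)           # collapse gaps left by removed characters


def classify(texts, classifier):
    """Clean each text, then pass the cleaned batch to `classifier`.

    `classifier` is any callable that maps a list of strings to a list of
    predictions, for example a transformers text-classification pipeline.
    """
    return classifier([clean_text(t) for t in texts])


def load_classifier(model_id):
    # Hypothetical helper: `model_id` is a placeholder for this model's Hub repo id.
    # Imported lazily so the rest of the snippet runs without `transformers` installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)


# Cleaning alone: the tags, the "&amp;" entity, and the punctuation are all removed.
print(clean_text("<p>Hello, <b>world</b> &amp; friends!</p>"))  # Hello world friends
```

With `transformers` installed, `classify(["some text"], load_classifier(model_id))` would run the cleaned text through the model. Note that the cleaning step removes all punctuation and non-ASCII characters, so the model sees a noticeably different string from the raw input.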