DAIGT collection: Detection of AI-Generated Text (5 items)
Classification Report:

              precision    recall  f1-score   support

           0       1.00      0.99      0.99      1365
           1       0.99      1.00      0.99      1365

    accuracy                           0.99      2730
   macro avg       0.99      0.99      0.99      2730
weighted avg       0.99      0.99      0.99      2730
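The per-class columns in the report above follow the standard definitions of precision, recall, F1, and support. The snippet below is a minimal sketch of how those values are derived from true and predicted labels. The helper name `per_class_metrics` and the toy label lists are illustrative assumptions, not the model's actual evaluation code or data.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1, support) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    support = tp + fn  # number of true examples of this class
    return precision, recall, f1, support


# Toy labels for illustration only (not the evaluation set from the report).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]

for label in (0, 1):
    p, r, f1, n = per_class_metrics(y_true, y_pred, label)
    print(f"{label}  precision={p:.2f}  recall={r:.2f}  f1={f1:.2f}  support={n}")
```

The "macro avg" row is the unweighted mean of the per-class values, while "weighted avg" weights each class by its support. With equal supports, as in the report, the two coincide.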
import re
import html

def clean_text(text):
    # Remove HTML tags
    text = re.sub(r'<.*?>', '', text)
    # Replace HTML entities with their corresponding characters
    text = html.unescape(text)
    # Drop everything except ASCII letters, digits, and whitespace
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    # Collapse runs of whitespace into single spaces and trim both ends
    return re.sub(r'\s+', ' ', text).strip()
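As a usage sketch for the cleaning step above, the snippet below applies it to a small batch of strings. The cleaning function is restated in compact form so the snippet runs on its own. The `clean_batch` helper and the sample inputs are hypothetical, added only for illustration.

```python
import html
import re


def clean_text(text):
    # Same steps as above: strip tags, decode entities,
    # keep only ASCII alphanumerics and whitespace, normalize spaces.
    text = re.sub(r'<.*?>', '', text)
    text = html.unescape(text)
    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    return re.sub(r'\s+', ' ', text).strip()


def clean_batch(texts):
    # Hypothetical helper: clean every string in a list before tokenization.
    return [clean_text(t) for t in texts]


samples = [
    "<p>Hello, <b>world</b>!</p>   Tom &amp; Jerry",
    "  plain   text\twith\nmixed whitespace  ",
]
print(clean_batch(samples))
```

Note that this cleaning removes all punctuation and keeps letter case, so text passed to the classifier at inference time should presumably go through the same function.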
Base model: FacebookAI/roberta-base