zeyadusf/roberta-DAIGT-kaggle
	---
	library_name: transformers
	tags:
	- nlp
	- classification
	license: apache-2.0
	datasets:
	- zeyadusf/daigt
	language:
	- en
	base_model:
	- FacebookAI/roberta-base
	pipeline_tag: text-classification
	---

	# Model Card for roberta-DAIGT-kaggle

	A `FacebookAI/roberta-base` model fine-tuned on the `zeyadusf/daigt` dataset for binary classification of English text.



	## Model Details

	- eval_loss: 0.02619364485144615
	- eval_accuracy: 0.9941391941391942
	- eval_f1-score: 0.9941391909936754
	- epoch: 2.0



	```
	Classification Report:
	              precision    recall  f1-score   support

	           0       1.00      0.99      0.99      1365
	           1       0.99      1.00      0.99      1365

	    accuracy                           0.99      2730
	   macro avg       0.99      0.99      0.99      2730
	weighted avg       0.99      0.99      0.99      2730
	```

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/65cf6fed0954f06e47d97e56/iTsf9imsdWtxtRaiMa5On.png)


	#### Clean Function
	- I used this function when testing the model manually, and the model gave good results on text cleaned this way.
	```python
	import html
	import re

	def clean_text(text):
	    # Remove HTML tags
	    text = re.sub(r'<.*?>', '', text)
	    # Replace HTML entities with their corresponding characters
	    text = html.unescape(text)
	    # Drop everything except ASCII letters, digits, and whitespace
	    text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
	    # Collapse runs of whitespace into single spaces and trim the ends
	    return re.sub(r'\s+', ' ', text).strip()
	```
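
The snippet below is a minimal sketch of how the cleaning step could be combined with inference. It assumes the checkpoint loads through the standard `transformers` `pipeline` API. The `classify` helper and the sample string are hypothetical and not part of this repository, and the label names the model returns are not documented in this card.

```python
import html
import re


def clean_text(text):
    """Apply the same cleaning steps as the card's clean function."""
    text = re.sub(r"<.*?>", "", text)           # strip HTML tags
    text = html.unescape(text)                  # decode entities such as &amp;
    text = re.sub(r"[^a-zA-Z0-9\s]", "", text)  # keep ASCII letters, digits, whitespace
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace and trim


def classify(texts, model_id="zeyadusf/roberta-DAIGT-kaggle"):
    """Hypothetical helper: clean each text, then run the classifier on it.

    Assumption: the checkpoint works with the generic text-classification
    pipeline. Requires `transformers` and a backend such as PyTorch.
    """
    from transformers import pipeline  # imported lazily so clean_text works without it

    classifier = pipeline("text-classification", model=model_id)
    cleaned = [clean_text(t) for t in texts]
    # truncation=True cuts inputs that exceed the model's maximum length
    return classifier(cleaned, truncation=True)


sample = "<p>Hello, &amp; welcome!</p>\n\n  It's   2024."
print(clean_text(sample))  # Hello welcome Its 2024

# Downloads the model on first use, so it is left commented out:
# print(classify([sample]))
```

Because the cleaning step removes punctuation and every non-ASCII character, it is best suited to English input, which matches the language declared in the card metadata.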