Update README.md
Browse files
README.md
CHANGED
@@ -1,50 +1,50 @@
|
|
1 |
-
---
|
2 |
-
license: mit
|
3 |
-
widget:
|
4 |
-
- text: "Some ninja attacked the White House."
|
5 |
-
example_title: "Fake example 1"
|
6 |
-
language:
|
7 |
-
- en
|
8 |
-
tags:
|
9 |
-
- classification
|
10 |
-
datasets:
|
11 |
-
- "fake-and-real-news-dataset on kaggle"
|
12 |
-
---
|
13 |
-
## Overview
|
14 |
-
The model is a `roberta-base` fine-tuned on
|
15 |
-
The model takes a news article and predicts if it is true or fake.
|
16 |
-
The format of the input should be:
|
17 |
-
|
18 |
-
```
|
19 |
-
<title> TITLE HERE <content> CONTENT HERE <end>
|
20 |
-
```
|
21 |
-
|
22 |
-
## Using this model in your code
|
23 |
-
To use this model, first download it from the hugginface website:
|
24 |
-
```python
|
25 |
-
from transformers import AutoTokenizer, AutoModelForSequenceClassification
|
26 |
-
|
27 |
-
tokenizer = AutoTokenizer.from_pretrained("hamzab/roberta-fake-news-classification")
|
28 |
-
|
29 |
-
model = AutoModelForSequenceClassification.from_pretrained("hamzab/roberta-fake-news-classification")
|
30 |
-
```
|
31 |
-
|
32 |
-
Then, make a prediction like follows:
|
33 |
-
```python
|
34 |
-
import torch
|
35 |
-
def predict_fake(title,text):
|
36 |
-
input_str = "<title>" + title + "<content>" + text + "<end>"
|
37 |
-
input_ids = tokenizer.encode_plus(input_str, max_length=512, padding="max_length", truncation=True, return_tensors="pt")
|
38 |
-
device = 'cuda' if torch.cuda.is_available() else 'cpu'
|
39 |
-
model.to(device)
|
40 |
-
with torch.no_grad():
|
41 |
-
output = model(input_ids["input_ids"].to(device), attention_mask=input_ids["attention_mask"].to(device))
|
42 |
-
return dict(zip(["Fake","Real"], [x.item() for x in list(torch.nn.Softmax()(output.logits)[0])] ))
|
43 |
-
|
44 |
-
print(predict_fake(<HEADLINE-HERE>,<CONTENT-HERE>))
|
45 |
-
```
|
46 |
-
You can also use Gradio to test the model on real-time:
|
47 |
-
```python
|
48 |
-
import gradio as gr
|
49 |
-
iface = gr.Interface(fn=predict_fake, inputs=[gr.inputs.Textbox(lines=1,label="headline"),gr.inputs.Textbox(lines=6,label="content")], outputs="label").launch(share=True)
|
50 |
```
|
|
|
1 |
+
---
|
2 |
+
license: mit
|
3 |
+
widget:
|
4 |
+
- text: "Some ninja attacked the White House."
|
5 |
+
example_title: "Fake example 1"
|
6 |
+
language:
|
7 |
+
- en
|
8 |
+
tags:
|
9 |
+
- classification
|
10 |
+
datasets:
|
11 |
+
- "fake-and-real-news-dataset on kaggle"
|
12 |
+
---
|
13 |
+
## Overview
|
14 |
+
The model is a `roberta-base` fine-tuned on [fake-and-real-news-dataset](https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset). It has a 100% accuracy on that dataset.
|
15 |
+
The model takes a news article and predicts if it is true or fake.
|
16 |
+
The format of the input should be:
|
17 |
+
|
18 |
+
```
|
19 |
+
<title> TITLE HERE <content> CONTENT HERE <end>
|
20 |
+
```
|
21 |
+
|
22 |
+
## Using this model in your code
|
23 |
+
To use this model, first download it from the hugginface website:
|
24 |
+
```python
|
25 |
+
from transformers import AutoTokenizer, AutoModelForSequenceClassification
|
26 |
+
|
27 |
+
tokenizer = AutoTokenizer.from_pretrained("hamzab/roberta-fake-news-classification")
|
28 |
+
|
29 |
+
model = AutoModelForSequenceClassification.from_pretrained("hamzab/roberta-fake-news-classification")
|
30 |
+
```
|
31 |
+
|
32 |
+
Then, make a prediction like follows:
|
33 |
+
```python
|
34 |
+
import torch
|
35 |
+
def predict_fake(title,text):
|
36 |
+
input_str = "<title>" + title + "<content>" + text + "<end>"
|
37 |
+
input_ids = tokenizer.encode_plus(input_str, max_length=512, padding="max_length", truncation=True, return_tensors="pt")
|
38 |
+
device = 'cuda' if torch.cuda.is_available() else 'cpu'
|
39 |
+
model.to(device)
|
40 |
+
with torch.no_grad():
|
41 |
+
output = model(input_ids["input_ids"].to(device), attention_mask=input_ids["attention_mask"].to(device))
|
42 |
+
return dict(zip(["Fake","Real"], [x.item() for x in list(torch.nn.Softmax()(output.logits)[0])] ))
|
43 |
+
|
44 |
+
print(predict_fake(<HEADLINE-HERE>,<CONTENT-HERE>))
|
45 |
+
```
|
46 |
+
You can also use Gradio to test the model on real-time:
|
47 |
+
```python
|
48 |
+
import gradio as gr
|
49 |
+
iface = gr.Interface(fn=predict_fake, inputs=[gr.inputs.Textbox(lines=1,label="headline"),gr.inputs.Textbox(lines=6,label="content")], outputs="label").launch(share=True)
|
50 |
```
|