introduction
This is an undergraduate course project in computer security.
The task is to fine-tune a pretrained language model to detect malicious network flows.
base model
bert-base-uncased
dataset:
19kmunz/iot-23-preprocessed-minimumcolumns
example prompt:
8081 tcp S0 2 80 0
37215 tcp S0 2 80 0
52869 tcp S0 2 80 0
8080 tcp S0 2 80 0
80 tcp S0 2 80 0
The flows above are "Malicious", which corresponds to "label_1".
67 udp S0 11 3608 0
0 icmp OTH 9 844 0
136 icmp OTH 3 216 0
0 icmp OTH 8 648 0
134 icmp OTH 2 96 0
The flows above are "Benign", which corresponds to "label_0".
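As a rough illustration of how one flow record could be turned into a prompt like the ones above, here is a minimal sketch. The field names in `FIELDS` are hypothetical and the column meanings are only inferred from the examples, so check them against the dataset before relying on them. `flow_to_prompt` and `ID2LABEL` are illustrative helpers, not part of the project code.

```python
# Field order is an assumption inferred from the example prompts;
# the names are hypothetical, not confirmed dataset column headers.
FIELDS = (
    "resp_port", "proto", "conn_state",
    "orig_pkts", "orig_ip_bytes", "resp_ip_bytes",
)

# Label names as given in the examples above.
ID2LABEL = {0: "Benign", 1: "Malicious"}


def flow_to_prompt(flow: dict) -> str:
    """Join one flow record's fields into a space-separated prompt string."""
    return " ".join(str(flow[name]) for name in FIELDS)


example = {
    "resp_port": 8081, "proto": "tcp", "conn_state": "S0",
    "orig_pkts": 2, "orig_ip_bytes": 80, "resp_ip_bytes": 0,
}
print(flow_to_prompt(example))  # 8081 tcp S0 2 80 0
```

The resulting string is what would be passed to the tokenizer as the input text.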
accuracy
| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|---|---|---|---|
| 1 | 0.288545 | 0.190351 | 0.929988 |
| 2 | 0.147658 | 0.154426 | 0.943510 |
| 3 | 0.108059 | 0.173112 | 0.943510 |
| 4 | 0.092468 | 0.161035 | 0.947416 |
MCC score: 0.816
MCC is the Matthews Correlation Coefficient, a metric commonly used to assess prediction quality in binary classification problems.
The MCC value ranges from -1 to 1, where 1 signifies perfect predictions, 0 indicates predictions no better than random chance, and -1 denotes total disagreement between predictions and true labels.
An MCC of 0.816 is close to 1, which indicates strong agreement between the model's predictions and the true labels. Because MCC takes all four cells of the confusion matrix into account, it remains informative when the classes are imbalanced, so it complements the validation accuracy reported above.
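To make the definition concrete, here is a small sketch that computes MCC from the four confusion-matrix counts. The counts in the usage lines are made up to show the three reference points of the scale; they are not results from this model. Returning 0.0 when the denominator is zero is a common convention and is assumed here.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, chosen only to illustrate the range of the metric.
print(mcc(tp=50, tn=50, fp=0, fn=0))    # 1.0  (perfect predictions)
print(mcc(tp=25, tn=25, fp=25, fn=25))  # 0.0  (no better than chance)
print(mcc(tp=0, tn=0, fp=50, fn=50))    # -1.0 (every prediction wrong)
```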