Commit d2fe8b6
Parent(s): 360ec4b
Update README.md
README.md (changed)
@@ -22,8 +22,8 @@ By removing the decoder we can *half the original number of parameters* (thus ha
 
 ## Table of Contents
 
-
-
+0. [Why use T5ForSequenceClassification?](##why-use-t5forsequenceclassification?)
+1. [T5ForClassification vs T5](##t5forclassification-vs-t5)
 
 ## Why use T5ForSequenceClassification?
 Models based on the [BERT](https://huggingface.co/bert-large-uncased) architecture like [RoBERTa](https://huggingface.co/roberta-large) and [DeBERTa](https://huggingface.co/microsoft/deberta-v2-xxlarge) have shown very strong performance on sequence classification task and are still widely used today.