Update README.md
README.md
CHANGED
@@ -44,8 +44,8 @@ output = model(**encoded_input)
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-- 30M entries from `TucanoBR/GigaVerbo
-- 107M sequences of 128
+- 30M entries from `TucanoBR/GigaVerbo`
+- 107M sequences of length 128
 - tokenizer: WordPiece
 - vocab_size: 32768
 - seq_length: 128