
TensorFlow Model Garden LMs
community
AI & ML interests: Language Model Pretraining, TensorFlow Model Garden
model-garden-lms's activity

stefan-it updated 10 models 2 months ago
- model-garden-lms/teams-base-finewebs-801k
- model-garden-lms/teams-base-finewebs-851k
- model-garden-lms/teams-base-finewebs-901k
- model-garden-lms/teams-base-finewebs-951k
- model-garden-lms/teams-base-finewebs-1m
- model-garden-lms/bert-base-token-dropping-finewebs-801k
- model-garden-lms/bert-base-token-dropping-finewebs-851k
- model-garden-lms/bert-base-token-dropping-finewebs-901k
- model-garden-lms/bert-base-token-dropping-finewebs-951k
- model-garden-lms/bert-base-token-dropping-finewebs-1m

stefan-it updated a dataset 2 months ago
Post
My latest project is the outcome of more than two years of working with TPUs from the amazing TPU Research Cloud (TRC) program and training encoder-only LMs with the TensorFlow Model Garden library.
👉 Link: https://github.com/stefan-it/model-garden-lms
An overview of some features:
- Cheatsheet for setting up a TPU VM Pod (with all necessary dependencies) to pretrain LMs with TF Model Garden
- Conversion scripts that convert TF Model Garden weights to Hugging Face Transformers-compatible models
- Supported architectures include BERT, BERT with Token Dropping and TEAMS
I also released BERT-based models pretrained on the great Hugging Face FineWeb and FineWeb-Edu datasets (10BT subset). With more to come!
👉 Model Hub Link: https://huggingface.co/model-garden-lms
If you find these resources useful, please give them a like!
Made from Bavarian Oberland with ❤️ and 🥨.
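For readers who want to try the released checkpoints, here is a minimal sketch of how they might be enumerated and loaded. The repository IDs follow the names listed on the organization page. The helper names `checkpoint_ids` and `load_checkpoint` are hypothetical, and loading through `AutoTokenizer` and `AutoModel` assumes that each repository ships a standard Transformers config and tokenizer. The suffixes (801k to 1m) presumably denote training steps, which is also an assumption.

```python
from itertools import product

# Names taken from the organization's model list.
ORG = "model-garden-lms"
ARCHITECTURES = ("teams-base", "bert-base-token-dropping")
STEPS = ("801k", "851k", "901k", "951k", "1m")  # assumed to be training-step checkpoints


def checkpoint_ids():
    """Return the ten repository IDs, grouped by architecture."""
    return [
        f"{ORG}/{arch}-finewebs-{step}"
        for arch, step in product(ARCHITECTURES, STEPS)
    ]


def load_checkpoint(repo_id):
    """Load tokenizer and encoder for one checkpoint.

    Requires the `transformers` package and network access. Assumes the
    repository is loadable through the Auto classes.
    """
    from transformers import AutoModel, AutoTokenizer  # imported lazily

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    return tokenizer, model


if __name__ == "__main__":
    for repo_id in checkpoint_ids():
        print(repo_id)
```

Calling `load_checkpoint(checkpoint_ids()[-1])` would fetch the last BERT with Token Dropping checkpoint, provided the assumptions above hold.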

stefan-it updated 5 models 3 months ago

stefan-it authored a paper over 1 year ago