Update README.md
README.md
This model was pretrained on the bookcorpus dataset using knowledge distillation.

The particularity of this model is that even though it shares the same architecture as BERT, it has a hidden size of 384 (half the hidden size of BERT base) and 6 attention heads (hence the same head size as BERT).
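As a rough illustration of how those dimensions fit together, the sketch below builds a matching `BertConfig`; the layer count and intermediate size are assumptions (the README only specifies the hidden size and head count), so the actual configuration of this checkpoint may differ:

```python
from transformers import BertConfig

# Hypothetical configuration matching the description above.
config = BertConfig(
    hidden_size=384,          # half of bert-base-uncased's 768
    num_attention_heads=6,    # half of bert-base-uncased's 12
    num_hidden_layers=12,     # assumption: same depth as BERT base
    intermediate_size=1536,   # assumption: 4 * hidden_size, as in BERT
)

# The head size stays the same as in BERT base: 768 / 12 == 384 / 6 == 64.
assert config.hidden_size // config.num_attention_heads == 64
```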
The weights of the model were initialized by pruning the weights of bert-base-uncased.
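The README does not spell out how the pruning was carried out. Purely as a hedged illustration, one way to initialize a narrower layer from bert-base-uncased is to keep a subset of the hidden units when copying each weight matrix; the choice of which units to keep is an assumption here:

```python
import torch
from transformers import BertModel

teacher = BertModel.from_pretrained("bert-base-uncased")

# Assumption for illustration only: keep the first 384 of the 768 hidden units.
# The actual pruning criterion used for this model (magnitude, importance
# scores, ...) is not described in the README.
kept_units = torch.arange(384)

# A 768 -> 768 linear layer from the first encoder block of BERT base.
dense = teacher.encoder.layer[0].attention.output.dense
pruned_weight = dense.weight[kept_units][:, kept_units]  # shape: 384 x 384
pruned_bias = dense.bias[kept_units]                      # shape: 384
```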
Knowledge distillation was then performed using multiple loss functions to fine-tune the model.
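The exact losses are not listed in this card. A common combination for distilling masked language models mixes a soft-target KL term against the teacher's logits with the regular cross-entropy on the labels; the sketch below shows that pattern, with the temperature and mixing weight as assumptions:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Sketch of a combined distillation objective; the actual losses used
    for this model are not specified in the README."""
    # Soft-target loss: match the teacher's softened output distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard-target loss: standard cross-entropy against the masked-LM labels.
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```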
PS: the tokenizer is the same as the one used by bert-base-uncased.
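Since the tokenizer is unchanged, it can be loaded directly from bert-base-uncased; the model identifier below is a placeholder for this checkpoint's actual repository name:

```python
from transformers import AutoModel, AutoTokenizer

# The tokenizer is identical to bert-base-uncased, so load it from there.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Placeholder: replace with this model's actual repository id.
model = AutoModel.from_pretrained("<this-model-repo>")

inputs = tokenizer("Knowledge distillation shrinks BERT.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 384)
```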