This model was pretrained on the bookcorpus dataset using knowledge distillation.

The particularity of this model is that, even though it shares the same architecture as BERT, it has a hidden size of 384 (half the hidden size of BERT) and 6 attention heads (hence the same head size as BERT, since 384 / 6 = 768 / 12 = 64).
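These dimensions can be verified directly from the checkpoint's configuration; a minimal sketch using `AutoConfig` (the expected values in the comments follow from the description above):

````python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("eli4s/Bert-L12-h384-A6-pruned")

# Hidden size, number of attention heads, and the resulting per-head size.
print(config.hidden_size)                                # expected: 384
print(config.num_attention_heads)                        # expected: 6
print(config.hidden_size // config.num_attention_heads)  # expected: 64, same as BERT
````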
The weights of the model were initialized by pruning the weights of bert-base-uncased.
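As an illustration only, here is a hypothetical sketch of what such an initialization could look like, assuming pruning means keeping a subset of the teacher's hidden dimensions (the selection criterion actually used for this checkpoint is not documented here):

````python
import torch
from transformers import AutoModelForMaskedLM

# Hypothetical sketch: shrink bert-base-uncased (hidden size 768) to 384 dimensions
# by keeping the first 384 hidden dimensions of each weight matrix. The real
# checkpoint may select dimensions with a different criterion.
teacher = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
kept = torch.arange(384)  # indices of the hidden dimensions to keep (assumption)

word_embeddings = teacher.bert.embeddings.word_embeddings.weight  # shape (30522, 768)
pruned_word_embeddings = word_embeddings[:, kept]                 # shape (30522, 384)
print(pruned_word_embeddings.shape)
````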
Knowledge distillation was then performed with multiple loss functions to fine-tune the model.
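The exact loss functions used for this checkpoint are not documented here; as a hypothetical sketch, a typical multi-loss distillation objective combines the masked-language-modelling cross-entropy with a KL divergence between temperature-softened student and teacher logits (the function, its signature, and the weighting below are illustrative assumptions):

````python
import torch.nn.functional as F

# Hypothetical sketch of a multi-loss distillation objective; not the checkpoint's
# documented training recipe.
def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Hard-label term: masked-language-modelling cross-entropy
    # (non-masked positions labelled -100 are ignored, as is conventional).
    mlm_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    # Soft-label term: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * mlm_loss + (1 - alpha) * kd_loss
````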
PS: the tokenizer is the same as that of bert-base-uncased.
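Since the tokenizer is shared, it can equivalently be loaded from the original bert-base-uncased checkpoint, for example:

````python
from transformers import BertTokenizer

# Either checkpoint name should yield the same vocabulary and tokenization.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.tokenize("knowledge distillation"))
````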
## **PS2: this model still needs a little more fine-tuning, so I will keep updating it regularly.**
To load the model and tokenizer:

````python
from transformers import AutoModelForMaskedLM, BertTokenizer

# The checkpoint hosts both the pruned weights and the tokenizer files.
model_name = "eli4s/Bert-L12-h384-A6-pruned"
model = AutoModelForMaskedLM.from_pretrained(model_name)
tokenizer = BertTokenizer.from_pretrained(model_name)
````
To use it on a sentence:

````python
import torch

sentence = "Let's have a [MASK]."

# Tokenize the sentence and run a forward pass.
encoded_inputs = tokenizer([sentence], padding='longest')
input_ids = torch.tensor(encoded_inputs['input_ids'])
attention_mask = torch.tensor(encoded_inputs['attention_mask'])
output = model(input_ids, attention_mask=attention_mask)

# Locate the [MASK] position (tokenizer.mask_token_id is 103 for this vocabulary)
# and keep the highest-scoring token at that position.
mask_index = input_ids.tolist()[0].index(tokenizer.mask_token_id)
masked_token = output['logits'][0][mask_index].argmax(dim=-1)
predicted_token = tokenizer.decode(masked_token)

print(predicted_token)
````
Or we can also retrieve the n most likely predictions:

````python
top_n = 5

# Rank every vocabulary id by its logit at the masked position and keep the top n.
vocab_size = model.config.vocab_size
logits = output['logits'][0][mask_index].tolist()
top_tokens = sorted(range(vocab_size), key=lambda i: logits[i], reverse=True)[:top_n]

print(tokenizer.decode(top_tokens))
````
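For quick experiments, the `fill-mask` pipeline from transformers should also work with this checkpoint (a sketch, assuming the pipeline picks up the model and tokenizer without extra arguments):

````python
from transformers import pipeline

# Convenience alternative: the fill-mask pipeline wraps the tokenize / forward /
# decode steps shown above.
unmasker = pipeline("fill-mask", model="eli4s/Bert-L12-h384-A6-pruned")
for prediction in unmasker("Let's have a [MASK].", top_k=5):
    print(prediction["token_str"], prediction["score"])
````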