---
language:
- om
- am
- rw
- rn
- ha
- ig
- pcm
- so
- sw
- ti
- yo
- multilingual
datasets:
---
# AfriBERTa_small

## Model description

AfriBERTa small is a pretrained multilingual language model with around 97 million parameters.

The model has 4 layers, 6 attention heads, 768 hidden units, and a feed-forward size of 3072.

The model was pretrained on 11 African languages, namely: Afaan Oromoo (also called Oromo), Amharic, Gahuza (a mixed language containing Kinyarwanda and Kirundi), Hausa, Igbo, Nigerian Pidgin, Somali, Swahili, Tigrinya and Yorùbá.

The model has been shown to obtain competitive downstream performance on text classification and Named Entity Recognition on several African languages, including languages it was not pretrained on.
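
These architecture details can be read directly from the checkpoint's configuration. The snippet below is a minimal sketch; it assumes the standard Transformers config attribute names, and the printed values should correspond to the numbers quoted above:

```python
>>> from transformers import AutoConfig

>>> config = AutoConfig.from_pretrained("castorini/afriberta_small")
>>> # layers, attention heads, hidden size, feed-forward (intermediate) size
>>> print(config.num_hidden_layers, config.num_attention_heads)
>>> print(config.hidden_size, config.intermediate_size)
```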
## Intended uses & limitations

#### How to use

You can use this model with Transformers for any downstream task.

For example, assuming we want to finetune this model on a token classification task, we do the following:
```python
>>> from transformers import AutoTokenizer, AutoModelForTokenClassification

>>> tokenizer = AutoTokenizer.from_pretrained("castorini/afriberta_small")
>>> model = AutoModelForTokenClassification.from_pretrained("castorini/afriberta_small")
```
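Loading this base checkpoint with `AutoModelForTokenClassification` attaches a newly initialized classification head, so the model must still be finetuned on labelled data before its predictions are meaningful. As a minimal, purely illustrative sketch of running the loaded model on raw text (the example sentence and the default two-label head are assumptions, not something prescribed by this card):

```python
>>> import torch

>>> text = "Habari za asubuhi"  # illustrative Swahili input
>>> inputs = tokenizer(text, return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits
>>> predicted_label_ids = logits.argmax(dim=-1)  # one (still untrained) label id per subword token
```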
#### Limitations and bias

This model is possibly limited by its training dataset, which consists mainly of news articles from a specific span of time. Thus, it may not generalize well.
## Training data

The model was trained on an aggregation of datasets from the BBC news website and Common Crawl.
## Training procedure

For information on training procedures, please refer to the AfriBERTa [paper]() or [repository](https://github.com/keleog/afriberta).
### BibTeX entry and citation info

```bibtex
@inproceedings{ogueji-etal-2021-small,
  title     = {Small Data? No Problem! Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages},
  author    = {Ogueji, Kelechi and Zhu, Yuxin and Lin, Jimmy},
  booktitle = {Proceedings of the 1st Workshop on Multilingual Representation Learning at EMNLP 2021},
  year      = {2021},
}
```