
	---
	datasets:
	- oscar
	- srwac
	- leipzig

	language:
	- hr
	- sr
	- multilingual

	tags:
	- masked-lm

	widget:
	- text: "Ovo je početak <mask>."

	license: apache-2.0

	---
	# Transformer language model for Croatian and Serbian
	Trained for two epochs (500k steps) on 6 GB of Croatian and Serbian text drawn from the Leipzig Corpus, OSCAR and srWac datasets.

	| Model | #params | Arch. | Training data |
	|-------|---------|-------|---------------|
	| `Andrija/SRoBERTa-L` | 80M | Third | Leipzig Corpus, OSCAR and srWac (6 GB of text) |
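
As a minimal sketch of how the model might be queried, the snippet below uses the `fill-mask` pipeline from the Hugging Face `transformers` library. The model ID and the `<mask>` token come from this card; the helper names `fill_mask` and `summarize` are illustrative and not part of the model or the library.

```python
def summarize(predictions):
    """Reduce raw fill-mask output to (token, rounded score) pairs."""
    # Each prediction is a dict that includes "token_str" and "score".
    return [(p["token_str"].strip(), round(p["score"], 4)) for p in predictions]


def fill_mask(text, model_id="Andrija/SRoBERTa-L", top_k=5):
    """Return the top_k candidate tokens for the <mask> position in text."""
    # Imported lazily so that summarize() can be used without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint is downloaded from the Hugging Face Hub on first use.
    unmasker = pipeline("fill-mask", model=model_id)
    return summarize(unmasker(text, top_k=top_k))
```

Calling `fill_mask("Ovo je početak <mask>.")` with the widget sentence from this card should return up to five `(token, score)` pairs, ordered from most to least likely.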