
This model was developed in support of the University of Belgrade doctoral dissertation "Composite pseudogrammars based on parallel language models of Serbian" by Mihailo Škorić.

It generates semantically masked sentences for Serbian, i.e. sentences that are lemmatized and stripped of stopwords.

This small GPT-2 model was fine-tuned on several corpora for Serbian, augmented using latent semantic analysis (LSA) methods.

The corpora include the Corpus of Contemporary Serbian, SrpELTeC, and WikiKorpus by JeRTeh (Society for Language Resources and Technologies).

This model is purely experimental! For production-ready models for Serbian, see GPT2-ORAO and GPT2-VRABAC.

If you use this model in your research, please cite: https://doi.org/10.3390/math11224660
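A minimal sketch of loading the model with the Hugging Face transformers `text-generation` pipeline; the prompt and generation settings below are illustrative assumptions, not part of the model card:

```python
# Sketch: generate a semantically masked continuation with
# procesaur/gpt2-srlat-sem via the transformers pipeline.
# Prompt and max_new_tokens are illustrative assumptions.
from transformers import pipeline


def generate(prompt: str, max_new_tokens: int = 30) -> str:
    # The pipeline downloads the model from the Hub on first use.
    generator = pipeline("text-generation", model="procesaur/gpt2-srlat-sem")
    out = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return out[0]["generated_text"]


if __name__ == "__main__":
    # The input should itself be "semantically masked" (lemmatized,
    # stopwords removed) to match the model's training data.
    print(generate("pas juriti mačka"))
```

Since the model was trained on lemmatized, stopword-free text, raw Serbian sentences would fall outside its training distribution; preprocess inputs accordingly.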

Model size: 138M parameters (Safetensors; tensor types F32, U8)
