This model was trained to evaluate linguistic acceptability and grammaticality. It was fine-tuned from bert-base-german-cased.

To use the model:

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="EIStakovskii/bert-base-german-cased_fluency")

# Score a deliberately garbled sentence
print(classifier("Wissqween Sisssasde, adddddqwe12was Mdddilednberg war, 122huh?"))

Label_1 means ACCEPTABLE - the sentence is perfectly understandable to native speakers and has no serious grammatical or syntactic flaws.

Label_0 means NOT ACCEPTABLE - the sentence is flawed both orthographically and grammatically.
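
To make the two labels easier to read in downstream code, a small helper can translate them into the names used above. This is only a sketch: it assumes each prediction is a dict with "label" and "score" keys and that the label strings match LABEL_0 / LABEL_1 up to letter case. The helper name `describe` and the example score are hypothetical, not real model output.

```python
# Readable names for the two labels described above.
# Assumption: label strings match "LABEL_0"/"LABEL_1" up to letter case.
LABEL_NAMES = {"LABEL_0": "NOT ACCEPTABLE", "LABEL_1": "ACCEPTABLE"}


def describe(prediction):
    """Return (readable_label, score) for one prediction dict."""
    raw = prediction["label"].upper()
    return LABEL_NAMES[raw], prediction["score"]


# Illustrative input only; the score is made up, not real model output.
example = {"label": "LABEL_0", "score": 0.99}
print(describe(example))
```

If the classifier returns a list of predictions, `describe` can be applied to each element in turn.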

The model was trained on 50 thousand German sentences from the news_commentary dataset. Of these, 25 thousand were algorithmically corrupted using an open-source Python library. The library was originally developed by aylliote and was slightly adapted for the purposes of this model.
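
The general idea of building such a training set can be sketched as follows. This is not the library mentioned above and not the procedure actually used for this model: the functions `corrupt` and `build_dataset`, the choice of edit operations (delete, duplicate, swap), and the number of edits are all hypothetical, chosen only to illustrate how half of a corpus could be corrupted and labeled 0 while the other half stays intact and is labeled 1.

```python
import random


def corrupt(sentence, rng, n_edits=3):
    """Apply up to n_edits random character edits (delete, duplicate, swap)."""
    chars = list(sentence)
    for _ in range(n_edits):
        if len(chars) < 2:
            break
        i = rng.randrange(len(chars) - 1)
        op = rng.choice(["delete", "duplicate", "swap"])
        if op == "delete":
            del chars[i]
        elif op == "duplicate":
            chars.insert(i, chars[i])
        else:
            # Swap two neighbouring characters
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def build_dataset(sentences, seed=0):
    """Keep the first half intact (label 1), corrupt the second half (label 0)."""
    rng = random.Random(seed)
    half = len(sentences) // 2
    clean = [(s, 1) for s in sentences[:half]]
    noisy = [(corrupt(s, rng), 0) for s in sentences[half:]]
    return clean + noisy


sentences = [
    "Das ist ein Satz.",
    "Heute scheint die Sonne.",
    "Ich lese ein Buch.",
    "Wir gehen nach Hause.",
]
for text, label in build_dataset(sentences):
    print(label, text)
```

A fixed seed keeps the corruption reproducible, which makes it easier to regenerate the same training split.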

Model size (Safetensors): 109M params. Tensor types: I64, F32.

Dataset used to train EIStakovskii/bert-base-german-cased_fluency