This model is a fine-tuned version of Leonard Konle's fiction-gbert.

It was fine-tuned for ten epochs on the Deutscher Roman Korpus (DROC) for literary character detection, using a standard token-classification head. Unlike most comparable models, however, it detects both named entities and common nouns that refer to a character (matching both "Harry" and "Zauberer").
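As a minimal usage sketch, the model can be loaded with the Hugging Face token-classification pipeline. The repository id below is a placeholder rather than the actual id, and the call itself is illustrative, not part of the original card:

```python
from transformers import pipeline

# Placeholder repository id -- substitute the actual id of this model.
character_tagger = pipeline(
    "token-classification",
    model="<this-model-repo-id>",
    aggregation_strategy="simple",  # merge B-PER/I-PER tokens into character spans
)

text = "Harry sah den alten Zauberer am Fenster stehen."
for entity in character_tagger(text):
    # Each aggregated entity carries the span text, its label, and a confidence score.
    print(entity["word"], entity["entity_group"], round(entity["score"], 3))
```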

The model achieves an F1 score of 92.12 % on the semi-official DROC validation set and 89.98 % on the test set.

The code to reproduce the dataset and training is available on GitHub.

Training hyperparameters (see the configuration sketch after this list):

  • Num epochs: 10
  • Batch size: 8
  • Optimizer: AdamW
  • Learning rate: 2e-05
  • Weight decay: 0.1
  • Scheduler: linear warmup over the first 10 % of training, followed by linear decay for the remainder
  • Precision: 32-bit
  • Training framework: Trident
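The original run used the Trident framework, but as an illustration the listed values map directly onto Hugging Face TrainingArguments. This is a hedged sketch, not the authors' actual configuration, and the output directory is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="droc-character-detection",  # hypothetical output path
    num_train_epochs=10,
    per_device_train_batch_size=8,
    optim="adamw_torch",              # AdamW optimizer
    learning_rate=2e-5,
    weight_decay=0.1,
    lr_scheduler_type="linear",       # linear decay after warmup
    warmup_ratio=0.1,                 # warmup over the first 10 % of steps
    fp16=False,                       # full 32-bit precision
)
```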

ID2Label map:

{
  0: "O",
  1: "B-PER",
  2: "I-PER"
}
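To show how this map is applied at inference time, here is a sketch of manual decoding without the pipeline wrapper; the repository id is again a placeholder:

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_id = "<this-model-repo-id>"  # placeholder -- substitute the actual id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

inputs = tokenizer("Der Zauberer lächelte.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map each token's highest-scoring class id through the id2label map above.
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, predictions):
    print(token, model.config.id2label[int(pred)])
```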