vera-8/mT5-base-trimmed_deplain-apa


Model Card for mT5-base-trimmed_deplain-apa

A fine-tuned mT5 model for German sentence-level text simplification.

Model Details

Model Description

Model type: Encoder-decoder Transformer
Language(s) (NLP): German
Fine-tuned from model: google/mt5-base
Task: Text simplification
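
A minimal inference sketch follows. It assumes the repository hosts full model weights that load with `AutoModelForSeq2SeqLM`; if it only contains a LoRA adapter, it would need to be loaded through `peft` instead. The card does not mention a task prefix, so none is added here, and `max_new_tokens=64` is an arbitrary illustrative value.

```python
MODEL_ID = "vera-8/mT5-base-trimmed_deplain-apa"


def load_model(model_id=MODEL_ID):
    """Download tokenizer and model (assumes full seq2seq weights in the repo)."""
    # Imported lazily so that simplify() can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return model, tokenizer


def simplify(sentence, model, tokenizer, max_new_tokens=64):
    """Generate a simplified version of one German sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # generate() returns a batch; decode the first (and only) sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (requires network access and the transformers + torch packages):
# model, tokenizer = load_model()
# print(simplify("Ein illustrativer, frei erfundener Beispielsatz.", model, tokenizer))
```

Because the model works at sentence level, longer texts would presumably need to be split into sentences and passed to `simplify` one at a time.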

Training Details

Training Data

Dataset: DEplain/DEplain-APA-sent
Reference: Stodden et al. (2023), arXiv:2305.18939

Training Procedure

Parameter-efficient fine-tuning (PEFT) with LoRA. The vocabulary was trimmed to the 32,000 most frequent tokens for German.
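
The card does not describe how the trimming was carried out. The sketch below shows one common approach under stated assumptions: count token frequencies over a tokenized German corpus, keep the special tokens plus the most frequent ids, and slice the embedding matrix accordingly. The function names and the toy data are hypothetical and use plain Python lists in place of real tensors.

```python
from collections import Counter


def select_kept_ids(tokenized_corpus, keep, always_keep=()):
    """Return the sorted old token ids to keep: special ids first, then the most frequent."""
    counts = Counter(tok for sentence in tokenized_corpus for tok in sentence)
    kept = set(always_keep)  # e.g. pad / eos / unk ids must survive trimming
    for tok, _ in counts.most_common():
        if len(kept) >= keep:
            break
        kept.add(tok)
    return sorted(kept)


def trim_embeddings(embedding_rows, kept_ids):
    """Slice embedding rows to the kept ids and build the old-to-new id mapping."""
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    trimmed = [embedding_rows[old] for old in kept_ids]
    return trimmed, old_to_new


# Toy example: a 10-token vocabulary trimmed to 3 entries.
corpus = [[0, 5, 7, 5, 1], [0, 7, 5, 9, 1]]
kept_ids = select_kept_ids(corpus, keep=3, always_keep=(0, 1))
embeddings = [[float(i)] for i in range(10)]
trimmed, old_to_new = trim_embeddings(embeddings, kept_ids)
```

In a real model the same row selection would apply to both the input embeddings and the output projection, and the tokenizer would have to be rebuilt so that it emits the new ids.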

Training Hyperparameters

Batch Size: 16
Epochs: 1
Learning Rate: 0.001
Optimizer: Adafactor
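
As a rough sketch, these values could map onto Hugging Face `Seq2SeqTrainingArguments` as shown below. The card does not say which training framework was used, whether the batch size of 16 is per device or global, or what the output directory was, so those choices are assumptions made for illustration.

```python
import math

# Hyperparameters from the card, keyed by Seq2SeqTrainingArguments parameter names.
TRAINING_SETTINGS = {
    "per_device_train_batch_size": 16,  # assumption: the card's batch size is per device
    "num_train_epochs": 1,
    "learning_rate": 1e-3,
    "optim": "adafactor",
}


def build_training_args(output_dir="mt5-deplain-apa-lora"):  # hypothetical directory name
    """Create training arguments; needs the transformers package."""
    # Imported lazily so the rest of this block runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_SETTINGS)


def steps_per_epoch(num_examples, batch_size=16):
    """Number of optimizer steps in one pass over the data (no gradient accumulation)."""
    return math.ceil(num_examples / batch_size)
```

With a single epoch, `steps_per_epoch` also gives the total number of update steps for a given training-set size.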

LoRA Hyperparameters

Rank (r): 32
Alpha: 64
Dropout: 0.1
Target modules: all linear layers
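
The settings above could be expressed with the `peft` library roughly as follows. This is a sketch, not the author's confirmed setup: it assumes `peft` was used, that "all linear layers" corresponds to the `"all-linear"` shortcut, and that the task type is sequence-to-sequence language modeling. The two helper functions are illustrative additions that show what the rank and alpha values imply.

```python
# LoRA hyperparameters from the card, keyed by peft LoraConfig parameter names.
LORA_SETTINGS = {
    "r": 32,
    "lora_alpha": 64,
    "lora_dropout": 0.1,
    "target_modules": "all-linear",  # assumption: peft shortcut for every linear layer
}


def build_lora_config():
    """Create a LoraConfig for a seq2seq model; needs the peft package."""
    # Imported lazily so the helpers below run without peft installed.
    from peft import LoraConfig, TaskType

    return LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, **LORA_SETTINGS)


def lora_scaling(r, lora_alpha):
    """Standard LoRA scaling factor applied to the low-rank update (alpha / r)."""
    return lora_alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)
```

With these values the low-rank update is scaled by 64 / 32 = 2. For a hypothetical 768 × 768 projection, the adapter adds 32 × (768 + 768) = 49,152 trainable parameters, compared with 589,824 weights in the frozen layer itself.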
