output

This model is a fine-tuned version of google/mt5-base on the x-tech/cantonese-mandarin-translations dataset.

Model description

The model translates Cantonese sentences to Mandarin.

Intended uses & limitations

When using the model, prepend the prefix "translate cantonese to mandarin: " (note the space after the colon) to the text you want to translate, so that the input has the form translate cantonese to mandarin: <sentence>.
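A minimal sketch of how the prefix might be applied with the transformers library. The model ID botisan-ai/mt5-translate-yue-zh is assumed to be this model's Hub ID; the helper names build_prompt and translate and the max_length value are illustrative assumptions, not part of the original card.

```python
# Task prefix required by the model; note the trailing space after the colon.
PREFIX = "translate cantonese to mandarin: "


def build_prompt(sentence: str) -> str:
    """Prepend the task prefix to a Cantonese sentence (hypothetical helper)."""
    return PREFIX + sentence.strip()


def translate(sentence: str, model_id: str = "botisan-ai/mt5-translate-yue-zh") -> str:
    """Translate one sentence; downloads the model on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(sentence), return_tensors="pt")
    # max_length=64 is an assumed cap on output length, not a documented setting.
    output_ids = model.generate(**inputs, max_length=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling translate("你食咗飯未呀?") would then send the prefixed string to the model. Omitting the prefix, or the space after the colon, gives the model an input unlike what it saw during fine-tuning.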

Training and evaluation data

Training Dataset: x-tech/cantonese-mandarin-translations

Training procedure

Training is based on an example script from the transformers library.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
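As a rough illustration, the settings above could be written as keyword arguments in the style of transformers.TrainingArguments. The key names are an assumed mapping (for example, train_batch_size is treated as a per-device value on a single device). The linear_lr helper is hypothetical: it sketches what a linear schedule does when no warmup is configured, and warmup is not listed on this card. The total number of steps depends on the dataset size, which is not stated here.

```python
# Hyperparameters from the list above, keyed by assumed TrainingArguments names.
HYPERPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}


def linear_lr(step: int, total_steps: int,
              base_lr: float = HYPERPARAMS["learning_rate"]) -> float:
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps
```

With a hypothetical total of 1000 optimizer steps, linear_lr(500, 1000) gives half the initial rate, and the rate reaches zero at the final step.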

Training results

No evaluation results are available yet because a validation set has not been set up.

Framework versions

  • Transformers 4.12.5
  • PyTorch 1.8.1
  • Datasets 1.15.1
  • Tokenizers 0.10.3
