---
language:
- en
- ko
license: apache-2.0
library_name: transformers
tags:
- translation
- t5
- en-to-ko
datasets:
- aihub-koen-translation-integrated-base-10m
metrics:
- bleu
model-index:
- name: traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko
results:
- task:
name: Translation
type: translation
dataset:
name: AIHub KO-EN Translation Integrated Base (10M)
type: aihub-koen-translation-integrated-base-10m
metrics:
- name: BLEU
type: bleu
value: 18.838066
epoch: 2
- name: BLEU
type: bleu
value: 18.006119
epoch: 1
---
# Model Description
**traintogpb-ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko** is an English-to-Korean machine translation model. It was fine-tuned from [KETI-AIR/ke-t5-base](https://huggingface.co/KETI-AIR/ke-t5-base) on the [aihub-koen-translation-integrated-base-10m](https://huggingface.co/datasets/traintogpb/aihub-koen-translation-integrated-base-10m) dataset.
## Model Architecture
The model uses the ke-t5-base architecture, which is based on the T5 (Text-to-Text Transfer Transformer) model.
## Training Data
The model was trained on the aihub-koen-translation-integrated-base-10m dataset, which is designed for English-to-Korean translation tasks.
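For reference, the dataset can be loaded with the `datasets` library. The repo id comes from the dataset link above; the split name below is an assumption, so inspect the loaded object first:

```python
from datasets import load_dataset

# Repo id from the dataset link above; available splits/columns may differ
dataset = load_dataset("traintogpb/aihub-koen-translation-integrated-base-10m")
print(dataset)              # inspect the available splits and columns
print(dataset["train"][0])  # assumed "train" split with an English-Korean pair
```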
## Training Procedure
### Training Parameters
The model was trained with the following parameters (mirrored in the configuration sketch after this list):
- Learning Rate: 0.0005
- Weight Decay: 0.01
- Batch Size: 64 (training), 128 (evaluation)
- Number of Epochs: 2
- Save Steps: 500
- Max Save Checkpoints: 2
- Evaluation Strategy: At the end of each epoch
- Logging Strategy: No logging
- Use of FP16: No
- Gradient Accumulation Steps: 2
- Reporting: None
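For reference, a minimal `Seq2SeqTrainingArguments` configuration mirroring these settings might look like the sketch below; `output_dir` is a placeholder, and everything else maps one-to-one onto the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./ke-t5-base-en-to-ko",  # placeholder path
    learning_rate=5e-4,                  # 0.0005
    weight_decay=0.01,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=128,
    num_train_epochs=2,
    save_steps=500,
    save_total_limit=2,                  # max save checkpoints
    evaluation_strategy="epoch",
    logging_strategy="no",
    fp16=False,
    gradient_accumulation_steps=2,
    report_to="none",
)
```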
### Hardware
The training was performed on a single GPU system with an NVIDIA A100 (40GB).
## Performance
The model achieved the following BLEU scores during training:
- Epoch 1: 18.006119
- Epoch 2: 18.838066
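For context, BLEU scores like these are typically computed at the corpus level with sacreBLEU; here is a minimal sketch using the `evaluate` library, with illustrative sentences:

```python
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["모델이 생성한 번역문입니다."]        # model outputs (illustrative)
references = [["사람이 작성한 참조 번역문입니다."]]  # one or more references each
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus BLEU; this model reports ~18.84 at epoch 2
```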
## Usage
This model is suitable for applications that translate English text into Korean. Here is an example of how to use it with Hugging Face Transformers:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hub repo id; the "traintogpb/" namespace is inferred from the model name
model_name = "traintogpb/ke-t5-base-aihub-koen-translation-integrated-10m-en-to-ko"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize the English input and generate the Korean translation
inputs = tokenizer.encode("This is a sample text.", return_tensors="pt")
outputs = model.generate(inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
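Note that `generate` uses greedy decoding by default; passing `num_beams=4` (beam search) typically yields better translations at some cost in speed.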