mt5-base-thaisum
This repository contains a fine-tuned mT5-base model for Thai text summarization. The model is based on the mT5 architecture and was fine-tuned on Thai text-summary pairs.
Example
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Load the fine-tuned tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("preechanon/mt5-base-thaisum-text-summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("preechanon/mt5-base-thaisum-text-summarization")

new_input_string = "ข้อความที่ต้องการ"  # the Thai text you want to summarize
input_ = tokenizer(new_input_string, truncation=True, max_length=1024, return_tensors="pt")

# Generate a summary with beam search
with torch.no_grad():
    preds = model.generate(
        input_['input_ids'].to('cpu'),
        num_beams=15,
        num_return_sequences=1,
        no_repeat_ngram_size=1,
        remove_invalid_values=True,
        max_length=140,
    )

summary = tokenizer.decode(preds[0], skip_special_tokens=True)
print(summary)
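The example above runs entirely on CPU. On a machine with a CUDA GPU, the same call can be moved to the device; the following is a minimal sketch (device handling is not part of the original example):

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
with torch.no_grad():
    preds = model.generate(input_['input_ids'].to(device), num_beams=15, max_length=140)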
Score
- Rouge1: 0.488931
- Rouge2: 0.309732
- RougeL: 0.425490
- RougeLsum: 0.444359
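The card does not say how these scores were computed. One common way to reproduce ROUGE numbers for a model like this is Hugging Face's evaluate library; the snippet below is a sketch under that assumption, and the whitespace tokenizer passed for Thai scoring is a guess at the original setup:

import evaluate

rouge = evaluate.load("rouge")
predictions = ["summary produced by the model"]  # model outputs
references = ["gold reference summary"]          # ground-truth summaries
scores = rouge.compute(
    predictions=predictions,
    references=references,
    tokenizer=lambda text: text.split(),  # assumption: whitespace tokenization for Thai
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}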
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-04
- train_batch_size: 8
- eval_batch_size: 1
- seed: 42
- optimizer: AdamW with betas=(0.9,0.999), epsilon=1e-08 and weight_decay=0.1
- warmup_steps: 5000
- lr_scheduler_type: linear
- num_epochs: 6
- gradient_accumulation_steps: 4
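For reference, these settings map roughly onto transformers' Seq2SeqTrainingArguments as in the sketch below; the output directory and the surrounding Trainer and dataset wiring are assumptions, not part of the original card:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-thaisum",  # assumed output path
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    weight_decay=0.1,
    warmup_steps=5000,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    gradient_accumulation_steps=4,
)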
Framework versions
- Transformers 4.36.1
- Pytorch 2.1.2
Resource Funding
This work was supported by the NSTDA Supercomputer Center (ThaiSC) and the National e-Science Infrastructure Consortium, which provided the computing facilities.
Citation
ปรีชานนท์ ชาติไทย and สัจจวัจน์ ส่งเสริม. (2024). Thai News Text Summarization Using Neural Network (การสรุปข้อความข่าวภาษาไทยด้วยโครงข่ายประสาทเทียม). Bachelor of Science (B.Sc.) thesis, Khon Kaen University, Khon Kaen.