# mt5-base-thaisum
This repository contains an mT5-base model fine-tuned for Thai text summarization. The model uses the mT5 architecture and was fine-tuned on Thai text–summary pairs.
## Example
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

tokenizer = AutoTokenizer.from_pretrained("preechanon/mt5-base-thaisum-text-summarization")
model = AutoModelForSeq2SeqLM.from_pretrained("preechanon/mt5-base-thaisum-text-summarization")

# Thai input text to summarize.
new_input_string = "ข้อความที่ต้องการ"

# Tokenize, truncating long inputs to 1024 tokens.
input_ = tokenizer(new_input_string, truncation=True, max_length=1024, return_tensors="pt")

# Generate a summary with beam search; no gradients are needed for inference.
with torch.no_grad():
    preds = model.generate(
        input_["input_ids"].to("cpu"),
        num_beams=15,
        num_return_sequences=1,
        no_repeat_ngram_size=1,
        remove_invalid_values=True,
        max_length=140,
    )

# Decode the best beam, dropping special tokens.
summary = tokenizer.decode(preds[0], skip_special_tokens=True)
print(summary)
```
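For quick experiments, the model can also be loaded through the high-level `pipeline` API. This is a minimal sketch; the generation settings simply mirror the example above and are not prescribed defaults for this model:

```python
from transformers import pipeline

# Load the model into a summarization pipeline
# (CPU by default; pass device=0 to use the first GPU).
summarizer = pipeline("summarization", model="preechanon/mt5-base-thaisum-text-summarization")

# Generation kwargs are forwarded to model.generate().
result = summarizer("ข้อความที่ต้องการ", num_beams=15, no_repeat_ngram_size=1, max_length=140)
print(result[0]["summary_text"])
```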
## Score

- ROUGE-1: 0.488931
- ROUGE-2: 0.309732
- ROUGE-L: 0.425490
- ROUGE-Lsum: 0.444359
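For reference, here is a minimal sketch of how such scores can be computed with the Hugging Face `evaluate` library. The exact test split and any Thai-specific tokenization behind the numbers above are not stated here, so this setup is an assumption:

```python
import evaluate

# Load the ROUGE metric; compute() returns rouge1, rouge2, rougeL, and rougeLsum.
rouge = evaluate.load("rouge")

predictions = ["model-generated summary ..."]  # placeholder model outputs
references = ["reference summary ..."]         # placeholder gold summaries

scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```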
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-04
- train_batch_size: 8
- eval_batch_size: 1
- seed: 42
- optimizer: AdamW with betas=(0.9,0.999), epsilon=1e-08 and weight_decay=0.1
- warmup_steps: 5000
- lr_scheduler_type: linear
- num_epochs: 6
- gradient_accumulation_steps: 4
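As an illustration, the hyperparameters listed above map onto `transformers.Seq2SeqTrainingArguments` roughly as follows. The `output_dir` and anything not listed above are placeholders; this is a sketch, not the authors' actual training script:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-thaisum",   # placeholder path
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=1,
    seed=42,
    weight_decay=0.1,                # AdamW is the Trainer's default optimizer
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    warmup_steps=5000,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    gradient_accumulation_steps=4,
)
```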
## Framework versions
- Transformers 4.36.1
- PyTorch 2.1.2
## Resource Funding

We thank the NSTDA Supercomputer Center (ThaiSC) and the National e-Science Infrastructure Consortium for their support of computing facilities.
## Citation

ปรีชานนท์ ชาติไทย and สัจจวัจน์ ส่งเสริม. (2024). Thai News Text Summarization Using Neural Network (การสรุปข้อความข่าวภาษาไทยด้วยโครงข่ายประสาทเทียม). B.Sc. thesis, Khon Kaen University, Khon Kaen.