Model introduction

  • Base model: t5-3b
  • Data: wmt16 ro-en (610,320 examples in total; 2,000 of them were used)
  • Distributed training tool: DeepSpeed
  • GPU: one RTX 4090, 24 GB
  • Goal: train the model's translation ability on ro-en
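
The setup above pairs a roughly 3B-parameter model with a single 24 GB card, which is presumably why DeepSpeed is used. The following back-of-the-envelope sketch shows the arithmetic. It assumes plain fp32 training with an Adam-style optimizer that keeps two fp32 state tensors per parameter, and it ignores activations; neither the optimizer nor the DeepSpeed configuration is stated in this card, so treat the numbers as an illustration only. The function name `training_memory_gb` is hypothetical.

```python
# Rough training-memory estimate for a 2.95B-parameter model.
# Assumptions (NOT stated in the model card): fp32 weights and gradients,
# an Adam-style optimizer with two fp32 states per parameter,
# activations and framework overhead ignored.

GB = 1e9  # decimal gigabytes


def training_memory_gb(n_params, bytes_per_value=4, optimizer_states=2):
    """Return a per-component memory estimate in GB."""
    weights = n_params * bytes_per_value / GB
    gradients = weights                      # one gradient value per weight
    optimizer = optimizer_states * weights   # e.g. Adam's two moment estimates
    total = weights + gradients + optimizer
    return {
        "weights": weights,
        "gradients": gradients,
        "optimizer": optimizer,
        "total": total,
    }


estimate = training_memory_gb(2.95e9)
gpu_memory_gb = 24  # one RTX 4090, as listed above

for name, value in estimate.items():
    print(f"{name:>10}: {value:.1f} GB")
print(f"fits on GPU without offloading: {estimate['total'] <= gpu_memory_gb}")

# Share of the dataset that was actually used for training.
subset_fraction = 2000 / 610320
print(f"training subset: {subset_fraction:.2%} of wmt16 ro-en")
```

Under these assumptions the weights alone take about 11.8 GB and the full training state about 47 GB, so some form of sharding or CPU offloading (which DeepSpeed can provide) would be needed on a 24 GB card.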

Usage

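from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
t5_qa_model = AutoModelForSeq2SeqLM.from_pretrained("snowfly/t5-3b-wmt16-ro-en")
t5_tok = AutoTokenizer.from_pretrained("snowfly/t5-3b-wmt16-ro-en")

# Tokenize the input text into a batch of PyTorch token IDs.
input_ids = t5_tok("When was Franklin D. Roosevelt born?",
                   return_tensors="pt").input_ids
# Generate with the default settings and take the first (only) sequence.
gen_output = t5_qa_model.generate(input_ids)[0]

# Decode the generated IDs back to text, dropping special tokens such as </s>.
print(t5_tok.decode(gen_output, skip_special_tokens=True))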
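
Since the model was trained for ro-en translation, inputs would normally be translation sources rather than questions. The sketch below shows one way to turn wmt16-style records into prefixed source and target strings. It assumes the Romanian-to-English direction and the T5-style task prefix `"translate Romanian to English: "`; the card does not state which direction or prefix was used in training, so both are assumptions. The helper `build_pairs` and the sample sentence are hypothetical illustrations, not part of the original training code.

```python
# Build T5-style (source, target) string pairs from wmt16-style records.
# Assumptions (NOT confirmed by the model card): translation direction is
# Romanian -> English and the task prefix follows the usual T5 convention.

PREFIX = "translate Romanian to English: "  # assumed prefix


def build_pairs(records, src="ro", tgt="en", prefix=PREFIX):
    """Return (sources, targets) lists from records shaped like
    {"translation": {"ro": "...", "en": "..."}}, the layout used by wmt16 ro-en."""
    sources = [prefix + record["translation"][src] for record in records]
    targets = [record["translation"][tgt] for record in records]
    return sources, targets


# Illustrative record only, not taken from the dataset.
sample = [{"translation": {"ro": "Bună ziua!", "en": "Good day!"}}]

sources, targets = build_pairs(sample)
print(sources[0])
print(targets[0])
```

A string from `sources` could then be passed to the tokenizer in place of the question used above, for example `t5_tok(sources[0], return_tensors="pt")`.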

Model size: 2.95B params (Safetensors, tensor type F32)