---
license: gpl-3.0
datasets:
  - wmt/wmt19
language:
  - en
  - zh
base_model: Mxode/NanoLM-365M-Base
pipeline_tag: translation
tags:
  - text-generation-inference
---

# NanoTranslator-immersive_translate-365M

English | [简体中文](README_zh-CN.md)

## Introduction

NanoTranslator-immersive_translate-365M is a model designed specifically for **Chinese-English bilingual** translation. It was trained on 6M sentence pairs from the [wmt-19](https://huggingface.co/datasets/wmt/wmt19) dataset, based on [NanoLM-365M-Base](https://huggingface.co/Mxode/NanoLM-365M-Base).

The model is trained to follow the [Immersive Translate](https://immersivetranslate.com/) prompt format and can be served behind an OpenAI-compatible API with tools such as vLLM or LMDeploy.

## How to use

Below is an example of calling the model with transformers. The prompt follows the Immersive Translate format to ensure the best results.

```python
import torch
from typing import Literal
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = 'Mxode/NanoTranslator-immersive_translate-365M'

model = AutoModelForCausalLM.from_pretrained(model_path).to('cuda:0', torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_path)

def translate(
    text: str,
    to: Literal["chinese", "english"] = "chinese",
    **kwargs
):
    # Sampling defaults; any extra kwargs are passed straight to generate().
    generation_args = dict(
        max_new_tokens = kwargs.pop("max_new_tokens", 512),
        do_sample = kwargs.pop("do_sample", True),
        temperature = kwargs.pop("temperature", 0.35),
        top_p = kwargs.pop("top_p", 0.8),
        top_k = kwargs.pop("top_k", 40),
        **kwargs
    )

    # Immersive Translate prompt format.
    prompt = """Translate the following source text to {to}. Output translation directly without any additional text.
Source Text: {text}
Translated Text:"""

    messages = [
        {"role": "system", "content": "You are a professional, authentic machine translation engine."},
        {"role": "user", "content": prompt.format(to=to, text=text)}
    ]

    inputs = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([inputs], return_tensors="pt").to(model.device)

    generated_ids = model.generate(model_inputs.input_ids, **generation_args)
    # Strip the prompt tokens so only the generated translation remains.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

text = "After a long day at work, I love to unwind by cooking a nice dinner and watching my favorite TV series. It really helps me relax and recharge for the next day."

response = translate(text=text, to='chinese')
print(f'Translation: {response}')
"""
Translation: 工作了一天,我喜欢吃一顿美味的晚餐,看我最喜欢的电视剧,这样做有助于我放松,补充能量。
"""
```
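
As mentioned in the introduction, the model can also sit behind an OpenAI-compatible server. Below is a minimal sketch of querying it that way, assuming a local vLLM instance started with something like `vllm serve Mxode/NanoTranslator-immersive_translate-365M` and listening on the default port 8000; the `base_url`, `api_key`, and source text here are illustrative assumptions, not part of this repository.

```python
# A minimal sketch: query the model through an OpenAI-compatible endpoint
# (assumed to be a local vLLM server on the default port 8000).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Same Immersive Translate prompt format as the transformers example above.
prompt = """Translate the following source text to chinese. Output translation directly without any additional text.
Source Text: Hello, how are you today?
Translated Text:"""

completion = client.chat.completions.create(
    model="Mxode/NanoTranslator-immersive_translate-365M",
    messages=[
        {"role": "system", "content": "You are a professional, authentic machine translation engine."},
        {"role": "user", "content": prompt},
    ],
    temperature=0.35,
    top_p=0.8,
    max_tokens=512,
)
print(completion.choices[0].message.content)
```

The same client code should work against an LMDeploy OpenAI-compatible server as well, with only the startup command changing.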