ICDAR 2025 Competition on End-to-end Document Image Machine Translation (OCR-free Track)
This is the official repository for the ICDAR 2025 Competition on End-to-end Document Image Machine Translation (OCR-free Track).
For Participants
Participants must translate every image in competition_testset_images.zip into Simplified Chinese (zh-CN) using an end-to-end, OCR-free method and record the results in an answer.json file.
The answer.json file should then be zipped and submitted to CodaLab.
In answer.json, each key is an image file name and each value is the translation of that image as a single string (str), segmented with jieba (jieba cut).
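The submission format described above could be produced roughly as follows. This is a minimal sketch, not the organizers' script. Several things here are assumptions: the helper names (build_answers, missing_images, write_submission), the archive name answer.zip, the use of bare file names as keys, and joining the jieba tokens with single spaces. Check the competition page for the exact conventions before submitting.

```python
import json
import zipfile
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def default_segmenter(text: str) -> Iterable[str]:
    """Segment text with jieba (third-party package: pip install jieba)."""
    import jieba  # imported lazily so the other helpers work without it
    return jieba.cut(text)


def build_answers(
    translations: Dict[str, str],
    segment: Callable[[str], Iterable[str]] = default_segmenter,
) -> Dict[str, str]:
    """Map each image file name to its segmented translation string.

    Assumption: tokens are joined with single spaces and
    whitespace-only tokens are dropped.
    """
    answers = {}
    for image_name, text in translations.items():
        tokens = [tok for tok in segment(text) if tok.strip()]
        answers[image_name] = " ".join(tokens)
    return answers


def missing_images(answers: Dict[str, str], images_zip: str) -> List[str]:
    """Return file names found in the test-set zip that have no answer yet.

    Assumption: keys are bare file names, without directory prefixes.
    """
    with zipfile.ZipFile(images_zip) as zf:
        names = {Path(n).name for n in zf.namelist() if not n.endswith("/")}
    return sorted(names - set(answers))


def write_submission(answers: Dict[str, str], out_dir: str) -> Path:
    """Write answer.json and wrap it in a zip archive for upload."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / "answer.json"
    # ensure_ascii=False keeps the Chinese text human-readable in the file.
    json_path.write_text(
        json.dumps(answers, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    zip_path = out / "answer.zip"  # hypothetical archive name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(json_path, arcname="answer.json")
    return zip_path
```

A typical run would call build_answers on the model's raw translations, use missing_images against competition_testset_images.zip to confirm that no image was skipped, and then call write_submission to create the archive for upload.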
Training Dataset Download
The training dataset can be downloaded from this Hugging Face link.
Baseline Implementation
The baseline is a simple end-to-end document image machine translation model consisting of an image encoder and a translation decoder. Details can be found in Section 5.3 ("Base") of Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling (NAACL 2024 Main).