## Small print

Warning: This demo is highly experimental and not ready for production use.

This demo is a proof of concept for visualizing the semantic differences between two text documents. The input documents may or may not be written in the same language. In our paper, we evaluate three simple, unsupervised approaches based on BERT-like encoder models. This demo implements the approaches `DiffAlign` and `DiffDel` using the model [ZurichNLP/unsup-simcse-xlm-roberta-base](https://huggingface.co/ZurichNLP/unsup-simcse-xlm-roberta-base). See the [XLM-R model](https://huggingface.co/xlm-roberta-base) for a list of supported languages. The third approach, `DiffMask`, was not included in the demo because it is very slow.

More resources:
- Paper: https://arxiv.org/abs/2305.13303
- Code: https://github.com/ZurichNLP/recognizing-semantic-differences

## Citation

```bibtex
@article{vamvas-sennrich-2023-rsd,
      title={Towards Unsupervised Recognition of Semantic Differences in Related Documents},
      author={Jannis Vamvas and Rico Sennrich},
      year={2023},
      eprint={2305.13303},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```