Space: ZurichNLP/unsupervised-semantic-diff

jvamvas committed on Oct 16, 2023

Commit 0cda4f1 (1 parent: 42eceac)

Update description

Files changed (1)

description.md CHANGED

@@ -8,9 +8,10 @@ This demo is a proof of concept for visualizing the semantic differences between
 The input documents may or may not be written in the same language.
 In our paper, we evaluate three simple, unsupervised approaches based on BERT-like encoder models.
-This demo implements the approaches `DiffAlign` and `DiffDel` using the model [ZurichNLP/unsup-simcse-xlm-roberta-base](https://huggingface.co/ZurichNLP/unsup-simcse-xlm-roberta-base). See the [XLM-R model](https://huggingface.co/xlm-roberta-base) for a list of supported languages.
-The third approach, `DiffMask`, was not included in the demo because it is very slow.
+This demo implements the approaches `DiffAlign` and `DiffDel` using the model [ZurichNLP/unsup-simcse-xlm-roberta-base](https://huggingface.co/ZurichNLP/unsup-simcse-xlm-roberta-base). See the model tags for a list of the ~100 supported languages.
+- `DiffAlign` aligns the words of the two documents using cosine similarity between the word embeddings (cf. [SimAlign](http://dx.doi.org/10.18653/v1/2020.findings-emnlp.147), [BERTScore](https://openreview.net/forum?id=SkeHuCVFDr)). Words with low similarity are highlighted.
+- `DiffDel` calculates sentence similarity between the two input documents (cf. [SimCSE](http://dx.doi.org/10.18653/v1/2021.emnlp-main.552)). The algorithm highlights words whose deletion has a positive effect on the similarity score.
 More resources:
 - Paper: https://arxiv.org/abs/2305.13303
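
Note on the added bullets: the two approaches described in the new text can be illustrated with a small, self-contained sketch. This is not the demo's implementation. It assumes hand-made 3-dimensional toy word vectors in place of embeddings from `ZurichNLP/unsup-simcse-xlm-roberta-base`, and it assumes mean pooling of those fixed vectors as the document embedding, whereas a real system would re-encode the text after each deletion. The function names `diff_align` and `diff_del` are hypothetical and chosen only to mirror the approach names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_pool(embs):
    """Average a non-empty list of word vectors into one document vector (an assumption)."""
    return [sum(col) / len(embs) for col in zip(*embs)]


def diff_align(emb_a, emb_b):
    """DiffAlign-style score per word of A: 1 minus its best cosine match in B.

    A high score means the word has no close counterpart in the other document.
    """
    return [1.0 - max(cosine(u, v) for v in emb_b) for u in emb_a]


def diff_del(emb_a, emb_b):
    """DiffDel-style score per word of A: change in document similarity if the word is deleted.

    A positive score means that removing the word makes A more similar to B.
    """
    target = mean_pool(emb_b)
    base = cosine(mean_pool(emb_a), target)
    scores = []
    for i in range(len(emb_a)):
        reduced = emb_a[:i] + emb_a[i + 1:]
        # Deleting the only word leaves nothing to compare, so score it 0.0.
        scores.append(cosine(mean_pool(reduced), target) - base if reduced else 0.0)
    return scores


# Toy vectors (made up for illustration): each word gets its own axis.
toy = {"the": [1.0, 0.0, 0.0], "cat": [0.0, 1.0, 0.0], "sleeps": [0.0, 0.0, 1.0]}
words_a = ["the", "cat", "sleeps"]
words_b = ["the", "cat"]
emb_a = [toy[w] for w in words_a]
emb_b = [toy[w] for w in words_b]

align_scores = diff_align(emb_a, emb_b)
del_scores = diff_del(emb_a, emb_b)

for word, a, d in zip(words_a, align_scores, del_scores):
    print(f"{word:>7}  DiffAlign={a:.3f}  DiffDel={d:+.3f}")
```

In this toy input, "sleeps" is the only word of the first document without a counterpart in the second, so it receives the highest `diff_align` score and the only positive `diff_del` score.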