namdp-ptit/ViRanker

@@ -47,7 +47,7 @@ Get relevance scores (higher scores indicate more relevance):
 ```python
 from FlagEmbedding import FlagReranker
-reranker = FlagReranker('namdp/bge-reranker-vietnamese',
                         use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation
 score = reranker.compute_score(['tỉnh nào có diện tích lớn nhất việt nam', 'nghệ an có diện tích lớn nhất việt nam'])
@@ -89,8 +89,8 @@ Get relevance scores (higher scores indicate more relevance):
 import torch
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
-tokenizer = AutoTokenizer.from_pretrained('namdp/bge-reranker-vietnamese')
-model = AutoModelForSequenceClassification.from_pretrained('namdp/bge-reranker-vietnamese')
 model.eval()
 pairs = [
@@ -115,4 +115,32 @@ Train data should be a json file, where each line is a dict like this:
 `query` is the query, `pos` is a list of positive texts, `neg` is a list of negative texts, and `prompt` describes the
 relationship between the query and the texts. If a query has no negative texts, you can randomly sample some from the
-entire corpus as the negatives.

 ```python
 from FlagEmbedding import FlagReranker
+reranker = FlagReranker('namdp/ViRanker',
                         use_fp16=True)  # Setting use_fp16 to True speeds up computation with a slight performance degradation
 score = reranker.compute_score(['tỉnh nào có diện tích lớn nhất việt nam', 'nghệ an có diện tích lớn nhất việt nam'])
 import torch
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained('namdp/ViRanker')
+model = AutoModelForSequenceClassification.from_pretrained('namdp/ViRanker')
 model.eval()
 pairs = [
 `query` is the query, `pos` is a list of positive texts, `neg` is a list of negative texts, and `prompt` describes the
 relationship between the query and the texts. If a query has no negative texts, you can randomly sample some from the
+entire corpus as the negatives.
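
The record format described above can be sketched as follows. This is a minimal illustration, not the project's own tooling: the helper name `build_example`, the `num_neg` and `seed` parameters, the toy corpus, and the placeholder prompt text are all assumptions made for the example. It samples negatives at random from a corpus when a query has none, and excludes the positives from the candidate pool.

```python
import json
import random


def build_example(query, positives, corpus, num_neg=2, prompt="", seed=None):
    """Build one training record (hypothetical helper, not part of FlagEmbedding)."""
    rng = random.Random(seed)
    # Exclude the positives so a relevant text is never sampled as a negative.
    candidates = [text for text in corpus if text not in positives]
    negatives = rng.sample(candidates, min(num_neg, len(candidates)))
    return {"query": query, "pos": list(positives), "neg": negatives, "prompt": prompt}


# Toy corpus, invented purely for illustration.
corpus = [
    "nghệ an có diện tích lớn nhất việt nam",
    "hà nội là thủ đô của việt nam",
    "sông mê kông chảy qua nhiều quốc gia",
    "phở là một món ăn truyền thống",
]

record = build_example(
    query="tỉnh nào có diện tích lớn nhất việt nam",
    positives=["nghệ an có diện tích lớn nhất việt nam"],
    corpus=corpus,
    num_neg=2,
    prompt="<describe the query/passage relationship here>",  # placeholder text
    seed=0,
)

# One JSON object per line; ensure_ascii=False keeps Vietnamese text readable.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Writing one such `line` per query to a file gives the one-dict-per-line layout described above. Uniform random sampling is the simplest choice of negatives and is the one shown here.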
+## Performance
+The following table compares several pre-trained cross-encoders on the
+[MS MMarco Passage Reranking - Vi - Dev](https://huggingface.co/datasets/unicamp-dl/mmarco) dataset. Bold marks the best value in each column.
+| Model-Name                                                                                                                              | NDCG@3     | MRR@3      | NDCG@5     | MRR@5      | NDCG@10    | MRR@10     | Docs / Sec |
+|-----------------------------------------------------------------------------------------------------------------------------------------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|:-----------|
+| [namdp/ViRanker](https://huggingface.co/namdp/ViRanker)                                                                                 | **0.6685** | **0.6564** | 0.6842     | **0.6811** | 0.7278     | **0.6985** | 2.02       |
+| [itdainb/PhoRanker](https://huggingface.co/itdainb/PhoRanker)                                                                           | 0.6625     | 0.6458     | **0.7147** | 0.6731     | **0.7422** | 0.6830     | **15**     |
+| [kien-vu-uet/finetuned-phobert-passage-rerank-best-eval](https://huggingface.co/kien-vu-uet/finetuned-phobert-passage-rerank-best-eval) | 0.0963     | 0.0883     | 0.1396     | 0.1131     | 0.1681     | 0.1246     | **15**     |
+| [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)                                                               | 0.6087     | 0.5841     | 0.6513     | 0.6062     | 0.6872     | 0.62091    | 3.51       |
+| [BAAI/bge-reranker-v2-gemma](https://huggingface.co/BAAI/bge-reranker-v2-gemma)                                                         | 0.6088     | 0.5908     | 0.6446     | 0.6108     | 0.6785     | 0.6249     | 1.29       |
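
For readers unfamiliar with the NDCG@k and MRR@k columns, the sketch below shows the standard definitions for a single query with binary relevance labels. The evaluation script behind the table is not given here, so this is an assumption about the usual formulation (linear gain, log2 rank discount), and the scores and labels are invented for illustration.

```python
import math


def mrr_at_k(ranked_labels, k):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label > 0:
            return 1.0 / rank
    return 0.0


def dcg_at_k(labels, k):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(label / math.log2(rank + 1)
               for rank, label in enumerate(labels[:k], start=1))


def ndcg_at_k(ranked_labels, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_labels, reverse=True), k)
    return dcg_at_k(ranked_labels, k) / ideal if ideal > 0 else 0.0


# Hypothetical reranker scores and binary relevance labels for one query.
scores = [0.12, 0.45, 0.87, 0.03]
labels = [0, 1, 0, 0]

# Rank candidates by descending score, then read the labels in that order.
order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
ranked_labels = [labels[i] for i in order]

mrr3 = mrr_at_k(ranked_labels, 3)
ndcg3 = ndcg_at_k(ranked_labels, 3)
print(mrr3, ndcg3)
```

Here the only relevant passage is ranked second, so MRR@3 is 1/2 and NDCG@3 is 1/log2(3). A dataset-level figure like those in the table would be the mean of these per-query values over all evaluation queries.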
+## Citation
+Please cite this model as:
+```bibtex
+@misc{ViRanker,
+  title={ViRanker: A Cross-encoder Model for Vietnamese Text Ranking},
+  author={Nam Dang Phuong},
+  year={2024},
+  publisher={Huggingface},
+  journal={huggingface repository},
+  howpublished={\url{https://huggingface.co/namdp/ViRanker}},
+}
+```