namdp-ptit committed on
Commit 7fae5e8 · 1 Parent(s): bc8d82d
Update README.md
README.md CHANGED
@@ -116,6 +116,8 @@ Train data should be a json file, where each line is a dict like this:
 `query` is the query, `pos` is a list of positive texts, and `neg` is a list of negative texts. If you have no negative
 texts for a query, you can randomly sample some from the entire corpus as the negatives.

+In addition, for each query in the train data, we used LLMs to generate hard negatives by asking them to create a document that is the opposite of one of the documents in `pos`.
+
 ## Performance

 Below is a comparison table of the results we achieved compared to some other pre-trained Cross-Encoders on