How can I encode a large corpus dataset (over 200 million records) stored as a TSV file with intfloat/e5-large-v2 as the embedding model?
#15 opened 10 months ago by liorf95
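The thread has no answer yet; as a sketch of one approach (assuming one passage per row with the text in the first column, neither of which is stated in the thread): stream the file in batches instead of loading 200M rows into memory, and pool as the model card recommends (masked mean over the last hidden state, with the `passage: ` prefix).

```python
# Sketch: stream a large TSV and encode it in batches with e5-large-v2.
# Assumptions (not from the thread): text sits in column 1 of each row.
import csv
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-large-v2")
model = AutoModel.from_pretrained("intfloat/e5-large-v2").eval()

def average_pool(last_hidden, attention_mask):
    # Mean-pool token embeddings, ignoring padding positions.
    masked = last_hidden.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return masked.sum(dim=1) / attention_mask.sum(dim=1)[..., None]

@torch.no_grad()
def encode(texts):
    batch = tokenizer(["passage: " + t for t in texts],
                      max_length=512, padding=True,
                      truncation=True, return_tensors="pt")
    out = model(**batch)
    return F.normalize(average_pool(out.last_hidden_state,
                                    batch["attention_mask"]), dim=-1)

def stream_tsv(path, batch_size=64):
    # Yield batches of rows so the 200M-record file never sits in RAM at once.
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        batch = []
        for row in reader:
            batch.append(row[0])  # assumed: text in the first column
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            yield batch

for texts in stream_tsv("corpus.tsv"):
    embeddings = encode(texts)  # shape: (batch, 1024)
    # ...write `embeddings` to disk here (e.g. memory-mapped array or FAISS)
```

At this scale, writing each batch straight to a memory-mapped array or vector index matters more than the encoding loop itself.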
Comparison with multilingual-e5-large
#14 opened about 1 year ago by xuuxu
Single input vs. multiple inputs
#13 opened over 1 year ago by innovationTony · 1 reply
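With correct padding and masked pooling, a batch of inputs should yield (numerically near-) identical vectors to encoding each input on its own; batching only amortizes overhead. A quick check, using the sentence-transformers wrapper (my tooling choice, not the thread's):

```python
# Sketch: verify single-input vs. batched encoding agree.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large-v2")
texts = ["query: how much protein should a female eat",
         "query: summit define"]

batched = model.encode(texts, normalize_embeddings=True)
single = np.stack([model.encode(t, normalize_embeddings=True) for t in texts])

# Padding is masked out during pooling, so both paths should match closely.
print(np.max(np.abs(batched - single)))  # typically on the order of 1e-6
```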
Possible vector collapse issue
#10 opened over 1 year ago by Banso · 1 reply
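Context that often resolves this kind of report: e5 models compress cosine similarities into a narrow high band (roughly 0.7 to 1.0) because of the low temperature used in their contrastive training, which can look like collapse without being one. A quick diagnostic is to check whether rankings still separate related from unrelated pairs; the example sentences below are my own illustration:

```python
# Sketch: distinguish "all similarities look high" (expected for e5) from
# real collapse (rankings no longer separate related from unrelated texts).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large-v2")
texts = ["passage: cats are small domesticated carnivores",
         "passage: kittens grow into adult cats",
         "passage: the 2008 financial crisis began in the US"]
emb = model.encode(texts, normalize_embeddings=True)
sims = emb @ emb.T
# Absolute values will all be high (~0.7+); what matters is the ordering:
print(sims[0, 1] > sims[0, 2])  # related pair should outrank unrelated
```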
Changing the dimensions of the embeddings
#9 opened over 1 year ago by Suijhin · 1 reply
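The model's output dimension is fixed at 1024, and it was not trained for truncation-friendly (Matryoshka-style) embeddings, so shrinking the vectors is a post-processing step. One common option, sketched under the assumption that you have a sample of real embeddings to fit on, is PCA:

```python
# Sketch: reduce e5-large-v2's 1024-dim vectors with PCA. PCA is an
# assumption on my part, not an official recipe for this model.
import numpy as np
from sklearn.decomposition import PCA

embeddings = np.random.randn(10_000, 1024).astype("float32")  # placeholder;
# fit on real embeddings from your corpus for meaningful components.

pca = PCA(n_components=256)
reduced = pca.fit_transform(embeddings)  # shape: (10000, 256)
reduced /= np.linalg.norm(reduced, axis=1, keepdims=True)  # re-normalize
```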
Adding an ONNX file of this model
#5 opened over 1 year ago by asifanchor
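Until an official file lands in the repo, an export can be produced locally; the sketch below uses Hugging Face Optimum, which is my tooling choice rather than anything stated in the thread:

```python
# Sketch: export e5-large-v2 to ONNX with Optimum and save it locally.
from optimum.onnxruntime import ORTModelForFeatureExtraction
from transformers import AutoTokenizer

ort_model = ORTModelForFeatureExtraction.from_pretrained(
    "intfloat/e5-large-v2", export=True)
tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-large-v2")

ort_model.save_pretrained("e5-large-v2-onnx")
tokenizer.save_pretrained("e5-large-v2-onnx")
```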
Adding `safetensors` variant of this model
#4 opened over 1 year ago by SFconvertbot
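For reference, the equivalent of the bot's conversion can be run locally with a single transformers call (a sketch, not the bot's actual code):

```python
# Sketch: re-save the checkpoint in `safetensors` format.
from transformers import AutoModel

model = AutoModel.from_pretrained("intfloat/e5-large-v2")
model.save_pretrained("e5-large-v2-st", safe_serialization=True)
```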
e5-large-v2 requirements for training in non-English?
#3 opened over 1 year ago by wilfoderek · 2 replies
Which embedding vector to use?
#2 opened over 1 year ago by moooji · 8 replies
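Per the model card, the vector to use is a masked mean over the last hidden state (with the `query: ` / `passage: ` prefix), L2-normalized, rather than the raw [CLS] token. A condensed version of that usage snippet:

```python
# Sketch: the intended e5 embedding is a masked mean over the last hidden
# state, mirroring the model card; the [CLS] vector is not the intended output.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-large-v2")
model = AutoModel.from_pretrained("intfloat/e5-large-v2").eval()

batch = tokenizer(["query: what is the capital of France"],
                  max_length=512, truncation=True, return_tensors="pt")
with torch.no_grad():
    last_hidden = model(**batch).last_hidden_state

mask = batch["attention_mask"][..., None].bool()
mean_vec = (last_hidden.masked_fill(~mask, 0.0).sum(dim=1)
            / batch["attention_mask"].sum(dim=1, keepdim=True))
embedding = F.normalize(mean_vec, dim=-1)  # this is the vector to use
cls_vec = last_hidden[:, 0]                # not the intended output
```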
How can I support max_length=2048?
#1 opened almost 2 years ago by nlpdev3 · 6 replies
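e5-large-v2 accepts at most 512 input tokens, so max_length=2048 cannot be enabled directly. Splitting the text and averaging the chunk embeddings is one workaround; the sketch below assumes that approach (it is not a feature of the model itself):

```python
# Sketch: chunk a long input into <=512-token pieces and average the
# per-chunk embeddings. A workaround, not an official capability.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-large-v2")
model = AutoModel.from_pretrained("intfloat/e5-large-v2").eval()

@torch.no_grad()
def embed_long(text, chunk_tokens=510):
    # Tokenize once without special tokens, then re-wrap each chunk with
    # [CLS]/[SEP] so every chunk stays within the 512-token limit.
    ids = tokenizer("passage: " + text, add_special_tokens=False)["input_ids"]
    vecs = []
    for i in range(0, len(ids), chunk_tokens):
        chunk = ([tokenizer.cls_token_id] + ids[i:i + chunk_tokens]
                 + [tokenizer.sep_token_id])
        input_ids = torch.tensor([chunk])
        out = model(input_ids=input_ids,
                    attention_mask=torch.ones_like(input_ids))
        # Single unpadded sequence, so a plain mean over tokens is safe.
        vecs.append(out.last_hidden_state.mean(dim=1))
    return F.normalize(torch.cat(vecs).mean(dim=0, keepdim=True), dim=-1)
```

Averaging chunk vectors blurs position-specific signal, so for retrieval it is often better to index each chunk separately instead of merging them.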