A fork of thenlper/gte-small with ONNX weights, for use with Transformers.js.
gte-small
General Text Embeddings (GTE) model, introduced in the paper "Towards General Text Embeddings with Multi-stage Contrastive Learning".
The GTE models are trained by Alibaba DAMO Academy. They are based mainly on the BERT framework and come in three sizes: GTE-large, GTE-base, and GTE-small. They are trained on a large-scale corpus of relevance text pairs covering a wide range of domains and scenarios, which lets them be applied to various downstream text embedding tasks, including information retrieval, semantic textual similarity, and text reranking.
Metrics
The table below compares the GTE models with other popular text embedding models on the MTEB benchmark. For more detailed comparison results, refer to the MTEB leaderboard.
Model Name | Model Size (GB) | Dimension | Sequence Length | Average (56) | Clustering (11) | Pair Classification (3) | Reranking (4) | Retrieval (15) | STS (10) | Summarization (1) | Classification (12) |
---|---|---|---|---|---|---|---|---|---|---|---|
gte-large | 0.67 | 1024 | 512 | 63.13 | 46.84 | 85.00 | 59.13 | 52.22 | 83.35 | 31.66 | 73.33 |
gte-base | 0.22 | 768 | 512 | 62.39 | 46.2 | 84.57 | 58.61 | 51.14 | 82.3 | 31.17 | 73.01 |
e5-large-v2 | 1.34 | 1024 | 512 | 62.25 | 44.49 | 86.03 | 56.61 | 50.56 | 82.05 | 30.19 | 75.24 |
e5-base-v2 | 0.44 | 768 | 512 | 61.5 | 43.80 | 85.73 | 55.91 | 50.29 | 81.05 | 30.28 | 73.84 |
gte-small | 0.07 | 384 | 512 | 61.36 | 44.89 | 83.54 | 57.7 | 49.46 | 82.07 | 30.42 | 72.31 |
text-embedding-ada-002 | - | 1536 | 8192 | 60.99 | 45.9 | 84.89 | 56.32 | 49.25 | 80.97 | 30.8 | 70.93 |
e5-small-v2 | 0.13 | 384 | 512 | 59.93 | 39.92 | 84.67 | 54.32 | 49.04 | 80.39 | 31.16 | 72.94 |
sentence-t5-xxl | 9.73 | 768 | 512 | 59.51 | 43.72 | 85.06 | 56.42 | 42.24 | 82.63 | 30.08 | 73.42 |
all-mpnet-base-v2 | 0.44 | 768 | 514 | 57.78 | 43.69 | 83.04 | 59.36 | 43.81 | 80.28 | 27.49 | 65.07 |
sgpt-bloom-7b1-msmarco | 28.27 | 4096 | 2048 | 57.59 | 38.93 | 81.9 | 55.65 | 48.22 | 77.74 | 33.6 | 66.19 |
all-MiniLM-L12-v2 | 0.13 | 384 | 512 | 56.53 | 41.81 | 82.41 | 58.44 | 42.69 | 79.8 | 27.9 | 63.21 |
all-MiniLM-L6-v2 | 0.09 | 384 | 512 | 56.26 | 42.35 | 82.37 | 58.04 | 41.95 | 78.9 | 30.81 | 63.05 |
contriever-base-msmarco | 0.44 | 768 | 512 | 56.00 | 41.1 | 82.54 | 53.14 | 41.88 | 76.51 | 30.36 | 66.68 |
sentence-t5-base | 0.22 | 768 | 512 | 55.27 | 40.21 | 85.18 | 53.09 | 33.63 | 81.14 | 31.39 | 69.81 |
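The Dimension column also determines how much space stored embeddings take. The sketch below estimates that footprint, assuming each value is kept as a 32-bit float; the function name `embeddingStorageBytes` is hypothetical and not part of any library.

```javascript
// Hypothetical helper: bytes needed to store `count` embeddings of size
// `dimension`, assuming 4-byte float32 values and no index overhead.
function embeddingStorageBytes(dimension, count) {
  return dimension * Float32Array.BYTES_PER_ELEMENT * count;
}

// gte-small (384 dimensions) versus a 1536-dimensional model.
const smallPerVector = embeddingStorageBytes(384, 1);
const largePerVector = embeddingStorageBytes(1536, 1);
const smallPerMillion = embeddingStorageBytes(384, 1_000_000);

console.log(smallPerVector, largePerVector, smallPerMillion);
```

Under this assumption, a 384-dimensional vector needs a quarter of the space of a 1536-dimensional one, which matters when indexing large collections.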
Usage
Deno
import { env, pipeline } from "https://cdn.jsdelivr.net/npm/@xenova/transformers";
// Some config for Deno
env.useBrowserCache = false;
env.allowLocalModels = false;
// Give it any input you want
const input = "Hello AI";
// Create the pipeline
const pipe = await pipeline(
"feature-extraction",
"koxy-ai/gte-small"
);
// Generate the embedding
const output = await pipe(input, {
pooling: "mean",
normalize: true
});
// Extract the embedding from the output
const embedding = Array.from(output.data);
// Do anything with the embedding
console.log(embedding);
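The `pooling: "mean"` and `normalize: true` options used above turn per-token vectors into a single unit-length sentence vector. The sketch below illustrates the idea in plain JavaScript. It is a simplification, not the library's implementation: the helpers `meanPool` and `l2Normalize` are hypothetical, and every token is weighted equally (no attention mask is applied).

```javascript
// Hypothetical helper: average token vectors into one sentence vector.
// Simplification: every token counts equally (no attention mask).
function meanPool(tokenVectors) {
  const dim = tokenVectors[0].length;
  const pooled = new Array(dim).fill(0);
  for (const vec of tokenVectors) {
    for (let i = 0; i < dim; i++) pooled[i] += vec[i];
  }
  return pooled.map((sum) => sum / tokenVectors.length);
}

// Hypothetical helper: scale a vector to unit (L2) length.
function l2Normalize(vector) {
  const norm = Math.hypot(...vector);
  return norm === 0 ? vector.slice() : vector.map((x) => x / norm);
}

// Toy example with three 2-dimensional "token" vectors.
const pooled = meanPool([[1, 0], [3, 0], [2, 6]]);
const unit = l2Normalize(pooled);

console.log(pooled, unit);
```

Because the result has unit length, the dot product of two such vectors equals their cosine similarity.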
Browser
Using JavaScript modules:
<script type="module">
import { pipeline } from "https://cdn.jsdelivr.net/npm/@xenova/transformers";
// Create the pipeline once and reuse it across calls
let pipePromise;
const setPipe = () => {
pipePromise ??= pipeline(
"feature-extraction",
"koxy-ai/gte-small"
);
return pipePromise;
};
const generateEmbedding = async (input) => {
const pipe = await setPipe();
const output = await pipe(input, {
pooling: "mean",
normalize: true
});
return Array.from(output.data);
};
export default generateEmbedding;
</script>
Node.js
npm i @xenova/transformers
import { pipeline } from "@xenova/transformers";
(async () => {
// Give it any input you want
const input = "Hello AI";
// Create the pipeline
const pipe = await pipeline(
"feature-extraction",
"koxy-ai/gte-small"
);
// Generate the embedding
const output = await pipe(input, {
pooling: "mean",
normalize: true
});
// Extract the embedding from the output
const embedding = Array.from(output.data);
// Do anything with the embedding
console.log(embedding);
})();
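A common next step after extracting embeddings is to compare them, for example to rank documents against a query. The sketch below assumes the embeddings were produced with `normalize: true`, so a plain dot product equals cosine similarity. The helper names `dot` and `rankBySimilarity` are hypothetical, and the short vectors stand in for real 384-dimensional embeddings.

```javascript
// Dot product of two equal-length vectors. For unit-length embeddings
// (normalize: true), this equals their cosine similarity.
function dot(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Hypothetical helper: sort documents by similarity to a query embedding,
// most similar first.
function rankBySimilarity(queryEmbedding, docs) {
  return docs
    .map((doc) => ({ id: doc.id, score: dot(queryEmbedding, doc.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy unit vectors standing in for real embeddings.
const ranked = rankBySimilarity([1, 0], [
  { id: "a", embedding: [0, 1] },
  { id: "b", embedding: [0.6, 0.8] },
  { id: "c", embedding: [1, 0] },
]);

console.log(ranked.map((r) => r.id));
```

In practice, each `embedding` would be the array extracted from the pipeline output, as in the examples above.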
Limitation
This model supports English text only, and inputs longer than 512 tokens are truncated to the first 512 tokens.
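One way to work within the 512-token limit is to split long text into smaller pieces and embed each piece separately. The sketch below is a rough illustration under stated assumptions: the helper `chunkByWords` is hypothetical, it counts whitespace-separated words rather than model tokens, and the default of 256 words is an arbitrary guess, since one word can map to several tokens.

```javascript
// Hypothetical helper: split text into chunks of at most `maxWords` words.
// Word counts only approximate token counts; the default is a guess.
function chunkByWords(text, maxWords = 256) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Small demonstration with a two-word limit.
const chunks = chunkByWords("one two three four five", 2);

console.log(chunks);
```

Each chunk can then be passed to the pipeline on its own. How the resulting embeddings are combined (for example, by averaging) is an application-level choice.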
Citation
@misc{li2023general,
title={Towards General Text Embeddings with Multi-stage Contrastive Learning},
author={Zehan Li and Xin Zhang and Yanzhao Zhang and Dingkun Long and Pengjun Xie and Meishan Zhang},
year={2023},
eprint={2308.03281},
archivePrefix={arXiv},
primaryClass={cs.CL}
}