## Typesense Public Embedding Models
We maintain a repository of currently supported embedding models, and we welcome contributions from the community. To add a model to the supported list, convert it to the ONNX format and open a Pull Request (PR) that includes it.

### Convert a model to ONNX format

#### Converting a Hugging Face Transformers Model
To convert a Hugging Face model to ONNX format, follow the [ONNX export guide](https://huggingface.co/docs/transformers/serialization#export-to-onnx), which uses ```optimum-cli```.
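
As a minimal sketch of that export, the snippet below builds and runs an `optimum-cli export onnx` command from Python. The model ID, the output directory, and the `feature-extraction` task are illustrative assumptions rather than requirements of this repository; the linked guide lists the options that fit your model.

```python
import subprocess


def build_export_command(model_id, output_dir, task="feature-extraction"):
    """Return the optimum-cli argument list for exporting a Hub model to ONNX."""
    return [
        "optimum-cli", "export", "onnx",
        "--model", model_id,
        "--task", task,  # assumed task for embedding models
        output_dir,
    ]


def export_model(model_id, output_dir):
    """Run the export. Needs optimum with its ONNX exporter installed."""
    # check=True raises CalledProcessError if the export command fails.
    subprocess.run(build_export_command(model_id, output_dir), check=True)


# Hypothetical usage (both arguments are placeholders):
# export_model("my-org/my-embedding-model", "my-embedding-model")
```

Keeping the command in a list avoids shell quoting issues when the model ID or path contains unusual characters.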
#### Converting a PyTorch Model
If you have a PyTorch model, use the ```torch.onnx``` APIs to convert it to the ONNX format. The [PyTorch ONNX documentation](https://pytorch.org/docs/stable/onnx.html) describes the conversion process.
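
The following is a rough sketch of such an export with `torch.onnx.export`, assuming a BERT-style encoder whose `forward` takes `input_ids` and `attention_mask` and returns the token embeddings first. The tensor names, vocabulary size, dummy sequence length, and opset version are placeholders chosen for illustration; adjust them to your model.

```python
def build_dynamic_axes(input_names, output_names):
    """Mark batch (dim 0) and sequence (dim 1) as variable for every tensor."""
    axes = {0: "batch_size", 1: "sequence_length"}
    return {name: dict(axes) for name in list(input_names) + list(output_names)}


def export_to_onnx(model, output_path, vocab_size=30522, opset_version=14):
    """Export an encoder that takes (input_ids, attention_mask) to ONNX."""
    import torch  # imported lazily so the helper above works without PyTorch

    model.eval()
    # Dummy inputs only fix dtypes and rank; their values are irrelevant.
    input_ids = torch.randint(0, vocab_size, (1, 8), dtype=torch.long)
    attention_mask = torch.ones((1, 8), dtype=torch.long)
    input_names = ["input_ids", "attention_mask"]    # assumed input names
    output_names = ["last_hidden_state"]             # assumed output name
    torch.onnx.export(
        model,
        (input_ids, attention_mask),
        output_path,
        input_names=input_names,
        output_names=output_names,
        dynamic_axes=build_dynamic_axes(input_names, output_names),
        opset_version=opset_version,
    )
```

Declaring the batch and sequence dimensions as dynamic keeps the exported graph from being tied to the shape of the dummy inputs.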
#### Converting a TensorFlow Model
For TensorFlow models, use the tf2onnx tool to convert them to the ONNX format. The [ONNX Runtime TensorFlow tutorial](https://onnxruntime.ai/docs/tutorials/tf-get-started.html#getting-started-converting-tensorflow-to-onnx) gives detailed guidance on this conversion.

### Creating model config
Before submitting your ONNX model in a PR, place the necessary files in a folder named after the model. The folder should contain the following files:

  - **Model File**: The ONNX model file.
  - **Vocab File**: The vocabulary file required for the model.
  - **Model Config File**: Named config.json, this file should contain the following keys:

| Key | Description | Optional |
|-----|-------------|----------|
|model_md5| MD5 checksum of model file as string| No |
|vocab_md5| MD5 checksum of vocab file as string| No |
|model_type| Model type (currently only ```bert``` and ```xlm_roberta``` supported)| No |
|vocab_file_name| File name of vocab file| No |
|indexing_prefix| Prefix to be added before embedding documents| Yes |
|query_prefix| Prefix to be added before embedding queries | Yes |


Please make sure that the information in the configuration file is accurate and complete before submitting your PR.
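
One way to reduce mistakes is to generate and check the file with a script. The sketch below writes a config.json containing the keys from the table and then re-verifies the checksums. It assumes the checksums are hexadecimal MD5 digests; the function names and the file names `model.onnx` and `vocab.txt` in the usage comment are hypothetical. Because the table has no key for the model file name, the script takes it as an argument.

```python
import hashlib
import json
from pathlib import Path

REQUIRED_KEYS = ("model_md5", "vocab_md5", "model_type", "vocab_file_name")
SUPPORTED_TYPES = ("bert", "xlm_roberta")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_config(model_dir, model_file, vocab_file, model_type,
                 indexing_prefix=None, query_prefix=None):
    """Write config.json next to the model and vocab files and return it."""
    model_dir = Path(model_dir)
    config = {
        "model_md5": md5_of_file(model_dir / model_file),
        "vocab_md5": md5_of_file(model_dir / vocab_file),
        "model_type": model_type,
        "vocab_file_name": vocab_file,
    }
    # Optional keys are written only when a prefix is supplied.
    if indexing_prefix is not None:
        config["indexing_prefix"] = indexing_prefix
    if query_prefix is not None:
        config["query_prefix"] = query_prefix
    (model_dir / "config.json").write_text(json.dumps(config, indent=2) + "\n")
    return config


def verify_config(model_dir, model_file):
    """Return a list of problems found in config.json (empty means none)."""
    model_dir = Path(model_dir)
    config = json.loads((model_dir / "config.json").read_text())
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in config]
    if problems:
        return problems
    if config["model_type"] not in SUPPORTED_TYPES:
        problems.append(f"unsupported model_type: {config['model_type']}")
    if config["model_md5"] != md5_of_file(model_dir / model_file):
        problems.append("model_md5 does not match the model file")
    if config["vocab_md5"] != md5_of_file(model_dir / config["vocab_file_name"]):
        problems.append("vocab_md5 does not match the vocab file")
    return problems


# Hypothetical usage:
# write_config("my-model", "model.onnx", "vocab.txt", "bert")
# print(verify_config("my-model", "model.onnx"))
```

Running the verification again just before opening the PR catches a model or vocab file that was re-exported after the checksums were computed.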

We appreciate your contributions to expand our collection of supported embedding models!