nitsuai michaelfeil committed
Commit 2365abf (0 parents)

Duplicate from michaelfeil/ct2fast-paraphrase-multilingual-MiniLM-L12-v2
.gitattributes ADDED
@@ -0,0 +1,20 @@
+ *.bin.* filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ unigram.json filter=lfs diff=lfs merge=lfs -text
+ .git/lfs/objects/8a/01/8a016203ad4fe42aaad6e9329f70e4ea2ea19d4e14e43f1a36ec140233e604ef filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,177 @@
+ ---
+ pipeline_tag: sentence-similarity
+ language: multilingual
+ license: apache-2.0
+ tags:
+ - ctranslate2
+ - int8
+ - float16
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+ ---
+ # Fast Inference with CTranslate2
+ Speed up inference while reducing memory use by 2x-4x with int8 inference in C++ on CPU or GPU.
+
+ Quantized version of [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2).
+ ```bash
+ pip install "hf-hub-ctranslate2>=2.12.0" "ctranslate2>=3.17.1"
+ ```
+
+ ```python
+ model_name = "michaelfeil/ct2fast-paraphrase-multilingual-MiniLM-L12-v2"
+ model_name_orig = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
+
+ from hf_hub_ctranslate2 import EncoderCT2fromHfHub
+
+ # load in int8 on CUDA
+ model = EncoderCT2fromHfHub(
+     model_name_or_path=model_name,
+     device="cuda",
+     compute_type="int8_float16",
+ )
+ outputs = model.generate(
+     text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
+     max_length=64,
+ )  # perform downstream tasks on outputs
+ outputs["pooler_output"]
+ outputs["last_hidden_state"]
+ outputs["attention_mask"]
+
+ # Alternatively, use the SentenceTransformer mix-in for end-to-end
+ # sentence-embedding generation (uses the original repo,
+ # not this CT2fast-HF repo):
+ from hf_hub_ctranslate2 import CT2SentenceTransformer
+
+ model = CT2SentenceTransformer(
+     model_name_orig, compute_type="int8_float16", device="cuda"
+ )
+ embeddings = model.encode(
+     ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
+     batch_size=32,
+     convert_to_numpy=True,
+     normalize_embeddings=True,
+ )
+ print(embeddings.shape, embeddings)
+ scores = (embeddings @ embeddings.T) * 100
+
+ # Hint: you can also host this model behind a REST API,
+ # e.g. via github.com/michaelfeil/infinity
+ ```
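+
+ The REST-API hint above can be realized in a few lines. A minimal sketch follows; this FastAPI wrapper is illustrative only (it is not part of `hf-hub-ctranslate2` or `infinity`, and the `/embed` endpoint name is made up):
+
+ ```python
+ # Hypothetical embedding server; assumes `pip install fastapi uvicorn`
+ # in addition to the packages above. Run with: uvicorn server:app
+ from fastapi import FastAPI
+ from pydantic import BaseModel
+ from hf_hub_ctranslate2 import CT2SentenceTransformer
+
+ app = FastAPI()
+ model = CT2SentenceTransformer(
+     "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
+     compute_type="int8", device="cpu",
+ )
+
+ class EmbedRequest(BaseModel):
+     texts: list[str]
+
+ @app.post("/embed")
+ def embed(req: EmbedRequest):
+     # encode returns a (batch, 384) numpy array; lists are JSON-serializable
+     emb = model.encode(req.texts, convert_to_numpy=True, normalize_embeddings=True)
+     return {"embeddings": emb.tolist()}
+ ```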
+
+ Checkpoint compatible with [ctranslate2>=3.17.1](https://github.com/OpenNMT/CTranslate2)
+ and [hf-hub-ctranslate2>=2.12.0](https://github.com/michaelfeil/hf-hub-ctranslate2):
+ - `compute_type=int8_float16` for `device="cuda"`
+ - `compute_type=int8` for `device="cpu"` (see the CPU sketch below)
+
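+ A minimal CPU variant of the loader above (same API as the CUDA example; only `device` and `compute_type` change):
+
+ ```python
+ from hf_hub_ctranslate2 import EncoderCT2fromHfHub
+
+ # int8 on CPU; mirrors the CUDA example above
+ model = EncoderCT2fromHfHub(
+     model_name_or_path="michaelfeil/ct2fast-paraphrase-multilingual-MiniLM-L12-v2",
+     device="cpu",
+     compute_type="int8",
+ )
+ outputs = model.generate(text=["I like soccer"], max_length=64)
+ ```
+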
+ Converted on 2023-10-13.
+
+ # License and other remarks:
+ This is just a quantized version. License conditions are intended to be identical to those of the original Hugging Face repo.
+
+ # Original description
+
+ # sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ sentences = ["This is an example sentence", "Each sentence is converted"]
+
+ model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
+ embeddings = model.encode(sentences)
+ print(embeddings)
+ ```
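+
+ Since this card is about sentence similarity, a natural follow-up (a short sketch, not part of the original card) is to score the embeddings pairwise with the library's cosine-similarity helper:
+
+ ```python
+ from sentence_transformers import SentenceTransformer, util
+
+ model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
+ embeddings = model.encode(["I like soccer", "I like tennis"])
+ # 2x2 matrix of pairwise cosine similarities
+ print(util.cos_sim(embeddings, embeddings))
+ ```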
+
+ ## Usage (HuggingFace Transformers)
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean pooling - take the attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Sentences we want sentence embeddings for
+ sentences = ['This is an example sentence', 'Each sentence is converted']
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
+ model = AutoModel.from_pretrained('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2')
+
+ # Tokenize sentences
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
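+
+ If you use these embeddings for cosine similarity or semantic search, it is common to L2-normalize them first. A small continuation of the example above (not in the original card):
+
+ ```python
+ import torch.nn.functional as F
+
+ # after normalization, a plain dot product equals cosine similarity
+ sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
+ print(sentence_embeddings @ sentence_embeddings.T)
+ ```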
+
+ ## Evaluation Results
+
+ For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
+
+
+ ## Full Model Architecture
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+ )
+ ```
+
+ ## Citing & Authors
+
+ This model was trained by [sentence-transformers](https://www.sbert.net/).
+
+ If you find this model helpful, feel free to cite our publication [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084):
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "http://arxiv.org/abs/1908.10084",
+ }
+ ```
config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "_name_or_path": "old_models/paraphrase-multilingual-MiniLM-L12-v2/0_Transformer",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "transformers_version": "4.7.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 250037,
+   "bos_token": "<s>",
+   "eos_token": "</s>",
+   "layer_norm_epsilon": 1e-12,
+   "unk_token": "<unk>"
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.7.0",
+     "pytorch": "1.9.0+cu102"
+   }
+ }
model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6c59f14a4df4ce59c1b82b6438ba6d3fbf5cb744ec3c9d49efab61cec2ea9425
+ size 235315884
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}}
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c3387be76557bd40970cec13153b3bbf80407865484b209e655e5e4729076b8
+ size 9081518
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"do_lower_case": true, "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "tokenize_chinese_chars": true, "strip_accents": null, "bos_token": "<s>", "eos_token": "</s>", "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "old_models/paraphrase-multilingual-MiniLM-L12-v2/0_Transformer"}
unigram.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:71b44701d7efd054205115acfa6ef126c5d2f84bd3affe0c59e48163674d19a6
+ size 14763234
vocabulary.json ADDED
The diff for this file is too large to render. See raw diff
 
vocabulary.txt ADDED
The diff for this file is too large to render. See raw diff