It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [PlanTL-GOB-ES/roberta-base-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/sitges10242608-4ep-rerankv3-sp")
# Run inference
sentences = [
    'Els establiments locals tenen un paper clau en el projecte de la targeta de fidelització, ja que són els que ofereixen descomptes i ofertes especials als consumidors que utilitzen la targeta.',
    'Quin és el paper dels establiments locals en el projecte de la targeta de fidelització?',
    "Quins són els tractaments que beneficien la salut de l'empleat municipal que s'inclouen en l'ajuda?",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0517     |
| cosine_accuracy@3   | 0.125      |
| cosine_accuracy@5   | 0.1875     |
| cosine_accuracy@10  | 0.3858     |
| cosine_precision@1  | 0.0517     |
| cosine_precision@3  | 0.0417     |
| cosine_precision@5  | 0.0375     |
| cosine_precision@10 | 0.0386     |
| cosine_recall@1     | 0.0517     |
| cosine_recall@3     | 0.125      |
| cosine_recall@5     | 0.1875     |
| cosine_recall@10    | 0.3858     |
| cosine_ndcg@10      | 0.1821     |
| cosine_mrr@10       | 0.122      |
| **cosine_map@100**  | **0.1462** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.0453    |
| cosine_accuracy@3   | 0.1142    |
| cosine_accuracy@5   | 0.181     |
| cosine_accuracy@10  | 0.3815    |
| cosine_precision@1  | 0.0453    |
| cosine_precision@3  | 0.0381    |
| cosine_precision@5  | 0.0362    |
| cosine_precision@10 | 0.0381    |
| cosine_recall@1     | 0.0453    |
| cosine_recall@3     | 0.1142    |
| cosine_recall@5     | 0.181     |
| cosine_recall@10    | 0.3815    |
| cosine_ndcg@10      | 0.1753    |
| cosine_mrr@10       | 0.1144    |
| **cosine_map@100**  | **0.139** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0474     |
| cosine_accuracy@3   | 0.1228     |
| cosine_accuracy@5   | 0.2004     |
| cosine_accuracy@10  | 0.3987     |
| cosine_precision@1  | 0.0474     |
| cosine_precision@3  | 0.0409     |
| cosine_precision@5  | 0.0401     |
| cosine_precision@10 | 0.0399     |
| cosine_recall@1     | 0.0474     |
| cosine_recall@3     | 0.1228     |
| cosine_recall@5     | 0.2004     |
| cosine_recall@10    | 0.3987     |
| cosine_ndcg@10      | 0.1851     |
| cosine_mrr@10       | 0.1217     |
| **cosine_map@100**  | **0.1457** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0453     |
| cosine_accuracy@3   | 0.1164     |
| cosine_accuracy@5   | 0.1681     |
| cosine_accuracy@10  | 0.3815     |
| cosine_precision@1  | 0.0453     |
| cosine_precision@3  | 0.0388     |
| cosine_precision@5  | 0.0336     |
| cosine_precision@10 | 0.0381     |
| cosine_recall@1     | 0.0453     |
| cosine_recall@3     | 0.1164     |
| cosine_recall@5     | 0.1681     |
| cosine_recall@10    | 0.3815     |
| cosine_ndcg@10      | 0.1753     |
| cosine_mrr@10       | 0.1146     |
| **cosine_map@100**  | **0.1393** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0388     |
| cosine_accuracy@3   | 0.097      |
| cosine_accuracy@5   | 0.153      |
| cosine_accuracy@10  | 0.347      |
| cosine_precision@1  | 0.0388     |
| cosine_precision@3  | 0.0323     |
| cosine_precision@5  | 0.0306     |
| cosine_precision@10 | 0.0347     |
| cosine_recall@1     | 0.0388     |
| cosine_recall@3     | 0.097      |
| cosine_recall@5     | 0.153      |
| cosine_recall@10    | 0.347      |
| cosine_ndcg@10      | 0.1569     |
| cosine_mrr@10       | 0.1011     |
| **cosine_map@100**  | **0.1268** |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,173 training samples
* Columns: `positive` and `anchor`
* Approximate statistics based on the first 1000 samples:

  |         | positive | anchor |
  |:--------|:---------|:-------|
  | type    | string   | string |
  | details |          |        |

* Samples:

  | positive | anchor |
  |:---------|:-------|
  | L'objectiu principal de la persona coordinadora de colònia felina és garantir el benestar dels animals de la colònia. | Quin és l'objectiu principal de la persona coordinadora de colònia felina? |
  | Es tracta d'una sala amb capacitat per a 125 persones, equipada amb un petit escenari, sistema de sonorització, pantalla per a projeccions, camerins i serveis higiènics (WC). | Quin és el nombre de persones que pot acollir la sala d'actes del Casal Municipal de la Gent Gran de Sitges? |
  | Aquest ajut pretén fomentar l’associacionisme empresarial local, per tal de disposar d’agrupacions, gremis o associacions representatives de l’activitat empresarial del municipi. | Quin és el paper de les empreses en aquest ajut? |

* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:

  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.2
- `bf16`: True
- `tf32`: False
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters
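As a rough reading of the non-default values reported in this card, the effective batch size and an approximate optimizer step count can be worked out as follows. This is a minimal sketch, not output from the training run: it assumes a single device (the card does not state how many were used) and uses ceiling division, so the trainer's own rounding may give slightly different counts.

```python
import math

# Values reported in this card.
train_samples = 4173
per_device_train_batch_size = 16
gradient_accumulation_steps = 16
num_train_epochs = 4
warmup_ratio = 0.2

# Assumption: one device; the card does not say how many were used.
num_devices = 1

# Samples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Approximate counts; the trainer may round differently.
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

Note that gradient accumulation only enlarges the batch per optimizer update. `MultipleNegativesRankingLoss` draws its in-batch negatives from each forward batch, so every anchor is still contrasted against the other 15 positives in its batch of 16, not against 255.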
