---
pipeline_tag: sentence-similarity
language: fr
datasets:
- stsb_multi_mt
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
license: mit
model-index:
- name: sentence-croissant-llm-base by Wissam Siblini
  results:
  - task:
      name: Sentence-Embedding
      type: Text Similarity
    dataset:
      name: Text Similarity fr
      type: stsb_multi_mt
      args: fr
    metrics:
    - name: Test Pearson correlation coefficient
      type: Pearson_correlation_coefficient
      value: xx.xx
---

# Overview

The model [sentence-croissant-llm-base](https://huggingface.co/Wissam42/sentence-croissant-llm-base) generates French text embeddings. It was fine-tuned from the recent pre-trained LLM [croissantllm/CroissantLLMBase](https://huggingface.co/croissantllm/CroissantLLMBase) using the Siamese-BERT strategy implemented in the [sentence-transformers](https://www.sbert.net/) library. The fine-tuning dataset is the French training split of [stsb_multi_mt](https://huggingface.co/datasets/stsb_multi_mt/viewer/fr/train).

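
Embeddings trained on a semantic similarity task are commonly compared with cosine similarity. The sketch below illustrates that comparison in plain Python. The function name `cosine_similarity` and the three-dimensional vectors are hypothetical placeholders chosen for illustration, not output of this model, and the snippet does not claim to reproduce the exact scoring used during fine-tuning.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for sentence embeddings;
# real embeddings have many more dimensions.
emb_cat = [1.0, 2.0, 0.0]
emb_feline = [2.0, 4.0, 0.0]    # same direction as emb_cat
emb_computer = [0.0, 0.0, 3.0]  # orthogonal to emb_cat

sim_close = cosine_similarity(emb_cat, emb_feline)
sim_far = cosine_similarity(emb_cat, emb_computer)
print(sim_close, sim_far)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score close to 0, so a higher value is read as a closer meaning. The same function can be applied to any pair of embedding vectors produced by the model.
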
## Usage (Sentence-Transformers)

Using this model is straightforward once [sentence-transformers](https://www.SBERT.net) is installed:

```
pip install -U sentence-transformers
```

Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer("Wissam42/sentence-croissant-llm-base")
sentences = ["Le chat mange la souris", "Un felin devore un rongeur", "Je travaille sur un ordinateur", "Je developpe sur mon pc"]
# encode() returns one embedding vector per input sentence
embeddings = model.encode(sentences)
print(embeddings)
```

## Citing & Authors

@article{faysse2024croissantllm,
  title={CroissantLLM: A Truly Bilingual French-English Language Model},
  author={Faysse, Manuel and Fernandes, Patrick and Guerreiro, Nuno and Loison, Ant{\'o}nio and Alves, Duarte and Corro, Caio and Boizard, Nicolas and Alves, Jo{\~a}o and Rei, Ricardo and Martins, Pedro and others},
  journal={arXiv preprint arXiv:2402.00786},
  year={2024}
}

@article{reimers2019sentence,
  title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks},
  author={Reimers, Nils and Gurevych, Iryna},
  journal={arXiv preprint arXiv:1908.10084},
  year={2019}
}