BGE m3 Uzbek Legal Matryoshka

This is a sentence-transformers model fine-tuned from BAAI/bge-m3 on a JSON dataset of Uzbek legal question-chunk pairs. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: uz
  • License: apache-2.0
  • Model Size: 568M parameters (F32)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
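The module stack above can be mirrored in a few lines of NumPy: CLS pooling keeps the first token's hidden state, and the final Normalize() module L2-normalizes it, so cosine similarity reduces to a plain dot product. A minimal sketch (the random hidden states are stand-ins, not real model outputs):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the transformer output: (batch, seq_len, hidden) token embeddings.
token_embeddings = rng.normal(size=(3, 16, 1024))

# (1) Pooling with pooling_mode_cls_token=True: keep each sequence's first token.
pooled = token_embeddings[:, 0, :]                      # shape (3, 1024)

# (2) Normalize(): L2-normalize each sentence embedding.
embeddings = pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# With unit-norm vectors, cosine similarity is just the dot product.
similarities = embeddings @ embeddings.T                # shape (3, 3)
print(similarities.shape)  # (3, 3)
```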

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("fitlemon/bge-m3-uz-legal-matryoshka")
# Run inference
sentences = [
    "Intizomiy jazolar qanday qo'llaniladi?",
    'Intizomiy javobgarlik xodim tomonidan intizomiy qilmish (ushbu Kodeks 301-moddasining \nikkinchi qismi) sodir etilganligi uchun yuzaga keladigan va ushbu xodimga nisbatan intizomiy jazo \nchorasi qo‘llanilishida ifodalanadigan yuridik javobgarlikdir. \nIntizomiy javobgarlikning turlari umumiy va maxsus intizomiy javobgarlikdan iboratdir. \nUmumiy intizomiy javobgarlik ushbu Kodeks va ichki mehnat tartibi qoidalari bilan tartibga \nsolinadigan javobgarlik bo‘lib, u xodimga nisbatan ushbu Kodeksning 312-moddasida nazarda \ntutilgan intizomiy jazo choralaridan birini qo‘llashdan iborat va barcha xodimlarga nisbatan tatbiq \netiladi, bundan o‘zi uchun maxsus intizomiy javobgarlik belgilangan shaxslar mustasno. \nMaxsus intizomiy javobgarlik xodimlarning faqat alohida toifalari uchun qonunda, \nshuningdek intizom to‘g‘risidagi ustavlar va nizomlarda nazarda tutilgan hamda xodimga nisbatan tegishli qonunda, intizom haqidagi ustavda va nizomda nazarda tutilgan intizomiy choralarni \nqo‘llashdan iborat javobgarlikdir.',
    'Ushbu Kodeksda nazarda tutilgan asoslardan tashqari, O‘zbekiston Respublikasi hududida \nmehnat faoliyatini amalga oshirish huquqiga doir tasdiqnomaning amal qilish muddati tugatilganligi \nyoki bekor qilinganligi chet el fuqarosi bilan tuzilgan mehnat shartnomasini bekor qilish uchun asos \nbo‘ladi.  \nO‘zbekiston Respublikasi hududida mehnat faoliyatini amalga oshirish huquqiga doir \ntasdiqnomaning amal qilish muddati tugashi munosabati bilan uning amal qilish muddati tugagan \nkunda mehnat shartnomasi bekor qilinishi lozim.  \nO‘zbekiston Respublikasi hududida mehnat faoliyatini amalga oshirish huquqiga doir \ntasdiqnoma bekor qilinganda mehnat shartnomasi ish beruvchi O‘zbekiston Respublikasi Bandlik va \nmehnat munosabatlari vazirligi huzuridagi Tashqi mehnat migratsiyasi agentligining tegishli \nxabarnomasini olgan kunda bekor qilinishi lozim.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
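Because the model was trained with MatryoshkaLoss (see Training Details), the leading dimensions of each embedding carry most of the signal, so embeddings can be truncated and re-normalized for cheaper storage and search. A sketch of the mechanics in NumPy, with random unit vectors standing in for real embeddings (sentence-transformers can also do this for you via the truncate_dim argument to SentenceTransformer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for full 1024-dimensional sentence embeddings (unit-norm).
full = rng.normal(size=(3, 1024))
full /= np.linalg.norm(full, axis=1, keepdims=True)

def truncate(embeddings, dim):
    """Keep the leading `dim` dimensions and re-normalize (Matryoshka style)."""
    cut = embeddings[:, :dim]
    return cut / np.linalg.norm(cut, axis=1, keepdims=True)

small = truncate(full, 256)
print(small.shape)  # (3, 256)

# Cosine scores at the reduced dimensionality: still a plain dot product.
similarities = small @ small.T
```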

Evaluation

Metrics

Information Retrieval

Metric dim_1024 dim_768 dim_512 dim_256 dim_128 dim_64
cosine_accuracy@1 0.4915 0.4839 0.4858 0.4687 0.4668 0.482
cosine_accuracy@3 0.8178 0.8083 0.8027 0.7894 0.7837 0.7552
cosine_accuracy@5 0.8956 0.8918 0.8843 0.8805 0.8653 0.8558
cosine_accuracy@10 0.9469 0.9431 0.9431 0.9374 0.9279 0.9241
cosine_precision@1 0.4915 0.4839 0.4858 0.4687 0.4668 0.482
cosine_precision@3 0.2726 0.2694 0.2676 0.2631 0.2612 0.2517
cosine_precision@5 0.1791 0.1784 0.1769 0.1761 0.1731 0.1712
cosine_precision@10 0.0947 0.0943 0.0943 0.0937 0.0928 0.0924
cosine_recall@1 0.4915 0.4839 0.4858 0.4687 0.4668 0.482
cosine_recall@3 0.8178 0.8083 0.8027 0.7894 0.7837 0.7552
cosine_recall@5 0.8956 0.8918 0.8843 0.8805 0.8653 0.8558
cosine_recall@10 0.9469 0.9431 0.9431 0.9374 0.9279 0.9241
cosine_ndcg@10 0.7326 0.7268 0.7245 0.712 0.7055 0.7055
cosine_mrr@10 0.6621 0.6556 0.6529 0.6382 0.6328 0.6348
cosine_map@100 0.6646 0.6584 0.6555 0.6411 0.6362 0.6383
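Each query in this dataset has exactly one relevant chunk, which is why accuracy@k and recall@k coincide in the table above. Given the 1-based rank at which each query's relevant chunk was retrieved, these metrics reduce to a few lines of Python (the ranks below are hypothetical, for illustration only):

```python
def accuracy_at_k(ranks, k):
    """Fraction of queries whose single relevant chunk appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r if r <= k else 0.0 for r in ranks) / len(ranks)

# Hypothetical ranks for five queries (1 = relevant chunk retrieved first).
ranks = [1, 3, 1, 12, 2]
print(accuracy_at_k(ranks, 10))  # 0.8
print(mrr_at_k(ranks, 10))       # (1 + 1/3 + 1 + 0 + 1/2) / 5 ≈ 0.5667
```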

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 4,737 training samples
  • Columns: question and chunk
  • Approximate statistics based on the first 1000 samples:
    • question: string, min 10 / mean 22.31 / max 45 tokens
    • chunk: string, min 30 / mean 269.53 / max 520 tokens
  • Samples:
    • question: Noqulay tabiiy-iqlim sharoitlaridagi ish uchun mehnatga haq to‘lash koeffitsiyentlari haqida {chapter} va {section}da nima deyiladi?
      chunk: Noqulay tabiiy-iqlim sharoitlaridagi ish uchun mehnatga haq to‘lash koeffitsiyenti ayrim hududlardagi mehnat sharoitlarining xususiyatlari inobatga olingan holda xodimlarga to‘lanadigan kompensatsiya xususiyatiga ega bo‘lgan ustama turidir. Koeffitsiyentlarning eng kam miqdorlari va ularni qo‘llash tartibi O‘zbekiston Respublikasi Vazirlar Mahkamasi tomonidan belgilanadi. 28-bob. Xodim mehnatining xususiyati bilan bog‘liq bo‘lgan mehnatni huquqiy jihatdan tartibga solishning o‘ziga xos xususiyatlari 1-§. Tashkilot rahbarining, uning o‘rinbosarlarining, tashkilot bosh buxgalterining va tashkilot alohida bo‘linmasi rahbarining mehnatini huquqiy jihatdan tartibga solishning o‘ziga xos xususiyatlari
    • question: Homiladorlik yoki farzandlar borligi bilan bog‘liq sabablarga ko‘ra ishga qabul qilmaslik qonunga xilofmi?
      chunk: Ishga qabul qilishni qonunga xilof ravishda rad etishga yo‘l qo‘yilmaydi. Quyidagilar ishga qabul qilishni qonunga xilof ravishda rad etishdir: mehnat va mashg‘ulotlar sohasida kamsitishni taqiqlash to‘g‘risidagi talablarni buzish; ish beruvchi tomonidan ishga taklif etilgan shaxslarni ishga qabul qilmaslik; ish beruvchi qonunga muvofiq mehnat shartnomasini tuzishi shart bo‘lgan shaxslarni (ish o‘rinlarining belgilangan eng kam soni hisobiga ishga yuborilgan shaxslarni, ish beruvchi alohida asoslar bo‘yicha mehnat shartnomasini bekor qilgan shaxslarni, ular qayta ishga qabul qilingan taqdirda va boshqalarni) ishga qabul qilmaslik; homiladorlik yoki farzandlar borligi bilan bog‘liq sabablarga ko‘ra ishga qabul qilmaslik; Oldingi tahrirga qarang. sudlanganligi, shu jumladan tugallangan va olib tashlangan sudlanganligi sababli shaxslarni ishga qabul qilmaslik, bundan qonunchilikda nazarda tutilgan hollar mustasno, yoxud shaxslarni ularning qarindoshlari sudlanganligi, shu j...
    • question: Xizmat safariga yuborish uchun nogironligi bo‘lgan xodimlarning roziligi qanday ahamiyatga ega {chapter} va {section}da?
      chunk: Nogironligi bo‘lgan xodimlarni xizmat safariga yuborishga, tungi ishlarga, ish vaqtidan tashqari ishlarga hamda dam olish va ishlanmaydigan bayram kunlaridagi ishlarga jalb qilishga faqat ularning roziligi bilan, agar ushbu xodimlar uchun bunday ishlar tibbiy-ijtimoiy ekspert komissiyasi tavsiyalarida taqiqlanmagan bo‘lsa, yo‘l qo‘yiladi.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
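MatryoshkaLoss wraps the inner loss and applies it at every truncated dimensionality, summing the weighted results. A toy NumPy sketch of this idea, using in-batch softmax cross-entropy as a stand-in for MultipleNegativesRankingLoss (random vectors replace real question/chunk embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
dims = [1024, 768, 512, 256, 128, 64]
weights = [1, 1, 1, 1, 1, 1]

# Stand-in question/chunk embeddings for a batch of 8 pairs.
q = rng.normal(size=(8, 1024))
c = rng.normal(size=(8, 1024))

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def mnr_loss(q, c, scale=20.0):
    """In-batch cross-entropy: each question's positive is its own chunk;
    every other chunk in the batch acts as a negative."""
    scores = scale * (normalize(q) @ normalize(c).T)     # (8, 8)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Matryoshka: apply the inner loss at each truncation, then sum the weighted terms.
total = sum(w * mnr_loss(q[:, :d], c[:, :d]) for d, w in zip(dims, weights))
print(round(total, 4))
```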
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • fp16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
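These non-default settings map onto a training script roughly as follows. This is a hedged sketch, not the author's actual script: the output path is a placeholder, and the dataset loading and evaluator are omitted.

```python
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import (
    SentenceTransformerTrainingArguments,
    BatchSamplers,
)

model = SentenceTransformer("BAAI/bge-m3")
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner_loss,
                      matryoshka_dims=[1024, 768, 512, 256, 128, 64])

args = SentenceTransformerTrainingArguments(
    output_dir="bge-m3-uz-legal-matryoshka",   # placeholder path
    num_train_epochs=4,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="epoch",
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

# trainer = SentenceTransformerTrainer(model=model, args=args,
#                                      train_dataset=train_dataset,  # json question/chunk pairs
#                                      loss=loss)
# trainer.train()
```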

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_1024_cosine_ndcg@10 dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.0169 10 2.468 - - - - - -
0.0337 20 3.0476 - - - - - -
0.0506 30 2.0878 - - - - - -
0.0675 40 2.1392 - - - - - -
0.0843 50 2.3539 - - - - - -
0.1012 60 2.367 - - - - - -
0.1180 70 1.6277 - - - - - -
0.1349 80 2.0299 - - - - - -
0.1518 90 1.7242 - - - - - -
0.1686 100 1.5173 - - - - - -
0.1855 110 1.6598 - - - - - -
0.2024 120 1.0105 - - - - - -
0.2192 130 1.4555 - - - - - -
0.2361 140 0.3602 - - - - - -
0.2530 150 0.8921 - - - - - -
0.2698 160 0.6716 - - - - - -
0.2867 170 0.7469 - - - - - -
0.3035 180 1.0121 - - - - - -
0.3204 190 0.5142 - - - - - -
0.3373 200 3.1294 - - - - - -
0.3541 210 1.5231 - - - - - -
0.3710 220 0.8671 - - - - - -
0.3879 230 1.8561 - - - - - -
0.4047 240 1.3404 - - - - - -
0.4216 250 1.4122 - - - - - -
0.4384 260 1.1002 - - - - - -
0.4553 270 1.2149 - - - - - -
0.4722 280 2.1969 - - - - - -
0.4890 290 1.0054 - - - - - -
0.5059 300 1.1025 - - - - - -
0.5228 310 0.7316 - - - - - -
0.5396 320 1.648 - - - - - -
0.5565 330 1.0714 - - - - - -
0.5734 340 0.4809 - - - - - -
0.5902 350 0.3814 - - - - - -
0.6071 360 1.6564 - - - - - -
0.6239 370 1.0824 - - - - - -
0.6408 380 0.9652 - - - - - -
0.6577 390 1.2368 - - - - - -
0.6745 400 1.7904 - - - - - -
0.6914 410 0.8324 - - - - - -
0.7083 420 1.655 - - - - - -
0.7251 430 1.1056 - - - - - -
0.7420 440 1.6926 - - - - - -
0.7589 450 1.1129 - - - - - -
0.7757 460 1.3117 - - - - - -
0.7926 470 0.9942 - - - - - -
0.8094 480 1.0258 - - - - - -
0.8263 490 1.5217 - - - - - -
0.8432 500 0.8946 - - - - - -
0.8600 510 0.8951 - - - - - -
0.8769 520 0.5615 - - - - - -
0.8938 530 1.6551 - - - - - -
0.9106 540 0.2336 - - - - - -
0.9275 550 0.9635 - - - - - -
0.9444 560 0.6709 - - - - - -
0.9612 570 0.9109 - - - - - -
0.9781 580 1.7157 - - - - - -
0.9949 590 1.2344 - - - - - -
1.0 593 - 0.6764 0.6677 0.6679 0.6539 0.6370 0.6155
1.0118 600 0.2666 - - - - - -
1.0287 610 0.8451 - - - - - -
1.0455 620 0.7215 - - - - - -
1.0624 630 0.3526 - - - - - -
1.0793 640 0.7853 - - - - - -
1.0961 650 0.8001 - - - - - -
1.1130 660 0.2595 - - - - - -
1.1298 670 0.4533 - - - - - -
1.1467 680 0.5806 - - - - - -
1.1636 690 0.4952 - - - - - -
1.1804 700 0.7964 - - - - - -
1.1973 710 0.7831 - - - - - -
1.2142 720 0.5613 - - - - - -
1.2310 730 0.6759 - - - - - -
1.2479 740 0.4816 - - - - - -
1.2648 750 0.5292 - - - - - -
1.2816 760 0.3936 - - - - - -
1.2985 770 0.7487 - - - - - -
1.3153 780 0.8308 - - - - - -
1.3322 790 0.3518 - - - - - -
1.3491 800 0.8495 - - - - - -
1.3659 810 1.0201 - - - - - -
1.3828 820 0.7711 - - - - - -
1.3997 830 0.6631 - - - - - -
1.4165 840 0.8094 - - - - - -
1.4334 850 0.5915 - - - - - -
1.4503 860 0.689 - - - - - -
1.4671 870 0.3538 - - - - - -
1.4840 880 0.4916 - - - - - -
1.5008 890 1.0626 - - - - - -
1.5177 900 0.7237 - - - - - -
1.5346 910 0.5194 - - - - - -
1.5514 920 0.682 - - - - - -
1.5683 930 0.452 - - - - - -
1.5852 940 0.8517 - - - - - -
1.6020 950 0.3138 - - - - - -
1.6189 960 1.0786 - - - - - -
1.6358 970 0.683 - - - - - -
1.6526 980 0.288 - - - - - -
1.6695 990 0.4779 - - - - - -
1.6863 1000 0.5353 - - - - - -
1.7032 1010 1.0529 - - - - - -
1.7201 1020 0.3482 - - - - - -
1.7369 1030 1.2722 - - - - - -
1.7538 1040 0.2862 - - - - - -
1.7707 1050 0.5556 - - - - - -
1.7875 1060 0.3363 - - - - - -
1.8044 1070 0.3817 - - - - - -
1.8212 1080 0.787 - - - - - -
1.8381 1090 0.8169 - - - - - -
1.8550 1100 0.8241 - - - - - -
1.8718 1110 0.8071 - - - - - -
1.8887 1120 0.7825 - - - - - -
1.9056 1130 0.6786 - - - - - -
1.9224 1140 0.2086 - - - - - -
1.9393 1150 0.8414 - - - - - -
1.9562 1160 0.7762 - - - - - -
1.9730 1170 0.5421 - - - - - -
1.9899 1180 0.2344 - - - - - -
2.0 1186 - 0.7265 0.7258 0.7167 0.7084 0.7002 0.6757
2.0067 1190 0.1232 - - - - - -
2.0236 1200 0.473 - - - - - -
2.0405 1210 0.2913 - - - - - -
2.0573 1220 0.27 - - - - - -
2.0742 1230 0.33 - - - - - -
2.0911 1240 0.3323 - - - - - -
2.1079 1250 0.2355 - - - - - -
2.1248 1260 0.1089 - - - - - -
2.1417 1270 0.245 - - - - - -
2.1585 1280 0.4385 - - - - - -
2.1754 1290 0.3904 - - - - - -
2.1922 1300 0.4299 - - - - - -
2.2091 1310 0.1338 - - - - - -
2.2260 1320 0.2211 - - - - - -
2.2428 1330 0.2363 - - - - - -
2.2597 1340 0.0486 - - - - - -
2.2766 1350 0.1347 - - - - - -
2.2934 1360 0.1469 - - - - - -
2.3103 1370 0.064 - - - - - -
2.3272 1380 0.2582 - - - - - -
2.3440 1390 0.5994 - - - - - -
2.3609 1400 0.4847 - - - - - -
2.3777 1410 0.7184 - - - - - -
2.3946 1420 0.2852 - - - - - -
2.4115 1430 0.4838 - - - - - -
2.4283 1440 0.2932 - - - - - -
2.4452 1450 0.2452 - - - - - -
2.4621 1460 0.3531 - - - - - -
2.4789 1470 0.2666 - - - - - -
2.4958 1480 0.2835 - - - - - -
2.5126 1490 0.4196 - - - - - -
2.5295 1500 0.2563 - - - - - -
2.5464 1510 0.242 - - - - - -
2.5632 1520 0.4055 - - - - - -
2.5801 1530 0.489 - - - - - -
2.5970 1540 0.055 - - - - - -
2.6138 1550 0.6144 - - - - - -
2.6307 1560 0.9092 - - - - - -
2.6476 1570 0.6883 - - - - - -
2.6644 1580 0.4246 - - - - - -
2.6813 1590 0.317 - - - - - -
2.6981 1600 0.134 - - - - - -
2.7150 1610 0.2629 - - - - - -
2.7319 1620 0.3845 - - - - - -
2.7487 1630 0.4989 - - - - - -
2.7656 1640 0.5606 - - - - - -
2.7825 1650 0.0395 - - - - - -
2.7993 1660 0.2427 - - - - - -
2.8162 1670 0.1805 - - - - - -
2.8331 1680 0.1047 - - - - - -
2.8499 1690 0.717 - - - - - -
2.8668 1700 0.2244 - - - - - -
2.8836 1710 0.202 - - - - - -
2.9005 1720 0.2982 - - - - - -
2.9174 1730 0.1291 - - - - - -
2.9342 1740 0.3133 - - - - - -
2.9511 1750 0.1415 - - - - - -
2.9680 1760 0.2754 - - - - - -
2.9848 1770 0.5691 - - - - - -
3.0 1779 - 0.7298 0.7167 0.721 0.708 0.7126 0.692
3.0017 1780 0.0698 - - - - - -
3.0185 1790 0.3206 - - - - - -
3.0354 1800 0.3665 - - - - - -
3.0523 1810 0.0085 - - - - - -
3.0691 1820 0.2066 - - - - - -
3.0860 1830 0.3554 - - - - - -
3.1029 1840 0.2967 - - - - - -
3.1197 1850 0.0984 - - - - - -
3.1366 1860 0.4303 - - - - - -
3.1535 1870 0.1165 - - - - - -
3.1703 1880 0.1966 - - - - - -
3.1872 1890 0.1865 - - - - - -
3.2040 1900 0.386 - - - - - -
3.2209 1910 0.1836 - - - - - -
3.2378 1920 0.2119 - - - - - -
3.2546 1930 0.0979 - - - - - -
3.2715 1940 0.286 - - - - - -
3.2884 1950 0.1315 - - - - - -
3.3052 1960 0.32 - - - - - -
3.3221 1970 0.5843 - - - - - -
3.3390 1980 0.201 - - - - - -
3.3558 1990 0.3161 - - - - - -
3.3727 2000 0.1855 - - - - - -
3.3895 2010 0.0993 - - - - - -
3.4064 2020 0.2922 - - - - - -
3.4233 2030 0.3549 - - - - - -
3.4401 2040 0.0385 - - - - - -
3.4570 2050 0.3567 - - - - - -
3.4739 2060 0.2036 - - - - - -
3.4907 2070 0.666 - - - - - -
3.5076 2080 0.127 - - - - - -
3.5245 2090 0.1066 - - - - - -
3.5413 2100 0.1094 - - - - - -
3.5582 2110 0.0989 - - - - - -
3.5750 2120 0.1002 - - - - - -
3.5919 2130 0.0959 - - - - - -
3.6088 2140 0.479 - - - - - -
3.6256 2150 0.2854 - - - - - -
3.6425 2160 0.3548 - - - - - -
3.6594 2170 0.2801 - - - - - -
3.6762 2180 0.2012 - - - - - -
3.6931 2190 0.3343 - - - - - -
3.7099 2200 0.4601 - - - - - -
3.7268 2210 0.1198 - - - - - -
3.7437 2220 0.152 - - - - - -
3.7605 2230 0.0899 - - - - - -
3.7774 2240 0.2245 - - - - - -
3.7943 2250 0.4322 - - - - - -
3.8111 2260 0.1466 - - - - - -
3.8280 2270 0.2181 - - - - - -
3.8449 2280 0.441 - - - - - -
3.8617 2290 0.4819 - - - - - -
3.8786 2300 0.3004 - - - - - -
3.8954 2310 0.1952 - - - - - -
3.9123 2320 0.2417 - - - - - -
3.9292 2330 0.4047 - - - - - -
3.9460 2340 0.2326 - - - - - -
3.9629 2350 0.1564 - - - - - -
3.9798 2360 0.1566 - - - - - -
3.9966 2370 0.1281 - - - - - -
4.0 2372 - 0.7326 0.7268 0.7245 0.7120 0.7055 0.7055
  • The final row (epoch 4.0, step 2372) denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}