Cat Embeddings

A set of embedding models trained to study how embedding quality varies with model architecture (width vs. depth) under a size constraint (at most 12M parameters).

  • cat-emb-2-128: 2 layers / hidden size 128 / 4.4M params
  • cat-emb-4-128: 4 layers / hidden size 128 / 4.8M params
  • cat-emb-8-128: 8 layers / hidden size 128 / 5.6M params
  • cat-emb-12-128: 12 layers / hidden size 128 / 6.4M params
  • cat-emb-2-256: 2 layers / hidden size 256 / 9.7M params
  • cat-emb-4-256: 4 layers / hidden size 256 / 11.3M params
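The listed sizes make it possible to estimate how much each extra layer costs at a given width. The sketch below is only arithmetic on the numbers in the list. The `12 * H^2` comparison is an assumption (a standard Transformer encoder layer with a 4x feed-forward block, ignoring biases and layer norms); the card does not state the layer layout, and the function names are hypothetical.

```python
# Listed totals in millions of parameters, keyed by (layers, hidden size).
listed = {
    (2, 128): 4.4, (4, 128): 4.8, (8, 128): 5.6, (12, 128): 6.4,
    (2, 256): 9.7, (4, 256): 11.3,
}

def per_layer_millions(hidden):
    """Parameters added per layer, from the shallowest and deepest listed model."""
    rows = sorted((layers, params) for (layers, h), params in listed.items() if h == hidden)
    (l0, p0), (l1, p1) = rows[0], rows[-1]
    return (p1 - p0) / (l1 - l0)

def rule_of_thumb_millions(hidden):
    # Assumption: 4*H^2 attention weights plus 8*H^2 feed-forward weights per layer.
    return 12 * hidden * hidden / 1e6

for hidden in (128, 256):
    slope = per_layer_millions(hidden)
    # Whatever is left after removing the layers (embeddings and other weights).
    non_layer = listed[(2, hidden)] - 2 * slope
    print(f"H={hidden}: {slope:.2f}M per layer "
          f"(rule of thumb {rule_of_thumb_millions(hidden):.2f}M), "
          f"{non_layer:.1f}M outside the layers")
```

With the listed numbers this works out to roughly 0.2M parameters per layer at hidden size 128 and 0.8M at hidden size 256, leaving about 4.0M and 8.1M outside the encoder layers. Under that reading, most of the budget in the smaller models sits outside the layers, which is why depth is cheap at width 128.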

Training

  • Stage 1: sequence length 192, batch size 2048, 50k steps, sentence pairs.
  • Stage 2: sequence length 512, batch size 64, 5k steps, sentence triplets.
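To put the two stages on the same scale, the schedule can be written down as data and the number of training examples derived from it. This is a sketch under stated assumptions: the batch size is taken to count pairs or triplets (not individual sentences), there is no gradient accumulation, and "seq" is the maximum token length per text. None of that is confirmed by the card, and the `Stage` class is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    name: str
    seq_len: int            # assumed: max tokens per text
    batch_size: int         # assumed: examples (pairs or triplets) per step
    steps: int
    texts_per_example: int  # 2 for pairs, 3 for triplets

    @property
    def examples_seen(self) -> int:
        return self.batch_size * self.steps

    @property
    def max_tokens_per_batch(self) -> int:
        # Upper bound: every text padded or truncated to seq_len.
        return self.batch_size * self.texts_per_example * self.seq_len

stages = [
    Stage("stage 1", seq_len=192, batch_size=2048, steps=50_000, texts_per_example=2),
    Stage("stage 2", seq_len=512, batch_size=64, steps=5_000, texts_per_example=3),
]

for s in stages:
    print(f"{s.name}: {s.examples_seen:,} examples, "
          f"up to {s.max_tokens_per_batch:,} tokens per batch")
```

Under these assumptions, stage 1 covers 102,400,000 sentence pairs and stage 2 covers 320,000 triplets, so stage 2 reads as a short, long-sequence refinement on top of a much larger first pass.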

Performance

| MRL dim \ Task | BIOSSES | SICK-R | STS12 | STS13 | STS14 | STS15 | STS16 | STSB | SummEval |
|---|---|---|---|---|---|---|---|---|---|
| 128 | 0.7107 | 0.7126 | 0.6815 | 0.7343 | 0.7038 | 0.8163 | 0.7495 | 0.7652 | 0.2958 |
| 64 | 0.713 | 0.7123 | 0.6829 | 0.7348 | 0.7008 | 0.813 | 0.7475 | 0.7609 | 0.2861 |
| 32 | 0.6714 | 0.7094 | 0.6847 | 0.7345 | 0.6911 | 0.7989 | 0.7385 | 0.7545 | 0.3106 |
| 16 | 0.6637 | 0.697 | 0.669 | 0.7096 | 0.6665 | 0.7589 | 0.7183 | 0.7307 | 0.3164 |
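The rows above vary the embedding dimension. Assuming "MRL dim" refers to Matryoshka-style truncation, where a shorter embedding is simply a prefix of the full vector, the smaller sizes can be obtained by slicing and re-normalizing. The sketch below uses toy vectors rather than real model output, and the helper names are hypothetical; the card does not document how the model is loaded or called.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 8-dimensional "embeddings" standing in for real 128-dimensional output.
emb_a = [0.9, 0.1, -0.3, 0.4, 0.05, -0.02, 0.01, 0.03]
emb_b = [0.8, 0.2, -0.1, 0.5, -0.04, 0.06, 0.02, -0.01]

for dim in (8, 4, 2):
    a = truncate_embedding(emb_a, dim)
    b = truncate_embedding(emb_b, dim)
    print(f"dim={dim}: cosine={cosine(a, b):.4f}")
```

Re-normalizing after slicing matters when the downstream index uses dot products, since a prefix of a unit vector is generally no longer unit length.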