SetFit with BAAI/bge-large-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-large-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
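
To make the contrastive step more concrete, the sketch below shows one way labeled examples can be turned into sentence pairs with a similarity target (1.0 for the same label, 0.0 for different labels), which is the kind of input a loss such as CosineSimilarityLoss consumes. It is an illustration only, not SetFit's actual pair sampler; the function name `make_contrastive_pairs` and the alternating positive/negative scheme are assumptions made for this example. The sample texts are taken from the label table in this card.

```python
import itertools
import random


def make_contrastive_pairs(texts, labels, num_pairs, seed=42):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(range(len(texts)), 2))
    positives = [(i, j) for i, j in all_pairs if labels[i] == labels[j]]
    negatives = [(i, j) for i, j in all_pairs if labels[i] != labels[j]]

    pairs = []
    for k in range(num_pairs):
        # Alternate between positive and negative pairs, sampling with
        # replacement so the smaller pool is reused as often as needed.
        pool, target = (positives, 1.0) if k % 2 == 0 else (negatives, 0.0)
        i, j = rng.choice(pool)
        pairs.append((texts[i], texts[j], target))
    return pairs


texts = [
    "Show me median Intangible Assets",
    "Can I have sum Cost_Entertainment?",
    "Join data_asset_001_ta with data_asset_kpi_cf.",
    "I'd rather not filter this dataset.",
]
labels = ["Aggregation", "Aggregation", "Tablejoin", "Rejection"]

pairs = make_contrastive_pairs(texts, labels, num_pairs=6)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Because pairs are built from combinations of examples, even a few hundred labeled sentences yield a very large number of training pairs, which is why the approach works in few-shot settings.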

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-large-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 7

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
Aggregation	'Show me median Intangible Assets' 'Can I have sum Cost_Entertainment?' 'Get me min RevenueVariance_Actual_vs_Forecast.'
Lookup_1	'Show me data_asset_kpi_cf details.' 'Retrieve data_asset_kpi_cf details.' 'Show M&A deal size by sector.'
Viewtables	'What tables are included in the starhub_data_asset database that are required for performing a basic data analysis?' 'What is the full list of tables available for use in queries within the starhub_data_asset database?' 'What are the table names within the starhub_data_asset database that enable data analysis of customer feedback?'
Tablejoin	'Is it possible to merge the Employees and Orders tables to see which employee handled each order?' 'Join data_asset_001_ta with data_asset_kpi_cf.' 'How can I connect the Customers and Orders tables to find customers who made purchases during a specific promotion?'
Lookup	'Filter by customers who have placed more than 3 orders and get me their email addresses.' "Filter by customers in the city 'New York' and show me their phone numbers." "Can you filter by employees who work in the 'Research' department?"
Generalreply	"Oh, I just stepped outside and it's actually quite lovely! The sun is shining and there's a light breeze. How about you?" "One of my short-term goals is to learn a new skill, like coding or cooking. I also want to save up enough money for a weekend trip with friends. How about you, any short-term goals you're working towards?" 'Hey! My day is going pretty well, thanks for asking. How about yours?'
Rejection	'I have no interest in generating more data.' "I don't want to engage in filtering operations." "I'd rather not filter this dataset."

Evaluation

Metrics

Label	Accuracy
all	0.9818

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("nazhan/bge-large-en-v1.5-brahmaputra-iter-10-3rd")
# Run inference on a single string; the result is the predicted label
preds = model("what do you think it is?")
print(preds)
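
For more than one query, a small wrapper around `model.predict` can keep each input next to its predicted label. The sketch below assumes the loaded model returns one label per input string and that labels can be rendered with `str`; the helper name `classify_batch` is hypothetical. It takes the model as an argument, so it can be exercised with any object that exposes a `predict` method.

```python
def classify_batch(model, texts):
    """Return (text, predicted_label) pairs for a list of input strings.

    `model` is expected to expose predict(list_of_str), as SetFitModel does.
    Hypothetical helper for illustration.
    """
    predictions = model.predict(texts)
    return [(text, str(label)) for text, label in zip(texts, predictions)]


# Usage with the real model (requires `pip install setfit` and network access):
#
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained(
#       "nazhan/bge-large-en-v1.5-brahmaputra-iter-10-3rd"
#   )
#   queries = [
#       "Show me median Intangible Assets",
#       "Join data_asset_001_ta with data_asset_kpi_cf.",
#   ]
#   for text, label in classify_batch(model, queries):
#       print(f"{label}: {text}")
```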

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	8.7137	62

Label	Training Sample Count
Tablejoin	128
Rejection	73
Aggregation	222
Lookup	55
Generalreply	75
Viewtables	76
Lookup_1	157
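
The counts above are not evenly distributed across labels. The short script below, which only restates the table as a dictionary, computes the total number of training samples, each label's share, and the ratio between the largest and smallest class. This may be useful context when reading the single aggregate accuracy figure reported in this card.

```python
# Training sample counts copied from the table above.
train_counts = {
    "Tablejoin": 128,
    "Rejection": 73,
    "Aggregation": 222,
    "Lookup": 55,
    "Generalreply": 75,
    "Viewtables": 76,
    "Lookup_1": 157,
}

total = sum(train_counts.values())
shares = {label: count / total for label, count in train_counts.items()}

largest = max(train_counts, key=train_counts.get)
smallest = min(train_counts, key=train_counts.get)
imbalance_ratio = train_counts[largest] / train_counts[smallest]

# Print labels from most to least frequent with their share of the training set.
for label, count in sorted(train_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<13}{count:>4}  {shares[label]:.1%}")

print(f"total samples: {total}")
print(f"largest/smallest class ratio ({largest}/{smallest}): {imbalance_ratio:.2f}")
```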

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: 2450
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
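
As a rough guide to reproducing this configuration, the sketch below collects the scalar and tuple hyperparameters listed above and passes them to SetFit's `TrainingArguments`. In SetFit, tuple values such as `batch_size` and `num_epochs` apply to the embedding fine-tuning phase and the classifier training phase respectively. The names `HYPERPARAMETERS` and `build_training_arguments` are hypothetical. `loss` and `distance_metric` are omitted because they take Python objects rather than strings, and `load_best_model_at_end` is omitted because it likely depends on evaluation and save settings that this card does not list.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMETERS = {
    "batch_size": (16, 16),              # (embedding phase, classifier phase)
    "num_epochs": (1, 1),                # (embedding phase, classifier phase)
    "max_steps": 2450,
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
}


def build_training_arguments():
    """Build a setfit.TrainingArguments object from the values above.

    The import is local so this file can be read and run without SetFit
    installed; calling the function requires `pip install setfit`.
    """
    from setfit import TrainingArguments

    return TrainingArguments(**HYPERPARAMETERS)
```

The resulting object would then be passed as `args` to a `setfit.Trainer` together with a model and a training dataset; the dataset used for this model is not described in the card.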

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0000	1	0.2001	-
0.0022	50	0.1566	-
0.0045	100	0.0816	-
0.0067	150	0.0733	-
0.0089	200	0.0075	-
0.0112	250	0.0059	-
0.0134	300	0.0035	-
0.0156	350	0.0034	-
0.0179	400	0.0019	-
0.0201	450	0.0015	-
0.0223	500	0.0021	-
0.0246	550	0.003	-
0.0268	600	0.0021	-
0.0290	650	0.0011	-
0.0313	700	0.0015	-
0.0335	750	0.0011	-
0.0357	800	0.001	-
0.0380	850	0.001	-
0.0402	900	0.0012	-
0.0424	950	0.0012	-
0.0447	1000	0.0011	-
0.0469	1050	0.0008	-
0.0491	1100	0.0009	-
0.0514	1150	0.001	-
0.0536	1200	0.0008	-
0.0558	1250	0.0011	-
0.0581	1300	0.0009	-
0.0603	1350	0.001	-
0.0625	1400	0.0007	-
0.0647	1450	0.0008	-
0.0670	1500	0.0007	-
0.0692	1550	0.001	-
0.0714	1600	0.0007	-
0.0737	1650	0.0007	-
0.0759	1700	0.0006	-
0.0781	1750	0.0008	-
0.0804	1800	0.0006	-
0.0826	1850	0.0005	-
0.0848	1900	0.0006	-
0.0871	1950	0.0005	-
0.0893	2000	0.0007	-
0.0915	2050	0.0005	-
0.0938	2100	0.0006	-
0.0960	2150	0.0007	-
0.0982	2200	0.0005	-
0.1005	2250	0.0008	-
0.1027	2300	0.0005	-
0.1049	2350	0.0008	-
0.1072	2400	0.0007	-
0.1094	2450	0.0007	0.0094

The bold row denotes the saved checkpoint.
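
The Epoch column reaches only 0.1094 at step 2450, which matches max_steps: 2450 in the hyperparameters. This suggests that training stopped at the step cap after roughly 11% of one epoch of generated sentence pairs. The sketch below estimates the implied number of steps per epoch from the last logged row. It assumes that the Epoch column is the fraction of one embedding fine-tuning epoch completed and that each step consumes one batch of 16 pairs; since the logged epoch values are rounded to four decimals, the result is approximate.

```python
# (epoch, step) taken from the last row of the table above.
last_epoch, last_step = 0.1094, 2450
batch_size = 16  # embedding-phase batch size from the hyperparameters

# If epoch = step / steps_per_epoch, then steps_per_epoch = step / epoch.
steps_per_epoch = last_step / last_epoch

# Number of sentence pairs processed before training stopped.
pairs_seen = last_step * batch_size

# Approximate number of pairs that a full epoch would have covered.
pairs_per_epoch = steps_per_epoch * batch_size

print(f"estimated steps per epoch: {steps_per_epoch:,.0f}")
print(f"pairs seen in {last_step} steps: {pairs_seen:,}")
print(f"estimated pairs per epoch: {pairs_per_epoch:,.0f}")
print(f"fraction of epoch completed: {last_epoch:.1%}")
```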

Framework Versions

Python: 3.11.9
SetFit: 1.0.3
Sentence Transformers: 2.7.0
Transformers: 4.42.4
PyTorch: 2.4.0+cu121
Datasets: 2.21.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
