SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification, trained on the rbojja/labelled_bank_support_dataset dataset. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
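
As a rough illustration of the first step, the sketch below builds labeled sentence pairs of the kind a contrastive objective consumes: pairs that share a class label get a target of 1.0, and pairs with different labels get 0.0. The `make_pairs` helper and the toy label ids are hypothetical and written only for this example; SetFit's own pair sampler is more elaborate.

```python
from itertools import combinations

def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not SetFit's actual sampler.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs

# Toy inputs: the label ids here are made up for the example.
texts = [
    "Can you check the status of my payment?",
    "Has my payment been processed yet?",
    "Thank you for the quick help!",
    "I appreciate the security alerts.",
]
labels = [10, 10, 9, 9]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
print(len(pairs), len(positives))
```

With only a handful of labeled sentences, the number of pairs grows quadratically, which is part of why this kind of training can work in a few-shot setting.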

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-small-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 10
Training Dataset: rbojja/labelled_bank_support_dataset

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1	'Can you explain the differences between fixed and variable interest rates for personal loans?' 'Can you clarify why I was charged a fee for my account this month?' 'How can I access your customer service features if I need assistance?'
2	'Could you please verify that my deposit cancellation has been completed?' "I've noticed a discrepancy in my balance after my latest deposit. Can you confirm if it was processed correctly?" 'Could you verify that my recent payment has been reversed successfully?'
3	"How do changes in the central bank's interest rates affect the interest I earn on my savings account?" 'Can I share my success story about earning rewards for referring friends? The incentives really helped me and I think others should know!' 'After my recent loan denial, how can I strengthen my reapplication to improve my approval odds?'
10	"I just made a payment, but I'm not sure if it's been processed yet. Can you check for me?" 'I think there might be an error with my account balance; can you show me my most recent transactions?' 'I expected my payment to be completed by now, can you check the status for me?'
9	'Can you please pass on my thanks to the customer service representative who helped me with my account security concerns? Their support made all the difference!' 'I just wanted to say thank you for the quick help with my issue!' 'I want to express my appreciation for the security alerts I received. It’s nice to know my account is being protected so well!'
4	"I'm ending my account with you; can you provide a summary of my final transaction?"
7	'Can I leave a review about my downgrade process? I have some suggestions for improvement.' 'Can you send me a notification of all my completed payments for this month?' 'Can you tell me how I can benefit from any investment opportunities with your bank?'
6	'I just checked my banking statement and I don’t recognize this last charge. How can I contest that?' "I believe I've been incorrectly charged for a subscription service. What steps do I take?" 'I appreciate the service you provide, but the support response time during the outage was unacceptable.'
0	'I’m sorry, but I need to check on a transaction that was denied because of insufficient funds. Can you help me resolve this situation?' "I'm really sorry, but I still can't access my account even after resetting my password. What should I do next?" "I've been mistakenly locked out of my account and I feel bad about it. Can you assist me in regaining access?"
8	'I recently used your services and I’m really satisfied. Is there a way to share my thoughts on my overall banking experience?' 'I think it would be great if final account statements could be sent out at the beginning of the month instead of the end. It allows for better planning and review.' 'I wanted to share that I found the loan application submission very straightforward. When can I expect to hear back about my approval?'

Evaluation

Metrics

Label	Accuracy
all	0.8641
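
For reference, accuracy here is taken to mean the fraction of examples whose predicted label matches the reference label. The function below is a minimal sketch of that definition with made-up labels; it is not the evaluation script used to produce the reported number.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Made-up labels, only to exercise the function.
y_true = [1, 1, 9, 10, 2]
y_pred = [1, 1, 9, 1, 2]
print(accuracy(y_true, y_pred))
```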

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("rbojja/ft-intent-bank")
# Run inference
preds = model("I need to know the outstanding amount on my education loan.")
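
To classify several messages at once, one option is a small wrapper around the model's `predict` method. The `classify_batch` helper below is a hypothetical convenience function, not part of SetFit; it assumes the model returns one integer-like label per input text, as this model's numeric labels suggest.

```python
def classify_batch(model, texts):
    """Pair each input text with the integer label predicted for it.

    `model` is expected to expose a `predict(list_of_texts)` method, as a
    loaded SetFitModel does. Hypothetical helper for illustration.
    """
    preds = model.predict(texts)
    return [(text, int(pred)) for text, pred in zip(texts, preds)]

# Usage with the model loaded above (requires setfit and network access):
# results = classify_batch(model, [
#     "I need to know the outstanding amount on my education loan.",
#     "Thank you for resolving my issue so quickly!",
# ])
```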

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	7	15.322	31

Label	Training Sample Count
0	7
1	797
2	29
3	18
4	1
6	15
7	7
8	6
9	63
10	57
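
The counts above are heavily skewed toward label 1. The short sketch below reuses only the numbers from the table to quantify that skew; the variable names are illustrative.

```python
# Training sample counts copied from the table above.
counts = {0: 7, 1: 797, 2: 29, 3: 18, 4: 1, 6: 15, 7: 7, 8: 6, 9: 63, 10: 57}

total = sum(counts.values())
majority_label = max(counts, key=counts.get)
majority_share = counts[majority_label] / total
imbalance_ratio = max(counts.values()) / min(counts.values())

print(f"total samples: {total}")
print(f"majority label: {majority_label} ({majority_share:.1%} of training data)")
print(f"largest/smallest class ratio: {imbalance_ratio:.0f}")
```

A skew of this size may be one reason to use a sampling strategy such as the oversampling setting listed in the training hyperparameters.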

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 16)
max_steps: 3450
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.2179	-
0.0145	50	0.2598	-
0.0290	100	0.2349	-
0.0435	150	0.2019	-
0.0580	200	0.1686	-
0.0725	250	0.1375	-
0.0870	300	0.1265	-
0.1014	350	0.0954	-
0.1159	400	0.0794	-
0.1304	450	0.065	-
0.1449	500	0.0731	-
0.1594	550	0.0547	-
0.1739	600	0.043	-
0.1884	650	0.0327	-
0.2029	700	0.027	-
0.2174	750	0.0285	-
0.2319	800	0.0201	-
0.2464	850	0.0151	-
0.2609	900	0.0131	-
0.2754	950	0.0076	-
0.2899	1000	0.0147	-
0.3043	1050	0.0122	-
0.3188	1100	0.0109	-
0.3333	1150	0.0126	-
0.3478	1200	0.0108	-
0.3623	1250	0.009	-
0.3768	1300	0.0072	-
0.3913	1350	0.0051	-
0.4058	1400	0.0057	-
0.4203	1450	0.0056	-
0.4348	1500	0.0079	-
0.4493	1550	0.0076	-
0.4638	1600	0.0029	-
0.4783	1650	0.0039	-
0.4928	1700	0.003	-
0.5072	1750	0.0037	-
0.5217	1800	0.0022	-
0.5362	1850	0.0032	-
0.5507	1900	0.0034	-
0.5652	1950	0.006	-
0.5797	2000	0.0046	-
0.5942	2050	0.0026	-
0.6087	2100	0.0031	-
0.6232	2150	0.0041	-
0.6377	2200	0.0049	-
0.6522	2250	0.0015	-
0.6667	2300	0.0053	-
0.6812	2350	0.0033	-
0.6957	2400	0.0055	-
0.7101	2450	0.0044	-
0.7246	2500	0.0036	-
0.7391	2550	0.0038	-
0.7536	2600	0.0038	-
0.7681	2650	0.0027	-
0.7826	2700	0.0028	-
0.7971	2750	0.0038	-
0.8116	2800	0.0033	-
0.8261	2850	0.0035	-
0.8406	2900	0.002	-
0.8551	2950	0.0034	-
0.8696	3000	0.0053	-
0.8841	3050	0.0035	-
0.8986	3100	0.0016	-
0.9130	3150	0.0021	-
0.9275	3200	0.0021	-
0.9420	3250	0.005	-
0.9565	3300	0.0031	-
0.9710	3350	0.0038	-
0.9855	3400	0.0029	-
1.0	3450	0.0019	-
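
The Epoch column appears consistent with step divided by max_steps, rounded to four decimals. The sketch below checks that reading against a few rows and estimates how many sentence pairs the embedding phase processed. It assumes that one epoch spans exactly 3450 steps and that every batch is full; neither assumption is stated in the card.

```python
MAX_STEPS = 3450   # max_steps from the hyperparameters
BATCH_SIZE = 16    # embedding-phase batch size, first element of batch_size

def epoch_fraction(step, max_steps=MAX_STEPS):
    """Fraction of the epoch completed at a given step, rounded as in the table."""
    return round(step / max_steps, 4)

# Pairs processed in the embedding phase, assuming every batch is full.
pairs_seen = MAX_STEPS * BATCH_SIZE

for step in (1, 50, 350, 3450):
    print(step, epoch_fraction(step))
print("pairs seen:", pairs_seen)
```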

Framework Versions

Python: 3.10.12
SetFit: 1.1.1
Sentence Transformers: 3.3.1
Transformers: 4.47.1
PyTorch: 2.5.1+cu124
Datasets: 3.2.0
Tokenizers: 0.21.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
