---
license: other
license_name: govtech-singapore
license_link: LICENSE
datasets:
- gabrielchua/off-topic
language:
- en
metrics:
- roc_auc
- f1
- precision
- recall
base_model:
- jinaai/jina-embeddings-v2-small-en
---

# Off-Topic Classification Model

This repository contains a fine-tuned **Jina Embeddings model** for binary classification. Given a system prompt and a user prompt, the model predicts whether the user prompt is **off-topic** relative to the intended purpose defined in the system prompt.

## Model Highlights

- **Base Model**: [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en)
- **Maximum Context Length**: 1024 tokens
- **Task**: Binary classification (on-topic/off-topic)

## Performance

We evaluated our fine-tuned models on synthetic data modelling system and user prompt pairs that reflect real-world enterprise use cases of LLMs. The dataset is available [here](https://huggingface.co/datasets/gabrielchua/off-topic).

| Approach | Model | ROC-AUC | F1 | Precision | Recall |
|---------------------------------------|--------------------------------|---------|------|-----------|--------|
| [Fine-tuned bi-encoder classifier](https://huggingface.co/govtech/jina-embeddings-v2-small-en-off-topic) | jina-embeddings-v2-small-en | 0.99 | 0.97 | 0.99 | 0.95 |
| 👉 [Fine-tuned cross-encoder classifier](https://huggingface.co/govtech/stsb-roberta-base-off-topic) | stsb-roberta-base | 0.99 | 0.99 | 0.99 | 0.99 |
| Pre-trained cross-encoder | stsb-roberta-base | 0.73 | 0.68 | 0.53 | 0.93 |
| Prompt Engineering | GPT 4o (2024-08-06) | - | 0.95 | 0.94 | 0.97 |
| Prompt Engineering | GPT 4o Mini (2024-07-18) | - | 0.91 | 0.85 | 0.91 |
| Zero-shot Classification | GPT 4o Mini (2024-07-18) | 0.99 | 0.97 | 0.95 | 0.99 |

Further evaluation results on additional synthetic and external datasets (e.g., `JailbreakBench`, `HarmBench`, `TrustLLM`) are available in our [technical report](https://arxiv.org/abs/2411.12946).

## Usage

1. Clone this repository and install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Run the model using one of two options. Both scripts take a single argument: a JSON array of `[system prompt, user prompt]` pairs.

   **Option 1**: Using `inference_onnx.py` with the ONNX model.

   ```bash
   python inference_onnx.py '[
       ["System prompt example 1", "User prompt example 1"],
       ["System prompt example 2", "User prompt example 2"]
   ]'
   ```

   **Option 2**: Using `inference_safetensors.py` with PyTorch and SafeTensors.

   ```bash
   python inference_safetensors.py '[
       ["System prompt example 1", "User prompt example 1"],
       ["System prompt example 2", "User prompt example 2"]
   ]'
   ```

Read more about this model in our [technical report](https://arxiv.org/abs/2411.12946).
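
Hand-writing the JSON argument is error-prone, since a single missing quote makes the payload invalid. The sketch below shows one way to build the argument programmatically. `build_inference_command` is a hypothetical helper, not part of this repository; it assumes only what the commands above show, namely that each script takes one JSON array of `[system prompt, user prompt]` pairs as its sole argument.

```python
import json
import shlex


def build_inference_command(pairs, script="inference_onnx.py"):
    """Build the argv list for one of the inference scripts.

    Hypothetical helper (not shipped with the repository). `pairs` is a
    sequence of (system_prompt, user_prompt) string pairs.
    """
    payload = []
    for pair in pairs:
        # Each entry must be exactly two strings: system prompt, then user prompt.
        if (
            not isinstance(pair, (list, tuple))
            or len(pair) != 2
            or not all(isinstance(text, str) for text in pair)
        ):
            raise ValueError(f"expected a (system, user) pair of strings, got {pair!r}")
        payload.append(list(pair))
    # json.dumps handles quoting and escaping inside the prompts.
    return ["python", script, json.dumps(payload)]


pairs = [
    ("System prompt example 1", "User prompt example 1"),
    ("System prompt example 2", "User prompt example 2"),
]
command = build_inference_command(pairs, script="inference_safetensors.py")

# Print a shell-safe version of the command for copy-pasting.
print(shlex.join(command))

# To run it from the cloned repository (assumes dependencies are installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues entirely; `shlex.join` is only used here to display an equivalent shell command.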