Commit 589bce3 · verified · 1 Parent(s): f8ae257

mikeee committed: Add SetFit model
.gitattributes CHANGED
```diff
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
+unigram.json filter=lfs diff=lfs merge=lfs -text
```
1_Pooling/config.json ADDED
```json
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
```
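
Only `pooling_mode_mean_tokens` is enabled, so each sentence embedding is the mean of the token embeddings over real (non-padding) tokens. A minimal PyTorch sketch of that pooling step; the function name and shapes are illustrative, assuming the standard sentence-transformers mean pooling:

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) tokens only."""
    # token_embeddings: (batch, seq_len, 384); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(dim=1)  # zero out padding, sum over seq_len
    counts = mask.sum(dim=1).clamp(min=1e-9)       # real-token count per example
    return summed / counts                         # (batch, 384) sentence embeddings
```
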
README.md ADDED
---
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 数字孪生包含三个核心元素:物理系统(产品、流程、网络)、代表它的虚拟模型以及实时更新模型的数据连接。虚拟模型反映了物理系统的当前状态和行为,并与来自传感器和物联网设备的数据持续同步。这种设置允许数字孪生模拟和预测物理系统在各种条件下的性能。将这三个组件结合在一起需要几项关键技术。首先,数据的收集和使用涉及云计算以及用于存储和处理的平台。其次,需要人工智能和机器学习来启用提供高级分析和准确虚拟模型的模拟模型。最后,增强现实和虚拟现实实现了数字模型和物理系统之间的高级可视化和交互。
- text: 这项技术使 BMO 能够在模型内完成远程站点评估和操作测试,而不会中断服务。结果:15 个月内节省了 50 多万美元,在 503 个地点收回了 6,000 个调查小时,并集中了分行资源和文档。
- text: 在银行业,数字孪生可能看起来像是增强的情景分析。如果您是这么想的,我们不会责怪您。但关键的区别就在这里:数据。传统情景分析依赖于静态数据,而数字孪生则使用实时动态数据并促进双向数据流。这意味着数字孪生可以利用其产生的洞察并触发更改以优化其复制的物理系统,而情景分析仅提供必须单独审查和采取行动的输出。
- text: '## 数字孪生解决了什么问题?'
- text: 数字孪生的使用始于 20 世纪 60 年代,当时 NASA 使用孪生模型在太空任务期间监控和调整航天器。最近,拜登政府宣布投资 2.85 亿美元用于半导体制造的数字孪生技术,因为该技术有潜力提高美国的效率、创新和弹性。
inference: true
model-index:
- name: SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
    split: test
    metrics:
    - type: accuracy
      value: 0.8571428571428571
      name: Accuracy
---

# SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer, as sketched below.

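A minimal sketch of those two steps, assuming the `setfit` 1.0 `Trainer` API (the toy dataset here is illustrative, not the actual training data):

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Illustrative examples borrowed from the label table below; the real
# training set has 14 classes with one example each.
train_dataset = Dataset.from_dict({
    "text": ["## How do they work?", "img__", "Let’s look at a few potential use cases for banks:"],
    "label": [6, 2, 11],
})

# Loading a plain Sentence Transformer checkpoint attaches the default
# LogisticRegression classification head.
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=8, num_epochs=1),
    train_dataset=train_dataset,
)
trainer.train()  # step 1: contrastive fine-tuning of the body; step 2: fit the head
```
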
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 128 tokens
- **Number of Classes:** 14 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 6 | <ul><li>'## How do they work?'</li></ul> |
| 7 | <ul><li>'A digital twin comprises three core elements: the physical system (product, process, network), a virtual model representing it and a data connection that updates the model in real time. The virtual model mirrors the physical system’s current state and behavior, continuously synchronized with data from sensors and internet of things devices. This setup allows the digital twin to simulate and predict the physical system’s performance under various conditions. Bringing all three components together requires several key technologies. First, the collection and use of data involves cloud computing and platforms for storage and processing. Second, AI and machine learning are needed to enable simulation models that provide advanced analytics and accurate virtual models. Lastly, augmented reality and virtual reality enable advanced visualization and interactions between the digital model and the physical system.'</li></ul> |
| 9 | <ul><li>'While data is the mantra of our modern age, data sets taken in isolation are of limited value because they tend to be sparse, noisy, and often indirect. Because systems exist across a web of components, any micro change results in a ripple effect, making accurately replicating a system extremely difficult. In banking, digital twin technology’s true potential is harnessed when integrated with a bank’s proprietary knowledge along with an inflow of external stimuli into decision-making models. With data flowing from multiple channels, using a mirrored environment enables precise contingency and incident response plans. When changes are made, other parts can adapt accordingly, simplifying coordination with business units and third parties. For example, a digital twin of a bank’s technology stack can predict outcomes of certain technology changes with the potential to evolve based on results from prior simulation runs. Digital twins can also mitigate risk across evolving fraud vectors through intelligent, comprehensive, data-driven strategic planning.'</li></ul> |
| 3 | <ul><li>'So-called “digital twins” are dynamic, virtual replicas of complex systems. Organizations often use them for scenario planning because they blend real-world elements with simulations and a constant flow of data, helping evaluate the consequences of different decisions. For example, when BMO acquired 503 Bank of the West branches in 2023, it used Matterport’s capture services to create dimensionally accurate 3D digital twins of all the branch locations within three months.'</li></ul> |
| 0 | <ul><li>'https://bankingjournal.aba.com/2024/08/precision-banking-the-digital-twin-advantage/ dt'</li></ul> |
| 11 | <ul><li>'Let’s look at a few potential use cases for banks:'</li></ul> |
| 13 | <ul><li>'**Digital financial twin.** This is an approach where digital twins could be used to precisely map financial and nonfinancial metrics across the life cycle of a bank product. The digital twin would be set up to link metrics related to the product’s service, partners, customers, and employees, resulting in efficient and quality decision-making. To go further, the digital twin would combine with real-time data from an enterprise resource planning system to ensure the highest level of resource optimization, drive sustainability and accelerate product development.'</li></ul> |
| 10 | <ul><li>'In the banking industry, digital twins may seem like enhanced scenario analysis. And if this is what you’re thinking, we don’t blame you. But here is where the key difference lies: data. Traditional scenario analysis relies on static data while digital twins use real-time dynamic data and facilitate bidirectional data flow. This means that a digital twin can take insights it produced and trigger changes to optimize the physical system it replicates, whereas scenario analysis merely provides an output that must be reviewed and acted upon separately.'</li></ul> |
| 5 | <ul><li>'The use of digital twins began in the 1960s when NASA used twin models to monitor and adjust spacecraft during space missions. Recently, the Biden administration announced a $285 million investment in digital twin technology for semiconductor manufacturing based on its potential to enhance efficiency, innovation, and resilience in the U.S.'</li></ul> |
| 2 | <ul><li>'img__'</li></ul> |
| 4 | <ul><li>'This technology enabled BMO to complete remote site assessments and operational tests within the model without service disruptions. The results: Over $500,000 saved in 15 months, 6,000 survey hours recouped across 503 locations, and branch resources and documentation centralized.'</li></ul> |
| 12 | <ul><li>'**Stress testing.** A digital twin could enable banks to simulate various scenarios, such as economic downturns, market fluctuations, or operational disruptions, to assess their resilience and performance under stress. Banks could identify weaknesses and mitigate risks preemptively by inputting diverse parameters to the digital twin. Add real-time insights and your bank can continuously adjust strategies that bolster resilience and stability.'</li></ul> |
| 1 | <ul><li>'# Precision banking: The ‘digital twin’ advantage'</li></ul> |
| 8 | <ul><li>'## What problem do digital twins solve?'</li></ul> |

## Evaluation

### Metrics
| Label | Accuracy |
|:--------|:---------|
| **all** | 0.8571 |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mikeee/setfit-model")
# Run inference
preds = model("## 数字孪生解决了什么问题?")
```
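
Still assuming the `setfit` 1.0 API used above, the same call accepts a batch of texts, and the `LogisticRegression` head's class probabilities are available via `predict_proba`:

```python
# One predicted label (0-13) per input text.
preds = model([
    "## 数字孪生解决了什么问题?",
    "img__",
])
# Per-class probabilities from the LogisticRegression head, shape (n_inputs, 14).
probas = model.predict_proba(["## 数字孪生解决了什么问题?"])
```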

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:--------|:----|
| Word count | 1 | 50.7143 | 156 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 1 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 7 | 1 |
| 8 | 1 |
| 9 | 1 |
| 10 | 1 |
| 11 | 1 |
| 12 | 1 |
| 13 | 1 |

### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 4
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

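These fields mirror `setfit`'s `TrainingArguments`. A sketch of how the same configuration could be expressed in code, assuming the SetFit 1.0.3 API listed under Framework Versions (`distance_metric` and `margin` apply only to triplet-style losses and are left at their defaults here):

```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

# Tuples are (embedding fine-tuning phase, classifier head phase) values.
args = TrainingArguments(
    batch_size=(8, 8),
    num_epochs=(1, 1),
    max_steps=-1,                      # -1 means no step cap
    sampling_strategy="oversampling",
    num_iterations=4,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```
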
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.1429 | 1 | 0.0039 | - |

### Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 3.0.1
- Transformers: 4.39.0
- PyTorch: 2.3.1+cu121
- Datasets: 2.21.0
- Tokenizers: 0.15.2

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
```json
{
  "_name_or_path": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.39.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 250037
}
```
config_sentence_transformers.json ADDED
```json
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.39.0",
    "pytorch": "2.3.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
```
config_setfit.json ADDED
```json
{
  "normalize_embeddings": false,
  "labels": null
}
```
model.safetensors ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:6e5cd4937246ff7e7ae80accd13e025e51fde55991035397d9610222775d8b66
size 470637416
```
model_head.pkl ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:9b853608c506394fd6484cf69e5e68e65e6fcf2293f5ed45da3bc4d4a6875434
size 44071
```
modules.json ADDED
```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
```
sentence_bert_config.json ADDED
```json
{
  "max_seq_length": 128,
  "do_lower_case": false
}
```
special_tokens_map.json ADDED
```json
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
tokenizer.json ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:fa685fc160bbdbab64058d4fc91b60e62d207e8dc60b9af5c002c5ab946ded00
size 17083009
```
tokenizer_config.json ADDED
```json
{
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "250001": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
  "cls_token": "<s>",
  "do_lower_case": true,
  "eos_token": "</s>",
  "mask_token": "<mask>",
  "max_length": 128,
  "model_max_length": 128,
  "pad_to_multiple_of": null,
  "pad_token": "<pad>",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "</s>",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "<unk>"
}
```
unigram.json ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:da145b5e7700ae40f16691ec32a0b1fdc1ee3298db22a31ea55f57a966c4a65d
size 14763260
```