---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
- precision
- recall
- f1
widget:
- text: 'I''m trying to take a dataframe and convert them to tensors to train a model
    in keras. I think it''s being triggered when I am converting my Y label to a tensor:
    I''m getting the following error when casting y_train to tensor from slices: In
    the tutorials this seems to work but I think those tutorials are doing multiclass
    classifications whereas I''m doing a regression so y_train is a series not multiple
    columns. Any suggestions of what I can do?'
- text: My weights are defined as I want to use the weights decay so I add, for example,
    the argument to the tf.get_variable. Now I'm wondering if during the evaluation
    phase this is still correct or maybe I have to set the regularizer factor to 0.
    There is also another argument trainable. The documentation says If True also
    add the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES. which
    is not clear to me. Should I use it? Can someone explain to me if the weights
    decay effects in a sort of wrong way the evaluation step? How can I solve in that
    case?
- text: 'Maybe I''m confused about what "inner" and "outer" tensor dimensions are,
    but the documentation for tf.matmul puzzles me: Isn''t it the case that R-rank
    arguments need to have matching (or no) R-2 outer dimensions, and that (as in
    normal matrix multiplication) the Rth, inner dimension of the first argument must
    match the R-1st dimension of the second. That is, in The outer dimensions a, ...,
    z must be identical to a'', ..., z'' (or not exist), and x and x'' must match
    (while p and q can be anything). Or put another way, shouldn''t the docs say:'
- text: 'I am using tf.data with reinitializable iterator to handle training and dev
    set data. For each epoch, I initialize the training data set. The official documentation
    has similar structure. I think this is not efficient especially if the training
    set is large. Some of the resources I found online has sess.run(train_init_op,
    feed_dict={X: X_train, Y: Y_train}) before the for loop to avoid this issue. But
    then we can''t process the dev set after each epoch; we can only process it after
    we are done iterating over epochs epochs. Is there a way to efficiently process
    the dev set after each epoch?'
- text: 'Why is the pred variable being calculated before any of the training iterations
    occur? I would expect that a pred would be generated (through the RNN() function)
    during each pass through of the data for every iteration? There must be something
    I am missing. Is pred something like a function object? I have looked at the docs
    for tf.matmul() and that returns a tensor, not a function. Full source: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py
    Here is the code:'
pipeline_tag: text-classification
inference: true
base_model: flax-sentence-embeddings/stackoverflow_mpnet-base
model-index:
- name: SetFit with flax-sentence-embeddings/stackoverflow_mpnet-base
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.81875
      name: Accuracy
    - type: precision
      value: 0.8248924988055423
      name: Precision
    - type: recall
      value: 0.81875
      name: Recall
    - type: f1
      value: 0.8178892421209625
      name: F1
---
# SetFit with flax-sentence-embeddings/stackoverflow_mpnet-base
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [flax-sentence-embeddings/stackoverflow_mpnet-base](https://huggingface.co/flax-sentence-embeddings/stackoverflow_mpnet-base) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
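
As a rough illustration of step 1, the sketch below builds contrastive training pairs from a handful of labelled texts: two texts with the same label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). The function name `make_contrastive_pairs` and the toy questions are hypothetical, and the pairing that SetFit performs internally may differ in detail (for example in sampling and ordering).

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def make_contrastive_pairs(
    texts: Sequence[str], labels: Sequence[int]
) -> List[Tuple[str, str, float]]:
    """Pair every two distinct examples and attach a similarity target.

    Same label -> 1.0 (embeddings should move together),
    different label -> 0.0 (embeddings should move apart).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data, for illustration only.
texts = [
    "What does the grad_ys argument of tf.gradients do?",
    "How should I shape the paddings argument of tf.pad?",
    "My training script hangs after the MLIR warning.",
    "tf.train.shuffle_batch crashes with a shape mismatch.",
]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a[:30], "<->", text_b[:30])
```

Four labelled texts already yield six pairs, which is why contrastive fine-tuning can extract a lot of training signal from a small labelled set. In step 2, the classification head is fit on the embeddings produced by the fine-tuned body.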
## Model Details
### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [flax-sentence-embeddings/stackoverflow_mpnet-base](https://huggingface.co/flax-sentence-embeddings/stackoverflow_mpnet-base)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels
| Label | Examples |
|:------|:---------|
| 1 | <ul><li>'In tf.gradients, there is a keyword argument grad_ys Why is grads_ys needed here? The docs here is implicit. Could you please give some specific purpose and code? And my example code for tf.gradients is'</li><li>'I am coding a Convolutional Neural Network to classify images in TensorFlow but there is a problem: When I try to feed my NumPy array of flattened images (3 channels with RGB values from 0 to 255) to a tf.estimator.inputs.numpy_input_fn I get the following error: My numpy_imput_fn looks like this: In the documentation for the function it is said that x should be a dict of NumPy array:'</li><li>'I am trying to use tf.pad. Here is my attempt to pad the tensor to length 20, with values 10. I get this error message I am looking at the documentation https://www.tensorflow.org/api_docs/python/tf/pad But I am unable to figure out how to shape the pad value'</li></ul> |
| 0 | <ul><li>"I am trying to use tf.train.shuffle_batch to consume batches of data from a TFRecord file using TensorFlow 1.0. The relevant functions are: The code enters through examine_batches(), having been handed the output of batch_generator(). batch_generator() calls tfrecord_to_graph_ops() and the problem is in that function, I believe. I am calling on a file with 1,000 bytes (numbers 0-9). If I call eval() on this in a Session, it shows me all 1,000 elements. But if I try to put it in a batch generator, it crashes. If I don't reshape targets, I get an error like ValueError: All shapes must be fully defined when tf.train.shuffle_batch is called. If I call targets.set_shape([1]), reminiscent of Google's CIFAR-10 example code, I get an error like Invalid argument: Shape mismatch in tuple component 0. Expected [1], got [1000] in tf.train.shuffle_batch. I also tried using tf.strided_slice to cut a chunk of the raw data - this doesn't crash but it results in just getting the first event over and over again. What is the right way to do this? To pull batches from a TFRecord file? Note, I could manually write a function that chopped up the raw byte data and did some sort of batching - especially easy if I am using the feed_dict approach to getting data into the graph - but I am trying to learn how to use TensorFlow's TFRecord files and how to use their built in batching functions. Thanks!"</li><li>"I am fairly new to TF and ML in general, so I have relied heavily on the documentation and tutorials provided by TF. I have been following along with the Tensorflow 2.0 Objection Detection API tutorial to the letter and have encountered an issue while training: everytime I run the training script model_main_tf2.py, it always hangs after the output: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2) after a number of depreciation warnings. I have tried many different ways of fixing this, including modifying the train script and pipeline.config files. My dataset isn't very large, less than 100 images with a max of 15 labels per image. useful info: Python 3.8.0 Tensorflow 2.4.4 (Non GPU) Windows 10 Pro Any and all help is appreciated!"</li><li>'I found two solutions to calculate FLOPS of Keras models (TF 2.x): [1] https://github.com/tensorflow/tensorflow/issues/32809#issuecomment-849439287 [2] https://github.com/tensorflow/tensorflow/issues/32809#issuecomment-841975359 At first glance, both seem to work perfectly when testing with tf.keras.applications.ResNet50(). The resulting FLOPS are identical and correspond to the FLOPS of the ResNet paper. But then I built a small GRU model and found different FLOPS for the two methods: This results in the following numbers: 13206 for method [1] and 18306 for method [2]. That is really confusing... Does anyone know how to correctly calculate FLOPS of recurrent Keras models in TF 2.x? EDIT I found another information: [3] https://github.com/tensorflow/tensorflow/issues/36391#issuecomment-596055100 When adding this argument to convert_variables_to_constants_v2, the outputs of [1] and [2] are the same when using my GRU example. The tensorflow documentation explains this argument as follows (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/convert_to_constants.py): Can someone try to explain this?'</li></ul> |
## Evaluation
### Metrics
| Label | Accuracy | Precision | Recall | F1 |
|:--------|:---------|:----------|:-------|:-------|
| **all** | 0.8187 | 0.8249 | 0.8187 | 0.8179 |
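
The card does not say how precision, recall and F1 were averaged over the two classes. The reported recall equals the reported accuracy, which is what a support-weighted average always produces (a macro average on a perfectly balanced test set would as well). The following sketch, using only the standard library and made-up predictions, shows support-weighted metrics under that assumption; `weighted_scores` is a hypothetical helper, not part of SetFit.

```python
from collections import Counter
from typing import Dict, Sequence


def weighted_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus support-weighted precision, recall and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label, count in support.items():
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        class_precision = true_pos / predicted if predicted else 0.0
        class_recall = true_pos / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0

        weight = count / total  # weight each class by its share of the data
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels and predictions, for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

scores = weighted_scores(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In this toy run, accuracy and weighted recall are both 0.625, mirroring the pattern in the table above.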
## Uses
### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("sharukat/so_mpnet-base_question_classifier")
# Run inference
preds = model("I'm trying to take a dataframe and convert them to tensors to train a model in keras. I think it's being triggered when I am converting my Y label to a tensor: I'm getting the following error when casting y_train to tensor from slices: In the tutorials this seems to work but I think those tutorials are doing multiclass classifications whereas I'm doing a regression so y_train is a series not multiple columns. Any suggestions of what I can do?")
```
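When classifying many questions, it can help to send them to the model in fixed-size chunks. The sketch below shows one way to do that. The helper `classify_in_batches` is hypothetical, and the example runs against a stand-in callable (`stub_model`, with an arbitrary rule) so that it works without downloading anything. The assumption is that the loaded `SetFitModel` can be called with a list of strings and returns one label per string, so it could be passed in place of the stub.

```python
from typing import Callable, Iterable, List, Sequence


def classify_in_batches(
    model: Callable[[List[str]], Iterable],
    texts: Sequence[str],
    batch_size: int = 32,
) -> List[int]:
    """Call `model` on consecutive chunks of `texts` and collect integer labels."""
    predictions: List[int] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        predictions.extend(int(label) for label in model(batch))
    return predictions


def stub_model(batch: List[str]) -> List[int]:
    """Stand-in for the real model: an arbitrary rule, NOT the classifier's logic."""
    return [len(text.split()) % 2 for text in batch]


questions = [
    "How do I shape the paddings argument of tf.pad?",
    "Why does my training script hang?",
    "What does grad_ys do in tf.gradients?",
]

labels = classify_in_batches(stub_model, questions, batch_size=2)
print(labels)
```

The returned labels are the integers `0` and `1` shown in the Model Labels table; the card does not name the classes beyond those identifiers.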
<!--
### Downstream Use
*List how someone could finetune this model on their own dataset.*
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:---------|:----|
| Word count | 12 | 128.0219 | 907 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 320 |
| 1 | 320 |
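
Statistics of this kind can be reproduced for any labelled dataset with a few lines of Python. The sketch below assumes that words are counted by splitting on whitespace, which the card does not confirm; `summarize_training_set` and the sample texts are hypothetical.

```python
from collections import Counter
from statistics import median
from typing import Dict, Sequence


def summarize_training_set(texts: Sequence[str], labels: Sequence[int]) -> Dict[str, object]:
    """Word-count summary (whitespace tokens) and number of samples per label."""
    word_counts = [len(text.split()) for text in texts]
    return {
        "min": min(word_counts),
        "median": median(word_counts),
        "max": max(word_counts),
        "samples_per_label": dict(Counter(labels)),
    }


# Toy data, for illustration only.
sample_texts = [
    "How do I pad a tensor?",
    "Why does tf.matmul fail here?",
    "Shape mismatch",
]
sample_labels = [1, 1, 0]

summary = summarize_training_set(sample_texts, sample_labels)
print(summary)
```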
### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 16)
- max_steps: -1
- sampling_strategy: unique
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- max_length: 256
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
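
The `loss: CosineSimilarityLoss` setting refers to the Sentence Transformers loss of that name, which by default takes the squared error between the cosine similarity of two embeddings and the pair's target similarity. Here is a minimal pure-Python sketch of that computation on hand-written two-dimensional vectors; real embeddings come from the model and have many more dimensions, and the function names are illustrative. As far as the SetFit documentation describes them, `distance_metric` and `margin` apply to triplet-style losses, so they likely have no effect with this loss.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs: List[Tuple[Vector, Vector, float]]) -> float:
    """Mean squared error between cosine similarity and the target similarity."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings: (embedding_a, embedding_b, target similarity).
well_separated = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same class, identical direction
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # different class, orthogonal
]
confused = [
    ([1.0, 0.0], [1.0, 1.0], 0.0),  # different class, but 45 degrees apart
]

print(cosine_similarity_loss(well_separated))
print(cosine_similarity_loss(confused))
```

The first batch incurs no loss because the similarities already match their targets. The second is penalised because two texts from different classes still point in similar directions.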
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:-------:|:---------:|:-------------:|:---------------:|
| 0.0000 | 1 | 0.3266 | - |
| **1.0** | **25640** | **0.0** | **0.2863** |
* The bold row denotes the saved checkpoint.
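
The step count in the bold row can be sanity-checked against the training set size. One reading of `sampling_strategy: unique` that reproduces it is that every unordered pair of the 640 training samples is used once, including each sample paired with itself. That reading is inferred from the numbers in this card rather than stated by it.

```python
import math


def unordered_pairs_with_self(n_samples: int) -> int:
    """Number of unordered pairs (i, j) with i <= j among n_samples items."""
    return n_samples * (n_samples + 1) // 2


def steps_per_epoch(n_pairs: int, batch_size: int) -> int:
    """Optimizer steps needed to see every pair once."""
    return math.ceil(n_pairs / batch_size)


n_samples = 320 + 320  # training sample counts for labels 0 and 1
n_pairs = unordered_pairs_with_self(n_samples)
steps = steps_per_epoch(n_pairs, batch_size=8)  # embedding-phase batch size

print(n_pairs, steps)  # 205120 25640
```

Under this assumption, 205,120 pairs at a batch size of 8 give 25,640 steps, which matches the single embedding epoch reported above.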
### Framework Versions
- Python: 3.10.13
- SetFit: 1.0.3
- Sentence Transformers: 2.5.1
- Transformers: 4.38.1
- PyTorch: 2.1.2
- Datasets: 2.18.0
- Tokenizers: 0.15.2
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->