# SetFit v1.0.0 Migration Guide
To update your code to work with v1.0.0, the following changes must be made:
## General Migration Guide
- `keep_body_frozen` from `SetFitModel.unfreeze` has been deprecated. Simply pass `"head"`, `"body"`, or no arguments to unfreeze both.
- `SupConLoss` has been moved from `setfit.modeling` to `setfit.losses`. If you are importing it using `from setfit.modeling import SupConLoss`, import it like `from setfit import SupConLoss` now instead.
- `use_auth_token` has been renamed to `token` in `SetFitModel.from_pretrained()`. `use_auth_token` will keep working until the next major version, but with a warning.
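Taken together, a minimal before/after sketch of these changes (the model id and token value are illustrative):

```python
from setfit import SetFitModel, SupConLoss  # previously: from setfit.modeling import SupConLoss

# `token` replaces the deprecated `use_auth_token`
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",  # illustrative model id
    token="hf_...",  # previously: use_auth_token="hf_..."
)

# Pass "head", "body", or nothing instead of the deprecated `keep_body_frozen`
model.unfreeze("head")  # unfreeze only the head
model.unfreeze()        # unfreeze both body and head
```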
## Training Migration Guide
- Replace all uses of `SetFitTrainer` with `Trainer`, and all uses of `DistillationSetFitTrainer` with `DistillationTrainer`.
- Remove `num_iterations`, `num_epochs`, `learning_rate`, `batch_size`, `seed`, `use_amp`, `warmup_proportion`, `distance_metric`, `margin`, `samples_per_label` and `loss_class` from the `Trainer` initialization, and move them to a `TrainingArguments` initialization instead. This instance should then be passed to the trainer via the `args` argument (see the sketch after this list).
  - `num_iterations` has been deprecated; the number of training steps should now be controlled via `num_epochs`, `max_steps`, or an `EarlyStoppingCallback`.
  - `learning_rate` has been split up into `body_learning_rate` and `head_learning_rate`.
  - `loss_class` has been renamed to `loss`.
- Stop providing training arguments like `num_epochs` directly to `Trainer.train()`: pass a `TrainingArguments` instance via the `args` argument instead.
- Refactor the multiple `trainer.train()`, `trainer.freeze()` and `trainer.unfreeze()` calls that were previously necessary to train the differentiable head into just one `trainer.train()` call, by setting `batch_size` and `num_epochs` on the `TrainingArguments` dataclass with tuples. The first value in each tuple is for training the embeddings, and the second is for training the classifier.
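Under v1.0.0, a training setup might then look like the following sketch; `train_dataset` and all hyperparameter values are illustrative:

```python
from setfit import SetFitModel, Trainer, TrainingArguments

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",  # illustrative model id
    use_differentiable_head=True,  # illustrative: a model with a differentiable head
)

# Hyperparameters now live in TrainingArguments, not in the Trainer
args = TrainingArguments(
    # Tuples: (embedding phase, classifier phase)
    batch_size=(16, 2),
    num_epochs=(1, 16),
    body_learning_rate=(2e-5, 1e-5),  # `learning_rate` was split up
    head_learning_rate=1e-2,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # assumed to be defined elsewhere
)

# A single call replaces the old train/freeze/unfreeze sequence
trainer.train()
```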
## Hard deprecations
- `SetFitBaseModel`, `SKLearnWrapper` and `SetFitPipeline` have been removed. These can no longer be used starting from v1.0.0.
## v1.0.0 Changelog
This list contains new functionality that can be used starting from v1.0.0.
- `SetFitModel.from_pretrained()` now accepts new arguments:
  - `device`: Specifies the device on which to load the SetFit model.
  - `labels`: Specifies labels corresponding to the training labels; useful if the training labels are integers ranging from `0` to `num_classes - 1`. These are automatically applied when calling `SetFitModel.predict()`.
  - `model_card_data`: Provide a `SetFitModelCardData` instance storing data such as model language, license, dataset name, etc., to be used in the automatically generated model cards.
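For example, a minimal sketch (the model id and metadata values are illustrative):

```python
from setfit import SetFitModel, SetFitModelCardData

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",  # illustrative model id
    device="cuda:0",
    labels=["negative", "positive"],  # maps integer predictions to strings
    model_card_data=SetFitModelCardData(
        language="en",
        license="apache-2.0",
    ),
)
```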
- Certain SetFit configuration options, such as the new `labels` argument from `SetFitModel.from_pretrained()`, are now saved in `config_setfit.json` files when a model is saved. This allows `labels` to be automatically fetched when a model is loaded.
- `SetFitModel.predict()` now accepts new arguments:
  - `batch_size` (defaults to `32`): The batch size to use when encoding the sentences to embeddings. Higher values often mean faster processing but higher memory usage.
  - `use_labels` (defaults to `True`): Whether to use `SetFitModel.labels` to convert integer labels to string labels. Not used if the training labels are already strings.
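For example, assuming `model` was loaded with `labels=["negative", "positive"]` as above:

```python
preds = model.predict(
    ["I loved this movie!", "Terribly boring."],
    batch_size=64,    # larger batches are often faster, but use more memory
    use_labels=True,  # return "positive"/"negative" instead of 1/0
)
```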
- `SetFitModel.encode()` has been introduced to convert input sentences to embeddings using the `SentenceTransformer` body.
- `SetFitModel.device` has been introduced to determine the device of the model.
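For instance, assuming `model` is a loaded `SetFitModel`:

```python
embeddings = model.encode(["an example sentence"])  # shape: (num_sentences, embedding_dim)
print(model.device)  # e.g. device(type='cuda', index=0)
```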
- `AbsaTrainer` and `AbsaModel` have been introduced for applying SetFit to Aspect Based Sentiment Analysis.
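A brief usage sketch; the model identifiers below are illustrative examples from the Hub, and a spaCy model such as `en_core_web_sm` is assumed to be installed:

```python
from setfit import AbsaModel

# One AbsaModel wraps an aspect extraction model and a polarity model
model = AbsaModel.from_pretrained(
    "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-aspect",
    "tomaarsen/setfit-absa-bge-small-en-v1.5-restaurants-polarity",
)
preds = model.predict(["Best pizza outside of Italy and really tasty."])
# e.g. [[{"span": "pizza", "polarity": "positive"}]]
```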
- `Trainer` now supports a `callbacks` argument for a list of `transformers` `TrainerCallback` instances.
  - By default, all installed callbacks integrated with `transformers` are supported, including `TensorBoardCallback` and `WandbCallback`, which log training logs to TensorBoard and W&B, respectively.
  - The `Trainer` will now print `embedding_loss` in the terminal, as well as `eval_embedding_loss` if `eval_strategy` is set to `"epoch"` or `"steps"` in `TrainingArguments`.
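For example, a minimal custom callback, assuming `model`, `args` and `train_dataset` are defined as earlier:

```python
from transformers import TrainerCallback
from setfit import Trainer

class LogPrinterCallback(TrainerCallback):
    """Illustrative callback: print every logged metric dict."""
    def on_log(self, args, state, control, logs=None, **kwargs):
        print(logs)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    callbacks=[LogPrinterCallback()],  # added on top of the default callbacks
)
```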
- `Trainer.evaluate()` now works with string labels.
- An updated contrastive pair sampler increases the variety of training pairs.
- `TrainingArguments` supports various new arguments (see the sketch after this list):
  - `output_dir`: The output directory where the model predictions and checkpoints will be written.
  - `max_steps`: If set to a positive number, the total number of training steps to perform. Overrides `num_epochs`. Training may stop before reaching the set number of steps when all data is exhausted.
  - `sampling_strategy`: The sampling strategy for how to draw pairs in training. Possible values are:
    - `"oversampling"`: Draws an even number of positive/negative sentence pairs until every sentence pair has been drawn.
    - `"undersampling"`: Draws the minimum number of positive/negative sentence pairs until every sentence pair in the minority class has been drawn.
    - `"unique"`: Draws every sentence pair combination (likely resulting in an unbalanced number of positive/negative sentence pairs).

    The default is `"oversampling"`, ensuring all sentence pairs are drawn at least once. Alternatively, setting `num_iterations` will override this argument and determine the number of generated sentence pairs.
  - `report_to`: The list of integrations to report the results and logs to. Supported platforms are `"azure_ml"`, `"comet_ml"`, `"mlflow"`, `"neptune"`, `"tensorboard"`, `"clearml"` and `"wandb"`. Use `"all"` to report to all installed integrations, or `"none"` for no integrations.
  - `run_name`: A descriptor for the run. Typically used for wandb and mlflow logging.
  - `logging_strategy`: The logging strategy to adopt during training. Possible values are:
    - `"no"`: No logging is done during training.
    - `"epoch"`: Logging is done at the end of each epoch.
    - `"steps"`: Logging is done every `logging_steps`.
  - `logging_first_step`: Whether to log and evaluate the first `global_step` or not.
  - `logging_steps`: Number of update steps between two logs if `logging_strategy="steps"`.
  - `eval_strategy`: The evaluation strategy to adopt during training. Possible values are:
    - `"no"`: No evaluation is done during training.
    - `"steps"`: Evaluation is done (and logged) every `eval_steps`.
    - `"epoch"`: Evaluation is done at the end of each epoch.
  - `eval_steps`: Number of update steps between two evaluations if `eval_strategy="steps"`. Will default to the same value as `logging_steps` if not set.
  - `eval_delay`: Number of epochs or steps to wait before the first evaluation can be performed, depending on the `eval_strategy`.
  - `eval_max_steps`: If set to a positive number, the total number of evaluation steps to perform. Evaluation may stop before reaching the set number of steps when all data is exhausted.
  - `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:
    - `"no"`: No save is done during training.
    - `"epoch"`: Save is done at the end of each epoch.
    - `"steps"`: Save is done every `save_steps`.
  - `save_steps`: Number of update steps between two checkpoint saves if `save_strategy="steps"`.
  - `save_total_limit`: If a value is passed, limits the total number of checkpoints. Deletes the older checkpoints in `output_dir`. Note that the best model is always preserved if `eval_strategy` is not `"no"`.
  - `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training. When set to `True`, `save_strategy` needs to be the same as `eval_strategy`, and if it is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.
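A sketch combining several of these arguments (all values are illustrative):

```python
from setfit import TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints",
    sampling_strategy="oversampling",
    logging_strategy="steps",
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=100,  # a round multiple of eval_steps
    save_total_limit=2,
    load_best_model_at_end=True,
    report_to="none",
)
```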
- Pushing SetFit or SetFitABSA models to the Hub with `SetFitModel.push_to_hub()` or `AbsaModel.push_to_hub()` now results in a detailed, automatically generated model card. See the model cards of SetFit and SetFitABSA models on the Hub for examples.
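For example, assuming `model` is a trained `SetFitModel` (the repository name is illustrative):

```python
# Pushes the model along with its automatically generated model card
model.push_to_hub("your-username/setfit-example-model")
```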