AutoTrain documentation

Quickstart with Python


AutoTrain is a library that allows you to train state-of-the-art models on Hugging Face Spaces, or locally. It provides a simple, easy-to-use interface for training models on tasks such as LLM finetuning, text classification, image classification, object detection, and more.

In this quickstart guide, we will show you how to train a model using AutoTrain in Python.

Getting Started

AutoTrain can be installed using pip:

$ pip install autotrain-advanced

The example code below shows how to finetune an LLM using AutoTrain in Python:

import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject


params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    train_split="train",
    trainer="sft",
    epochs=3,
    batch_size=1,
    lr=1e-5,
    peft=True,
    quantization="int4",
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username=os.environ.get("HF_USERNAME"),
    token=os.environ.get("HF_TOKEN"),
)


backend = "local"
project = AutoTrainProject(params=params, backend=backend, process=True)
project.create()

In this example, we finetune the meta-llama/Llama-3.2-1B-Instruct model on the HuggingFaceH4/no_robots dataset for 3 epochs, with a batch size of 1 and a learning rate of 1e-5. We use the paged_adamw_8bit optimizer with a cosine scheduler, bf16 mixed precision, and gradient accumulation over 8 steps. Since push_to_hub is enabled, the final (merged) model is pushed to the Hugging Face Hub after training.

To train the model, save the code above to a file (for example, train.py), set your Hub credentials, and run it:

$ export HF_USERNAME=<your-hf-username>
$ export HF_TOKEN=<your-hf-write-token>
$ python train.py

This will create a new project directory with the name autotrain-llama32-1b-finetune and start the training process. Once the training is complete, the model will be pushed to the Hugging Face Hub.

Your HF_TOKEN and HF_USERNAME are only required if you want to push the model or if you are accessing a gated model or dataset.
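
Once the model has been pushed, it can be loaded like any other Hub checkpoint. A minimal sketch, assuming the run above completed and merge_adapter=True produced a standalone model (replace the username placeholder with your own):

from transformers import AutoModelForCausalLM, AutoTokenizer

# The repo id follows <username>/<project_name> from the training run above.
repo_id = "<your-hf-username>/autotrain-llama32-1b-finetune"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Build a chat-formatted prompt and generate a short reply.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a haiku about training models."}],
    add_generation_prompt=True,
    return_tensors="pt",
)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))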

AutoTrainProject Class

class autotrain.project.AutoTrainProject

( params: Union, backend: str, process: bool = False )

A class to train an AutoTrain project

Attributes

params : Union[LLMTrainingParams, TextClassificationParams, TabularParams, DreamBoothTrainingParams, Seq2SeqParams, ImageClassificationParams, TextRegressionParams, ObjectDetectionParams, TokenClassificationParams, SentenceTransformersParams, ImageRegressionParams, ExtractiveQuestionAnsweringParams, VLMTrainingParams]
    The parameters for the AutoTrain project.

backend : str
    The backend to be used for the AutoTrain project. It should be one of the following:

  • local
  • spaces-a10g-large
  • spaces-a10g-small
  • spaces-a100-large
  • spaces-t4-medium
  • spaces-t4-small
  • spaces-cpu-upgrade
  • spaces-cpu-basic
  • spaces-l4x1
  • spaces-l4x4
  • spaces-l40sx1
  • spaces-l40sx4
  • spaces-l40sx8
  • spaces-a10g-largex2
  • spaces-a10g-largex4

process : bool
    Flag to indicate if the params and dataset should be processed. If your data format is not AutoTrain-readable, set it to True. Set it to True when in doubt. Defaults to False.

Methods

  • post_init(): Validates the backend attribute.
  • create(): Creates a runner based on the backend and initializes the AutoTrain project.
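
To run the same job on rented hardware instead of your own machine, only the backend string changes. A minimal sketch, reusing the params object from the quickstart above with one of the Spaces backends listed under Attributes; the same pattern applies to every params class documented below:

# Same params as in the quickstart; only the backend differs.
project = AutoTrainProject(params=params, backend="spaces-a10g-large", process=True)
project.create()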

Parameters

Text Tasks

class autotrain.trainers.clm.params.LLMTrainingParams

( model: str = 'gpt2', project_name: str = 'project-name', data_path: str = 'data', train_split: str = 'train', valid_split: Optional = None, add_eos_token: bool = True, block_size: Union = -1, model_max_length: int = 2048, padding: Optional = 'right', trainer: str = 'default', use_flash_attention_2: bool = False, log: str = 'none', disable_gradient_checkpointing: bool = False, logging_steps: int = -1, eval_strategy: str = 'epoch', save_total_limit: int = 1, auto_find_batch_size: bool = False, mixed_precision: Optional = None, lr: float = 3e-05, epochs: int = 1, batch_size: int = 2, warmup_ratio: float = 0.1, gradient_accumulation: int = 4, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, chat_template: Optional = None, quantization: Optional = 'int4', target_modules: Optional = 'all-linear', merge_adapter: bool = False, peft: bool = False, lora_r: int = 16, lora_alpha: int = 32, lora_dropout: float = 0.05, model_ref: Optional = None, dpo_beta: float = 0.1, max_prompt_length: int = 128, max_completion_length: Optional = None, prompt_text_column: Optional = None, text_column: str = 'text', rejected_text_column: Optional = None, push_to_hub: bool = False, username: Optional = None, token: Optional = None, unsloth: bool = False, distributed_backend: Optional = None )

Parameters

  • model (str) — Model name to be used for training. Default is “gpt2”.
  • project_name (str) — Name of the project and output directory. Default is “project-name”.
  • data_path (str) — Path to the dataset. Default is “data”.
  • train_split (str) — Configuration for the training data split. Default is “train”.
  • valid_split (Optional[str]) — Configuration for the validation data split. Default is None.
  • add_eos_token (bool) — Whether to add an EOS token at the end of sequences. Default is True.
  • block_size (Union[int, List[int]]) — Size of the blocks for training, can be a single integer or a list of integers. Default is -1.
  • model_max_length (int) — Maximum length of the model input. Default is 2048.
  • padding (Optional[str]) — Side on which to pad sequences (left or right). Default is “right”.
  • trainer (str) — Type of trainer to use. Default is “default”.
  • use_flash_attention_2 (bool) — Whether to use flash attention version 2. Default is False.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • disable_gradient_checkpointing (bool) — Whether to disable gradient checkpointing. Default is False.
  • logging_steps (int) — Number of steps between logging events. Default is -1.
  • eval_strategy (str) — Strategy for evaluation (e.g., ‘epoch’). Default is “epoch”.
  • save_total_limit (int) — Maximum number of checkpoints to keep. Default is 1.
  • auto_find_batch_size (bool) — Whether to automatically find the optimal batch size. Default is False.
  • mixed_precision (Optional[str]) — Type of mixed precision to use (e.g., ‘fp16’, ‘bf16’, or None). Default is None.
  • lr (float) — Learning rate for training. Default is 3e-5.
  • epochs (int) — Number of training epochs. Default is 1.
  • batch_size (int) — Batch size for training. Default is 2.
  • warmup_ratio (float) — Proportion of training to perform learning rate warmup. Default is 0.1.
  • gradient_accumulation (int) — Number of steps to accumulate gradients before updating. Default is 4.
  • optimizer (str) — Optimizer to use for training. Default is “adamw_torch”.
  • scheduler (str) — Learning rate scheduler to use. Default is “linear”.
  • weight_decay (float) — Weight decay to apply to the optimizer. Default is 0.0.
  • max_grad_norm (float) — Maximum norm for gradient clipping. Default is 1.0.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • chat_template (Optional[str]) — Template for chat-based models, options include: None, zephyr, chatml, or tokenizer. Default is None.
  • quantization (Optional[str]) — Quantization method to use (e.g., ‘int4’, ‘int8’, or None). Default is “int4”.
  • target_modules (Optional[str]) — Target modules for quantization or fine-tuning. Default is “all-linear”.
  • merge_adapter (bool) — Whether to merge the adapter layers. Default is False.
  • peft (bool) — Whether to use Parameter-Efficient Fine-Tuning (PEFT). Default is False.
  • lora_r (int) — Rank of the LoRA matrices. Default is 16.
  • lora_alpha (int) — Alpha parameter for LoRA. Default is 32.
  • lora_dropout (float) — Dropout rate for LoRA. Default is 0.05.
  • model_ref (Optional[str]) — Reference model for DPO trainer. Default is None.
  • dpo_beta (float) — Beta parameter for DPO trainer. Default is 0.1.
  • max_prompt_length (int) — Maximum length of the prompt. Default is 128.
  • max_completion_length (Optional[int]) — Maximum length of the completion. Default is None.
  • prompt_text_column (Optional[str]) — Column name for the prompt text. Default is None.
  • text_column (str) — Column name for the text data. Default is “text”.
  • rejected_text_column (Optional[str]) — Column name for the rejected text data. Default is None.
  • push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Default is False.
  • username (Optional[str]) — Hugging Face username for authentication. Default is None.
  • token (Optional[str]) — Hugging Face token for authentication. Default is None.
  • unsloth (bool) — Whether to use the unsloth library. Default is False.
  • distributed_backend (Optional[str]) — Backend to use for distributed training. Default is None.

LLMTrainingParams: Parameters for training a language model using the autotrain library.
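
The DPO-specific fields (model_ref, dpo_beta, and the prompt/rejected-text columns) only take effect when the trainer is set accordingly. A hedged sketch of a DPO configuration; the dataset path and column names below are placeholders for illustration:

from autotrain.params import LLMTrainingParams

dpo_params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="<your-preference-dataset>",         # placeholder: prompt/chosen/rejected columns
    trainer="dpo",
    model_ref="meta-llama/Llama-3.2-1B-Instruct",  # frozen reference model for the DPO loss
    dpo_beta=0.1,
    prompt_text_column="prompt",
    text_column="chosen",
    rejected_text_column="rejected",
    max_prompt_length=256,
    max_completion_length=512,
    peft=True,
    quantization="int4",
    project_name="llama-dpo-example",
)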

class autotrain.trainers.sent_transformers.params.SentenceTransformersParams

( data_path: str = None, model: str = 'microsoft/mpnet-base', lr: float = 3e-05, epochs: int = 3, max_seq_length: int = 128, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', username: Optional = None, log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01, trainer: str = 'pair_score', sentence1_column: str = 'sentence1', sentence2_column: str = 'sentence2', sentence3_column: Optional = None, target_column: Optional = None )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the pre-trained model to use. Default is “microsoft/mpnet-base”.
  • lr (float) — Learning rate for training. Default is 3e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • max_seq_length (int) — Maximum sequence length for the input. Default is 128.
  • batch_size (int) — Batch size for training. Default is 8.
  • warmup_ratio (float) — Proportion of training to perform learning rate warmup. Default is 0.1.
  • gradient_accumulation (int) — Number of steps to accumulate gradients before updating. Default is 1.
  • optimizer (str) — Optimizer to use. Default is “adamw_torch”.
  • scheduler (str) — Learning rate scheduler to use. Default is “linear”.
  • weight_decay (float) — Weight decay to apply. Default is 0.0.
  • max_grad_norm (float) — Maximum gradient norm for clipping. Default is 1.0.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split. Default is None.
  • logging_steps (int) — Number of steps between logging. Default is -1.
  • project_name (str) — Name of the project for output directory. Default is “project-name”.
  • auto_find_batch_size (bool) — Whether to automatically find the optimal batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None). Default is None.
  • save_total_limit (int) — Maximum number of checkpoints to save. Default is 1.
  • token (Optional[str]) — Token for accessing Hugging Face Hub. Default is None.
  • push_to_hub (bool) — Whether to push the model to Hugging Face Hub. Default is False.
  • eval_strategy (str) — Evaluation strategy to use. Default is “epoch”.
  • username (Optional[str]) — Hugging Face username. Default is None.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Number of epochs with no improvement after which training will be stopped. Default is 5.
  • early_stopping_threshold (float) — Threshold for measuring the new optimum, to qualify as an improvement. Default is 0.01.
  • trainer (str) — Name of the trainer to use. Default is “pair_score”.
  • sentence1_column (str) — Name of the column containing the first sentence. Default is “sentence1”.
  • sentence2_column (str) — Name of the column containing the second sentence. Default is “sentence2”.
  • sentence3_column (Optional[str]) — Name of the column containing the third sentence (if applicable). Default is None.
  • target_column (Optional[str]) — Name of the column containing the target variable. Default is None.

SentenceTransformersParams is a configuration class for setting up parameters for training sentence transformers.
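
As a sketch, a pair-scoring setup (the default trainer="pair_score") expects two sentence columns and a similarity target; the dataset path below is a placeholder:

from autotrain.trainers.sent_transformers.params import SentenceTransformersParams

st_params = SentenceTransformersParams(
    model="microsoft/mpnet-base",
    data_path="<your-sentence-pairs-dataset>",  # placeholder
    trainer="pair_score",
    sentence1_column="sentence1",
    sentence2_column="sentence2",
    target_column="score",  # similarity score for each pair
    epochs=3,
    batch_size=8,
    project_name="st-pair-score-example",
)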

class autotrain.trainers.seq2seq.params.Seq2SeqParams

( data_path: str = None, model: str = 'google/flan-t5-base', username: Optional = None, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, project_name: str = 'project-name', token: Optional = None, push_to_hub: bool = False, text_column: str = 'text', target_column: str = 'target', lr: float = 5e-05, epochs: int = 3, max_seq_length: int = 128, max_target_length: int = 128, batch_size: int = 2, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, logging_steps: int = -1, eval_strategy: str = 'epoch', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, peft: bool = False, quantization: Optional = 'int8', lora_r: int = 16, lora_alpha: int = 32, lora_dropout: float = 0.05, target_modules: str = 'all-linear', log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the model to be used. Default is “google/flan-t5-base”.
  • username (Optional[str]) — Hugging Face Username.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split.
  • project_name (str) — Name of the project or output directory. Default is “project-name”.
  • token (Optional[str]) — Hub Token for authentication.
  • push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Default is False.
  • text_column (str) — Name of the text column in the dataset. Default is “text”.
  • target_column (str) — Name of the target text column in the dataset. Default is “target”.
  • lr (float) — Learning rate for training. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • max_seq_length (int) — Maximum sequence length for input text. Default is 128.
  • max_target_length (int) — Maximum sequence length for target text. Default is 128.
  • batch_size (int) — Training batch size. Default is 2.
  • warmup_ratio (float) — Proportion of warmup steps. Default is 0.1.
  • gradient_accumulation (int) — Number of gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer to be used. Default is “adamw_torch”.
  • scheduler (str) — Learning rate scheduler to be used. Default is “linear”.
  • weight_decay (float) — Weight decay for the optimizer. Default is 0.0.
  • max_grad_norm (float) — Maximum gradient norm for clipping. Default is 1.0.
  • logging_steps (int) — Number of steps between logging. Default is -1 (disabled).
  • eval_strategy (str) — Evaluation strategy. Default is “epoch”.
  • auto_find_batch_size (bool) — Whether to automatically find the batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None).
  • save_total_limit (int) — Maximum number of checkpoints to save. Default is 1.
  • peft (bool) — Whether to use Parameter-Efficient Fine-Tuning (PEFT). Default is False.
  • quantization (Optional[str]) — Quantization mode (int4, int8, or None). Default is “int8”.
  • lora_r (int) — LoRA-R parameter for PEFT. Default is 16.
  • lora_alpha (int) — LoRA-Alpha parameter for PEFT. Default is 32.
  • lora_dropout (float) — LoRA-Dropout parameter for PEFT. Default is 0.05.
  • target_modules (str) — Target modules for PEFT. Default is “all-linear”.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Patience for early stopping. Default is 5.
  • early_stopping_threshold (float) — Threshold for early stopping. Default is 0.01.

Seq2SeqParams is a configuration class for sequence-to-sequence training parameters.
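
A hedged sketch of a summarization-style configuration with LoRA enabled; the dataset path is a placeholder and assumes "text"/"target" columns:

from autotrain.trainers.seq2seq.params import Seq2SeqParams

s2s_params = Seq2SeqParams(
    model="google/flan-t5-base",
    data_path="<your-summarization-dataset>",  # placeholder
    text_column="text",
    target_column="target",
    max_seq_length=512,    # input length
    max_target_length=128, # summary length
    peft=True,             # LoRA fine-tuning
    quantization="int8",
    project_name="flan-t5-summarization-example",
)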

class autotrain.trainers.token_classification.params.TokenClassificationParams

( data_path: str = None, model: str = 'bert-base-uncased', lr: float = 5e-05, epochs: int = 3, max_seq_length: int = 128, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, tokens_column: str = 'tokens', tags_column: str = 'tags', logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', username: Optional = None, log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the model to use. Default is “bert-base-uncased”.
  • lr (float) — Learning rate. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • max_seq_length (int) — Maximum sequence length. Default is 128.
  • batch_size (int) — Training batch size. Default is 8.
  • warmup_ratio (float) — Warmup proportion. Default is 0.1.
  • gradient_accumulation (int) — Gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer to use. Default is “adamw_torch”.
  • scheduler (str) — Scheduler to use. Default is “linear”.
  • weight_decay (float) — Weight decay. Default is 0.0.
  • max_grad_norm (float) — Maximum gradient norm. Default is 1.0.
  • seed (int) — Random seed. Default is 42.
  • train_split (str) — Name of the training split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation split. Default is None.
  • tokens_column (str) — Name of the tokens column. Default is “tokens”.
  • tags_column (str) — Name of the tags column. Default is “tags”.
  • logging_steps (int) — Number of steps between logging. Default is -1.
  • project_name (str) — Name of the project. Default is “project-name”.
  • auto_find_batch_size (bool) — Whether to automatically find the batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision setting (fp16, bf16, or None). Default is None.
  • save_total_limit (int) — Total number of checkpoints to save. Default is 1.
  • token (Optional[str]) — Hub token for authentication. Default is None.
  • push_to_hub (bool) — Whether to push the model to the Hugging Face hub. Default is False.
  • eval_strategy (str) — Evaluation strategy. Default is “epoch”.
  • username (Optional[str]) — Hugging Face username. Default is None.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Patience for early stopping. Default is 5.
  • early_stopping_threshold (float) — Threshold for early stopping. Default is 0.01.

TokenClassificationParams is a configuration class for token classification training parameters.
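
For instance, named-entity recognition on the conll2003 dataset (whose token and tag columns are named tokens and ner_tags) might be configured as below; treat this as an illustrative sketch, not a tuned recipe:

from autotrain.trainers.token_classification.params import TokenClassificationParams

ner_params = TokenClassificationParams(
    model="bert-base-uncased",
    data_path="conll2003",
    tokens_column="tokens",
    tags_column="ner_tags",
    train_split="train",
    valid_split="validation",
    epochs=3,
    batch_size=8,
    project_name="bert-conll03-example",
)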

class autotrain.trainers.extractive_question_answering.params.ExtractiveQuestionAnsweringParams

( data_path: str = None, model: str = 'bert-base-uncased', lr: float = 5e-05, epochs: int = 3, max_seq_length: int = 128, max_doc_stride: int = 128, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, text_column: str = 'context', question_column: str = 'question', answer_column: str = 'answers', logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', username: Optional = None, log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Pre-trained model name. Default is “bert-base-uncased”.
  • lr (float) — Learning rate for the optimizer. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • max_seq_length (int) — Maximum sequence length for inputs. Default is 128.
  • max_doc_stride (int) — Maximum document stride for splitting context. Default is 128.
  • batch_size (int) — Batch size for training. Default is 8.
  • warmup_ratio (float) — Warmup proportion for learning rate scheduler. Default is 0.1.
  • gradient_accumulation (int) — Number of gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer type. Default is “adamw_torch”.
  • scheduler (str) — Learning rate scheduler type. Default is “linear”.
  • weight_decay (float) — Weight decay for the optimizer. Default is 0.0.
  • max_grad_norm (float) — Maximum gradient norm for clipping. Default is 1.0.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split. Default is None.
  • text_column (str) — Column name for context/text. Default is “context”.
  • question_column (str) — Column name for questions. Default is “question”.
  • answer_column (str) — Column name for answers. Default is “answers”.
  • logging_steps (int) — Number of steps between logging. Default is -1.
  • project_name (str) — Name of the project for output directory. Default is “project-name”.
  • auto_find_batch_size (bool) — Automatically find optimal batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None). Default is None.
  • save_total_limit (int) — Maximum number of checkpoints to save. Default is 1.
  • token (Optional[str]) — Authentication token for Hugging Face Hub. Default is None.
  • push_to_hub (bool) — Whether to push the model to Hugging Face Hub. Default is False.
  • eval_strategy (str) — Evaluation strategy during training. Default is “epoch”.
  • username (Optional[str]) — Hugging Face username for authentication. Default is None.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Number of epochs with no improvement for early stopping. Default is 5.
  • early_stopping_threshold (float) — Threshold for early stopping improvement. Default is 0.01.

ExtractiveQuestionAnsweringParams is a configuration class for extractive question answering training parameters.
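
As a sketch, the squad dataset matches the default column names (context, question, answers); the hyperparameters below are illustrative only:

from autotrain.trainers.extractive_question_answering.params import ExtractiveQuestionAnsweringParams

qa_params = ExtractiveQuestionAnsweringParams(
    model="bert-base-uncased",
    data_path="squad",
    text_column="context",
    question_column="question",
    answer_column="answers",
    max_seq_length=384,   # longer inputs for question + context
    max_doc_stride=128,   # overlap when splitting long contexts
    train_split="train",
    valid_split="validation",
    project_name="bert-squad-example",
)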

class autotrain.trainers.text_classification.params.TextClassificationParams

( data_path: str = None, model: str = 'bert-base-uncased', lr: float = 5e-05, epochs: int = 3, max_seq_length: int = 128, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, text_column: str = 'text', target_column: str = 'target', logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', username: Optional = None, log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the model to use. Default is “bert-base-uncased”.
  • lr (float) — Learning rate. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • max_seq_length (int) — Maximum sequence length. Default is 128.
  • batch_size (int) — Training batch size. Default is 8.
  • warmup_ratio (float) — Warmup proportion. Default is 0.1.
  • gradient_accumulation (int) — Number of gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer to use. Default is “adamw_torch”.
  • scheduler (str) — Scheduler to use. Default is “linear”.
  • weight_decay (float) — Weight decay. Default is 0.0.
  • max_grad_norm (float) — Maximum gradient norm. Default is 1.0.
  • seed (int) — Random seed. Default is 42.
  • train_split (str) — Name of the training split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation split. Default is None.
  • text_column (str) — Name of the text column in the dataset. Default is “text”.
  • target_column (str) — Name of the target column in the dataset. Default is “target”.
  • logging_steps (int) — Number of steps between logging. Default is -1.
  • project_name (str) — Name of the project. Default is “project-name”.
  • auto_find_batch_size (bool) — Whether to automatically find the batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision setting (fp16, bf16, or None). Default is None.
  • save_total_limit (int) — Total number of checkpoints to save. Default is 1.
  • token (Optional[str]) — Hub token for authentication. Default is None.
  • push_to_hub (bool) — Whether to push the model to the hub. Default is False.
  • eval_strategy (str) — Evaluation strategy. Default is “epoch”.
  • username (Optional[str]) — Hugging Face username. Default is None.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Number of epochs with no improvement after which training will be stopped. Default is 5.
  • early_stopping_threshold (float) — Threshold for measuring the new optimum, to qualify as an improvement. Default is 0.01.

TextClassificationParams is a configuration class for text classification training parameters.
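
For example, sentiment classification on the imdb dataset (columns text and label) could look like this sketch; the use of the test split for validation is purely illustrative:

from autotrain.trainers.text_classification.params import TextClassificationParams

clf_params = TextClassificationParams(
    model="bert-base-uncased",
    data_path="imdb",
    text_column="text",
    target_column="label",
    train_split="train",
    valid_split="test",
    epochs=3,
    batch_size=8,
    project_name="bert-imdb-example",
)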

class autotrain.trainers.text_regression.params.TextRegressionParams

( data_path: str = None, model: str = 'bert-base-uncased', lr: float = 5e-05, epochs: int = 3, max_seq_length: int = 128, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, text_column: str = 'text', target_column: str = 'target', logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', username: Optional = None, log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the pre-trained model to use. Default is “bert-base-uncased”.
  • lr (float) — Learning rate for the optimizer. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • max_seq_length (int) — Maximum sequence length for the inputs. Default is 128.
  • batch_size (int) — Batch size for training. Default is 8.
  • warmup_ratio (float) — Proportion of training to perform learning rate warmup. Default is 0.1.
  • gradient_accumulation (int) — Number of steps to accumulate gradients before updating. Default is 1.
  • optimizer (str) — Optimizer to use. Default is “adamw_torch”.
  • scheduler (str) — Learning rate scheduler to use. Default is “linear”.
  • weight_decay (float) — Weight decay to apply. Default is 0.0.
  • max_grad_norm (float) — Maximum norm for the gradients. Default is 1.0.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split. Default is None.
  • text_column (str) — Name of the column containing text data. Default is “text”.
  • target_column (str) — Name of the column containing target data. Default is “target”.
  • logging_steps (int) — Number of steps between logging. Default is -1 (no logging).
  • project_name (str) — Name of the project for output directory. Default is “project-name”.
  • auto_find_batch_size (bool) — Whether to automatically find the batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None). Default is None.
  • save_total_limit (int) — Maximum number of checkpoints to save. Default is 1.
  • token (Optional[str]) — Token for accessing Hugging Face Hub. Default is None.
  • push_to_hub (bool) — Whether to push the model to Hugging Face Hub. Default is False.
  • eval_strategy (str) — Evaluation strategy to use. Default is “epoch”.
  • username (Optional[str]) — Hugging Face username. Default is None.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Number of epochs with no improvement after which training will be stopped. Default is 5.
  • early_stopping_threshold (float) — Threshold for measuring the new optimum, to qualify as an improvement. Default is 0.01.

TextRegressionParams is a configuration class for setting up text regression training parameters.

Image Tasks

class autotrain.trainers.image_classification.params.ImageClassificationParams

( data_path: str = None, model: str = 'google/vit-base-patch16-224', username: Optional = None, lr: float = 5e-05, epochs: int = 3, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', image_column: str = 'image', target_column: str = 'target', log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Pre-trained model name or path. Default is “google/vit-base-patch16-224”.
  • username (Optional[str]) — Hugging Face account username.
  • lr (float) — Learning rate for the optimizer. Default is 5e-5.
  • epochs (int) — Number of epochs for training. Default is 3.
  • batch_size (int) — Batch size for training. Default is 8.
  • warmup_ratio (float) — Warmup ratio for learning rate scheduler. Default is 0.1.
  • gradient_accumulation (int) — Number of gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer type. Default is “adamw_torch”.
  • scheduler (str) — Learning rate scheduler type. Default is “linear”.
  • weight_decay (float) — Weight decay for the optimizer. Default is 0.0.
  • max_grad_norm (float) — Maximum gradient norm for clipping. Default is 1.0.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split.
  • logging_steps (int) — Number of steps between logging. Default is -1.
  • project_name (str) — Name of the project for output directory. Default is “project-name”.
  • auto_find_batch_size (bool) — Automatically find optimal batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision training mode (fp16, bf16, or None).
  • save_total_limit (int) — Maximum number of checkpoints to keep. Default is 1.
  • token (Optional[str]) — Hugging Face Hub token for authentication.
  • push_to_hub (bool) — Whether to push the model to Hugging Face Hub. Default is False.
  • eval_strategy (str) — Evaluation strategy during training. Default is “epoch”.
  • image_column (str) — Column name for images in the dataset. Default is “image”.
  • target_column (str) — Column name for target labels in the dataset. Default is “target”.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Number of epochs with no improvement for early stopping. Default is 5.
  • early_stopping_threshold (float) — Threshold for early stopping. Default is 0.01.

ImageClassificationParams is a configuration class for image classification training parameters.
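
A hedged sketch using the beans dataset, whose image and label columns are named image and labels; column names for other datasets will differ:

from autotrain.trainers.image_classification.params import ImageClassificationParams

img_params = ImageClassificationParams(
    model="google/vit-base-patch16-224",
    data_path="beans",
    image_column="image",
    target_column="labels",
    train_split="train",
    valid_split="validation",
    epochs=3,
    batch_size=8,
    project_name="vit-beans-example",
)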

class autotrain.trainers.image_regression.params.ImageRegressionParams

( data_path: str = None, model: str = 'google/vit-base-patch16-224', username: Optional = None, lr: float = 5e-05, epochs: int = 3, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', image_column: str = 'image', target_column: str = 'target', log: str = 'none', early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the model to use. Default is “google/vit-base-patch16-224”.
  • username (Optional[str]) — Hugging Face Username.
  • lr (float) — Learning rate. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • batch_size (int) — Training batch size. Default is 8.
  • warmup_ratio (float) — Warmup proportion. Default is 0.1.
  • gradient_accumulation (int) — Gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer to use. Default is “adamw_torch”.
  • scheduler (str) — Scheduler to use. Default is “linear”.
  • weight_decay (float) — Weight decay. Default is 0.0.
  • max_grad_norm (float) — Max gradient norm. Default is 1.0.
  • seed (int) — Random seed. Default is 42.
  • train_split (str) — Train split name. Default is “train”.
  • valid_split (Optional[str]) — Validation split name.
  • logging_steps (int) — Logging steps. Default is -1.
  • project_name (str) — Output directory name. Default is “project-name”.
  • auto_find_batch_size (bool) — Whether to auto find batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision type (fp16, bf16, or None).
  • save_total_limit (int) — Save total limit. Default is 1.
  • token (Optional[str]) — Hub Token.
  • push_to_hub (bool) — Whether to push to hub. Default is False.
  • eval_strategy (str) — Evaluation strategy. Default is “epoch”.
  • image_column (str) — Image column name. Default is “image”.
  • target_column (str) — Target column name. Default is “target”.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • early_stopping_patience (int) — Early stopping patience. Default is 5.
  • early_stopping_threshold (float) — Early stopping threshold. Default is 0.01.

ImageRegressionParams is a configuration class for image regression training parameters.

class autotrain.trainers.object_detection.params.ObjectDetectionParams

( data_path: str = None, model: str = 'google/vit-base-patch16-224', username: Optional = None, lr: float = 5e-05, epochs: int = 3, batch_size: int = 8, warmup_ratio: float = 0.1, gradient_accumulation: int = 1, optimizer: str = 'adamw_torch', scheduler: str = 'linear', weight_decay: float = 0.0, max_grad_norm: float = 1.0, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, logging_steps: int = -1, project_name: str = 'project-name', auto_find_batch_size: bool = False, mixed_precision: Optional = None, save_total_limit: int = 1, token: Optional = None, push_to_hub: bool = False, eval_strategy: str = 'epoch', image_column: str = 'image', objects_column: str = 'objects', log: str = 'none', image_square_size: Optional = 600, early_stopping_patience: int = 5, early_stopping_threshold: float = 0.01 )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the model to be used. Default is “google/vit-base-patch16-224”.
  • username (Optional[str]) — Hugging Face Username.
  • lr (float) — Learning rate. Default is 5e-5.
  • epochs (int) — Number of training epochs. Default is 3.
  • batch_size (int) — Training batch size. Default is 8.
  • warmup_ratio (float) — Warmup proportion. Default is 0.1.
  • gradient_accumulation (int) — Gradient accumulation steps. Default is 1.
  • optimizer (str) — Optimizer to be used. Default is “adamw_torch”.
  • scheduler (str) — Scheduler to be used. Default is “linear”.
  • weight_decay (float) — Weight decay. Default is 0.0.
  • max_grad_norm (float) — Max gradient norm. Default is 1.0.
  • seed (int) — Random seed. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split.
  • logging_steps (int) — Number of steps between logging. Default is -1.
  • project_name (str) — Name of the project for output directory. Default is “project-name”.
  • auto_find_batch_size (bool) — Whether to automatically find batch size. Default is False.
  • mixed_precision (Optional[str]) — Mixed precision type (fp16, bf16, or None).
  • save_total_limit (int) — Total number of checkpoints to save. Default is 1.
  • token (Optional[str]) — Hub Token for authentication.
  • push_to_hub (bool) — Whether to push the model to the Hugging Face Hub. Default is False.
  • eval_strategy (str) — Evaluation strategy. Default is “epoch”.
  • image_column (str) — Name of the image column in the dataset. Default is “image”.
  • objects_column (str) — Name of the target column in the dataset. Default is “objects”.
  • log (str) — Logging method for experiment tracking. Default is “none”.
  • image_square_size (Optional[int]) — Longest size to which the image will be resized, then padded to square. Default is 600.
  • early_stopping_patience (int) — Number of epochs with no improvement after which training will be stopped. Default is 5.
  • early_stopping_threshold (float) — Minimum change to qualify as an improvement. Default is 0.01.

ObjectDetectionParams is a configuration class for object detection training parameters.
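
A hedged sketch using the cppe-5 dataset, which provides image and objects columns in the expected format; the model choice below is an assumption (a DETR-style detection checkpoint rather than the classification default):

from autotrain.trainers.object_detection.params import ObjectDetectionParams

det_params = ObjectDetectionParams(
    model="facebook/detr-resnet-50",  # an object detection checkpoint
    data_path="cppe-5",
    image_column="image",
    objects_column="objects",
    image_square_size=600,  # resize longest side, then pad to square
    train_split="train",
    valid_split="test",
    project_name="detr-cppe5-example",
)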

class autotrain.trainers.dreambooth.params.DreamBoothTrainingParams

( model: str = None, vae_model: Optional = None, revision: Optional = None, tokenizer: Optional = None, image_path: str = None, class_image_path: Optional = None, prompt: str = None, class_prompt: Optional = None, num_class_images: int = 100, class_labels_conditioning: Optional = None, prior_preservation: bool = False, prior_loss_weight: float = 1.0, project_name: str = 'dreambooth-model', seed: int = 42, resolution: int = 512, center_crop: bool = False, train_text_encoder: bool = False, batch_size: int = 4, sample_batch_size: int = 4, epochs: int = 1, num_steps: int = None, checkpointing_steps: int = 500, resume_from_checkpoint: Optional = None, gradient_accumulation: int = 1, disable_gradient_checkpointing: bool = False, lr: float = 0.0001, scale_lr: bool = False, scheduler: str = 'constant', warmup_steps: int = 0, num_cycles: int = 1, lr_power: float = 1.0, dataloader_num_workers: int = 0, use_8bit_adam: bool = False, adam_beta1: float = 0.9, adam_beta2: float = 0.999, adam_weight_decay: float = 0.01, adam_epsilon: float = 1e-08, max_grad_norm: float = 1.0, allow_tf32: bool = False, prior_generation_precision: Optional = None, local_rank: int = -1, xformers: bool = False, pre_compute_text_embeddings: bool = False, tokenizer_max_length: Optional = None, text_encoder_use_attention_mask: bool = False, rank: int = 4, xl: bool = False, mixed_precision: Optional = None, token: Optional = None, push_to_hub: bool = False, username: Optional = None, validation_prompt: Optional = None, num_validation_images: int = 4, validation_epochs: int = 50, checkpoints_total_limit: Optional = None, validation_images: Optional = None, logging: bool = False )

Parameters

  • model (str) — Name of the model to be used for training.
  • vae_model (Optional[str]) — Name of the VAE model to be used, if any.
  • revision (Optional[str]) — Specific model version to use.
  • tokenizer (Optional[str]) — Tokenizer to be used, if different from the model.
  • image_path (str) — Path to the training images.
  • class_image_path (Optional[str]) — Path to the class images.
  • prompt (str) — Prompt for the instance images.
  • class_prompt (Optional[str]) — Prompt for the class images.
  • num_class_images (int) — Number of class images to generate.
  • class_labels_conditioning (Optional[str]) — Conditioning labels for class images.
  • prior_preservation (bool) — Enable prior preservation during training.
  • prior_loss_weight (float) — Weight of the prior preservation loss.
  • project_name (str) — Name of the project for output directory.
  • seed (int) — Random seed for reproducibility.
  • resolution (int) — Resolution of the training images.
  • center_crop (bool) — Enable center cropping of images.
  • train_text_encoder (bool) — Enable training of the text encoder.
  • batch_size (int) — Batch size for training.
  • sample_batch_size (int) — Batch size for sampling.
  • epochs (int) — Number of training epochs.
  • num_steps (int) — Maximum number of training steps.
  • checkpointing_steps (int) — Steps interval for checkpointing.
  • resume_from_checkpoint (Optional[str]) — Path to resume training from a checkpoint.
  • gradient_accumulation (int) — Number of gradient accumulation steps.
  • disable_gradient_checkpointing (bool) — Disable gradient checkpointing.
  • lr (float) — Learning rate for training.
  • scale_lr (bool) — Enable scaling of the learning rate.
  • scheduler (str) — Type of learning rate scheduler.
  • warmup_steps (int) — Number of warmup steps for learning rate scheduler.
  • num_cycles (int) — Number of cycles for learning rate scheduler.
  • lr_power (float) — Power factor for learning rate scheduler.
  • dataloader_num_workers (int) — Number of workers for data loading.
  • use_8bit_adam (bool) — Enable use of 8-bit Adam optimizer.
  • adam_beta1 (float) — Beta1 parameter for Adam optimizer.
  • adam_beta2 (float) — Beta2 parameter for Adam optimizer.
  • adam_weight_decay (float) — Weight decay for Adam optimizer.
  • adam_epsilon (float) — Epsilon parameter for Adam optimizer.
  • max_grad_norm (float) — Maximum gradient norm for clipping.
  • allow_tf32 (bool) — Allow use of TF32 for training.
  • prior_generation_precision (Optional[str]) — Precision for prior generation.
  • local_rank (int) — Local rank for distributed training.
  • xformers (bool) — Enable xformers memory efficient attention.
  • pre_compute_text_embeddings (bool) — Pre-compute text embeddings before training.
  • tokenizer_max_length (Optional[int]) — Maximum length for tokenizer.
  • text_encoder_use_attention_mask (bool) — Use attention mask for text encoder.
  • rank (int) — Rank of the LoRA update matrices. Default is 4.
  • xl (bool) — Enable XL model training.
  • mixed_precision (Optional[str]) — Enable mixed precision training.
  • token (Optional[str]) — Token for accessing the model hub.
  • push_to_hub (bool) — Enable pushing the model to the hub.
  • username (Optional[str]) — Username for the model hub.
  • validation_prompt (Optional[str]) — Prompt for validation images.
  • num_validation_images (int) — Number of validation images to generate.
  • validation_epochs (int) — Epoch interval for validation.
  • checkpoints_total_limit (Optional[int]) — Total limit for checkpoints.
  • validation_images (Optional[str]) — Path to validation images.
  • logging (bool) — Enable logging using TensorBoard.

DreamBoothTrainingParams is a configuration class for DreamBooth training parameters.
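
A hedged sketch of an SDXL DreamBooth configuration; the image folder, prompt, and hyperparameters are placeholders for illustration:

from autotrain.trainers.dreambooth.params import DreamBoothTrainingParams

db_params = DreamBoothTrainingParams(
    model="stabilityai/stable-diffusion-xl-base-1.0",
    xl=True,                           # SDXL checkpoint
    image_path="./my-subject-photos",  # local folder of instance images (placeholder)
    prompt="a photo of sks dog",       # instance prompt with a rare identifier token
    resolution=1024,
    batch_size=1,
    num_steps=500,
    lr=1e-4,
    gradient_accumulation=4,
    mixed_precision="fp16",
    project_name="dreambooth-sdxl-example",
)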

Tabular Tasks

class autotrain.trainers.tabular.params.TabularParams

( data_path: str = None, model: str = 'xgboost', username: Optional = None, seed: int = 42, train_split: str = 'train', valid_split: Optional = None, project_name: str = 'project-name', token: Optional = None, push_to_hub: bool = False, id_column: str = 'id', target_columns: Union = ['target'], categorical_columns: Optional = None, numerical_columns: Optional = None, task: str = 'classification', num_trials: int = 10, time_limit: int = 600, categorical_imputer: Optional = None, numerical_imputer: Optional = None, numeric_scaler: Optional = None )

Parameters

  • data_path (str) — Path to the dataset.
  • model (str) — Name of the model to use. Default is “xgboost”.
  • username (Optional[str]) — Hugging Face Username.
  • seed (int) — Random seed for reproducibility. Default is 42.
  • train_split (str) — Name of the training data split. Default is “train”.
  • valid_split (Optional[str]) — Name of the validation data split.
  • project_name (str) — Name of the output directory. Default is “project-name”.
  • token (Optional[str]) — Hub Token for authentication.
  • push_to_hub (bool) — Whether to push the model to the hub. Default is False.
  • id_column (str) — Name of the ID column. Default is “id”.
  • target_columns (Union[List[str], str]) — Target column(s) in the dataset. Default is [“target”].
  • categorical_columns (Optional[List[str]]) — List of categorical columns.
  • numerical_columns (Optional[List[str]]) — List of numerical columns.
  • task (str) — Type of task (e.g., “classification”). Default is “classification”.
  • num_trials (int) — Number of trials for hyperparameter optimization. Default is 10.
  • time_limit (int) — Time limit for training in seconds. Default is 600.
  • categorical_imputer (Optional[str]) — Imputer strategy for categorical columns.
  • numerical_imputer (Optional[str]) — Imputer strategy for numerical columns.
  • numeric_scaler (Optional[str]) — Scaler strategy for numerical columns.

TabularParams is a configuration class for tabular data training parameters.
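
A hedged sketch of a tabular classification setup; the dataset path and column names are placeholders:

from autotrain.trainers.tabular.params import TabularParams

tab_params = TabularParams(
    model="xgboost",
    data_path="<your-tabular-dataset>",  # placeholder
    id_column="id",
    target_columns=["target"],
    task="classification",
    num_trials=25,    # hyperparameter search budget
    time_limit=600,   # seconds
    project_name="xgb-tabular-example",
)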
