Accelerate documentation

Utilities for DeepSpeed

You are viewing version v0.12.0. A newer version, v1.3.0, is available.

class accelerate.DeepSpeedPlugin

( hf_ds_config: typing.Any = None gradient_accumulation_steps: int = None gradient_clipping: float = None zero_stage: int = None is_train_batch_min: str = True offload_optimizer_device: bool = None offload_param_device: bool = None zero3_init_flag: bool = None zero3_save_16bit_model: bool = None )

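This plugin is used to integrate DeepSpeed with Accelerate. Its fields carry the DeepSpeed settings: an optional DeepSpeed config (hf_ds_config), gradient accumulation and clipping, the ZeRO stage, and the optimizer and parameter offload options.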
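As an illustration of the settings the plugin carries, the sketch below builds a DeepSpeed config dictionary of the kind that might be supplied through hf_ds_config. The key names are standard DeepSpeed config keys. The correspondence to plugin arguments noted in the comments, and the assumption that hf_ds_config accepts such a dictionary or a path to the equivalent JSON file, are inferred from the argument names rather than stated on this page.

```python
import json

# Sketch of a DeepSpeed config. Each comment names the plugin argument that
# (by assumption) corresponds to the key on that line.
ds_config = {
    "gradient_accumulation_steps": 4,                # gradient_accumulation_steps
    "gradient_clipping": 1.0,                        # gradient_clipping
    "zero_optimization": {
        "stage": 3,                                  # zero_stage
        "offload_optimizer": {"device": "cpu"},      # offload_optimizer_device
        "offload_param": {"device": "cpu"},          # offload_param_device
        "stage3_gather_16bit_weights_on_model_save": True,  # zero3_save_16bit_model
    },
}

# The same settings as JSON text, which is what a config file would contain.
ds_config_json = json.dumps(ds_config, indent=2)
```

Keeping the settings in one dictionary makes it easy to write them to a file and reuse the same configuration across runs.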

deepspeed_config_process

( prefix = '' mismatches = None config = None must_match = True **kwargs )

Process the DeepSpeed config with the values from the kwargs.
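The general idea of processing a nested config with keyword values can be sketched as follows. This is a simplified illustration, not Accelerate's implementation: fill_auto_values is a hypothetical helper, and both the "auto" placeholder convention and the dotted key names are assumptions. The parameter names prefix, mismatches, and must_match mirror the signature above.

```python
from typing import Any, Dict, List, Optional


def fill_auto_values(
    config: Dict[str, Any],
    values: Dict[str, Any],
    prefix: str = "",
    mismatches: Optional[List[str]] = None,
    must_match: bool = True,
) -> List[str]:
    """Hypothetical helper: fill "auto" entries of a nested config in place.

    `values` maps dotted keys (for example "optimizer.params.lr") to the
    value used in code. Returns descriptions of conflicting entries.
    """
    mismatches = [] if mismatches is None else mismatches
    for key, current in config.items():
        dotted = prefix + key
        if isinstance(current, dict):
            # Recurse into nested sections, extending the dotted prefix.
            fill_auto_values(current, values, dotted + ".", mismatches, must_match)
        elif dotted in values:
            if current == "auto":
                config[key] = values[dotted]
            elif must_match and current != values[dotted]:
                mismatches.append(
                    f"{dotted}: config={current!r}, code={values[dotted]!r}"
                )
    return mismatches


config = {
    "gradient_accumulation_steps": "auto",
    "optimizer": {"type": "AdamW", "params": {"lr": "auto", "weight_decay": 0.01}},
}
problems = fill_auto_values(
    config,
    {
        "gradient_accumulation_steps": 4,
        "optimizer.params.lr": 3e-4,
        "optimizer.params.weight_decay": 0.0,  # conflicts with the config's 0.01
    },
)
```

Collecting conflicts in a list instead of failing on the first one lets the caller report every inconsistent entry at once.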

class accelerate.utils.DummyOptim

( params lr = 0.001 weight_decay = 0 **kwargs )

Parameters

  • params (iterable) — Iterable of parameters to optimize, or dicts defining parameter groups.
  • lr (float) — Learning rate.
  • weight_decay (float) — Weight decay.
  • **kwargs — Other arguments.

Dummy optimizer that holds the model parameters or param groups. It is primarily used to keep a conventional training loop when the optimizer config is specified in the DeepSpeed config file.

class accelerate.utils.DummyScheduler

( optimizer total_num_steps = None warmup_num_steps = 0 **kwargs )

Parameters

  • optimizer (torch.optim.optimizer.Optimizer) — The optimizer to wrap.
  • total_num_steps (int) — Total number of steps.
  • warmup_num_steps (int) — Number of steps for warmup.
  • **kwargs — Other arguments.

Dummy scheduler that wraps an optimizer. It is primarily used to keep a conventional training loop when the scheduler config is specified in the DeepSpeed config file.
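The placeholder pattern behind the dummy optimizer and scheduler can be sketched with plain Python stand-ins. PlaceholderOptim, PlaceholderScheduler, and collect_config_values are hypothetical names and are not part of Accelerate. The sketch assumes that the dummy objects only record their arguments, so that the values can later be matched against the optimizer and scheduler sections of the DeepSpeed config.

```python
from typing import Any, Dict, Iterable, Optional


class PlaceholderOptim:
    """Hypothetical stand-in: records optimizer settings, performs no updates."""

    def __init__(self, params: Iterable[Any], lr: float = 0.001,
                 weight_decay: float = 0, **kwargs: Any) -> None:
        self.params = params
        self.lr = lr
        self.weight_decay = weight_decay
        self.kwargs = kwargs


class PlaceholderScheduler:
    """Hypothetical stand-in: records scheduler settings, never steps."""

    def __init__(self, optimizer: PlaceholderOptim,
                 total_num_steps: Optional[int] = None,
                 warmup_num_steps: int = 0, **kwargs: Any) -> None:
        self.optimizer = optimizer
        self.total_num_steps = total_num_steps
        self.warmup_num_steps = warmup_num_steps
        self.kwargs = kwargs


def collect_config_values(optimizer: PlaceholderOptim,
                          scheduler: PlaceholderScheduler) -> Dict[str, Any]:
    """Gather the recorded settings under dotted, DeepSpeed-style config keys."""
    return {
        "optimizer.params.lr": optimizer.lr,
        "optimizer.params.weight_decay": optimizer.weight_decay,
        "scheduler.params.total_num_steps": scheduler.total_num_steps,
        "scheduler.params.warmup_num_steps": scheduler.warmup_num_steps,
    }


params = [{"params": ["w1", "w2"]}]  # placeholder for real model parameters
optimizer = PlaceholderOptim(params, lr=3e-4, weight_decay=0.01)
scheduler = PlaceholderScheduler(optimizer, total_num_steps=1000, warmup_num_steps=100)
values = collect_config_values(optimizer, scheduler)
```

The training script keeps the usual shape (create an optimizer, then a scheduler bound to it) even though the real objects are built from the config file.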

class accelerate.utils.DeepSpeedEngineWrapper

( engine )

Parameters

  • engine (deepspeed.runtime.engine.DeepSpeedEngine) — The DeepSpeed engine to wrap.

Internal wrapper for deepspeed.runtime.engine.DeepSpeedEngine. It is used so that a conventional training loop can be kept when training with DeepSpeed.

class accelerate.utils.DeepSpeedOptimizerWrapper

( optimizer )

Parameters

  • optimizer (torch.optim.optimizer.Optimizer) — The optimizer to wrap.

Internal wrapper around a deepspeed optimizer.

class accelerate.utils.DeepSpeedSchedulerWrapper

( scheduler optimizers )

Parameters

  • scheduler (torch.optim.lr_scheduler.LambdaLR) — The scheduler to wrap.
  • optimizers (one or a list of torch.optim.Optimizer) — The optimizer or optimizers associated with the scheduler.

Internal wrapper around a deepspeed scheduler.
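How wrappers of this kind let a conventional loop keep its shape can be sketched with stand-ins. FakeEngine, EngineWrapperSketch, NoOpOptimizerSketch, and NoOpSchedulerSketch are hypothetical classes written for this illustration. The behavior shown (the engine wrapper runs both the backward pass and the engine step, while the optimizer and scheduler wrappers do nothing when stepped) is an assumption about the design, not a statement of the library's code.

```python
from typing import List


class FakeEngine:
    """Hypothetical stand-in for a DeepSpeed engine; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def backward(self, loss: float) -> None:
        self.calls.append("backward")

    def step(self) -> None:
        # Assumed to cover the optimizer update, gradient reset, and LR schedule.
        self.calls.append("step")


class EngineWrapperSketch:
    """Runs backward and the engine step together."""

    def __init__(self, engine: FakeEngine) -> None:
        self.engine = engine

    def backward(self, loss: float) -> None:
        self.engine.backward(loss)
        self.engine.step()


class NoOpOptimizerSketch:
    """Accepts the usual calls but leaves the work to the engine."""

    def step(self) -> None:
        pass

    def zero_grad(self) -> None:
        pass


class NoOpSchedulerSketch:
    def step(self) -> None:
        pass


engine = FakeEngine()
wrapped_engine = EngineWrapperSketch(engine)
optimizer = NoOpOptimizerSketch()
scheduler = NoOpSchedulerSketch()

# A conventional loop: every familiar call is present, but only the
# engine wrapper does any work.
for loss in (0.9, 0.7):
    wrapped_engine.backward(loss)
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```

Under this design the training loop needs no DeepSpeed-specific branches, since the calls that the engine already handles become harmless no-ops.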