Variance-exploding, stochastic sampling from Karras et al.
Overview
The original paper can be found at https://arxiv.org/abs/2206.00364.
KarrasVeScheduler
class diffusers.KarrasVeScheduler
( sigma_min: float = 0.02, sigma_max: float = 100, s_noise: float = 1.007, s_churn: float = 80, s_min: float = 0.05, s_max: float = 50 )
Parameters

- sigma_min (float) — minimum noise magnitude.
- sigma_max (float) — maximum noise magnitude.
- s_noise (float) — the amount of additional noise to counteract loss of detail during sampling. A reasonable range is [1.000, 1.011].
- s_churn (float) — the parameter controlling the overall amount of stochasticity. A reasonable range is [0, 100].
- s_min (float) — the start value of the sigma range where noise is added (enabling stochasticity). A reasonable range is [0, 10].
- s_max (float) — the end value of the sigma range where noise is added. A reasonable range is [0.2, 80].
Stochastic sampling from Karras et al. [1] tailored to Variance-Exploding (VE) models [2]. Use Algorithm 2 and the VE column of Table 1 from [1] for reference.
[1] Karras, Tero, et al. "Elucidating the Design Space of Diffusion-Based Generative Models." https://arxiv.org/abs/2206.00364
[2] Song, Yang, et al. "Score-Based Generative Modeling through Stochastic Differential Equations." https://arxiv.org/abs/2011.13456
~ConfigMixin takes care of storing all config attributes that are passed in the scheduler's __init__ function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps.
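This config-storage pattern can be sketched with a minimal stand-in; SimpleConfigMixin and ToyScheduler below are hypothetical illustrations, not the actual diffusers classes:

```python
# Minimal sketch of the config-storage pattern (SimpleConfigMixin and
# ToyScheduler are hypothetical stand-ins, not the diffusers classes).
from types import SimpleNamespace


class SimpleConfigMixin:
    def register_to_config(self, **kwargs):
        # Store every __init__ keyword argument on a .config namespace.
        self.config = SimpleNamespace(**kwargs)


class ToyScheduler(SimpleConfigMixin):
    def __init__(self, sigma_min: float = 0.02, sigma_max: float = 100.0):
        self.register_to_config(sigma_min=sigma_min, sigma_max=sigma_max)


scheduler = ToyScheduler(sigma_max=80.0)
print(scheduler.config.sigma_max)  # -> 80.0
```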
SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and
from_pretrained() functions.
For more details on the parameters, see Appendix E of the original paper, "Elucidating the Design Space of Diffusion-Based Generative Models" (https://arxiv.org/abs/2206.00364). The grid search values used to find the optimal {s_noise, s_churn, s_min, s_max} for a specific model are described in Table 5 of the paper.
add_noise_to_input
( sample: FloatTensor, sigma: float, generator: typing.Optional[torch.Generator] = None )
Explicit Langevin-like “churn” step of adding noise to the sample according to a factor gamma_i ≥ 0 to reach a higher noise level sigma_hat = sigma_i + gamma_i*sigma_i.
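A plain-Python sketch of this churn step, assuming the gamma_i rule from Algorithm 2 of the paper (the function signature and list-based samples are illustrative; the actual method works on torch tensors):

```python
import math
import random


def add_noise_to_input(sample, sigma, num_inference_steps,
                       s_churn: float = 80, s_min: float = 0.05,
                       s_max: float = 50, s_noise: float = 1.007,
                       rng=random):
    # gamma_i is non-zero only inside the [s_min, s_max] sigma range
    # (Algorithm 2 of Karras et al.; the sqrt(2)-1 clamp keeps sigma_hat
    # from overshooting the next-higher noise level).
    if s_min <= sigma <= s_max:
        gamma = min(s_churn / num_inference_steps, math.sqrt(2) - 1)
    else:
        gamma = 0.0
    sigma_hat = sigma + gamma * sigma
    # Add fresh Gaussian noise so the total variance rises to sigma_hat**2.
    std = s_noise * math.sqrt(max(sigma_hat**2 - sigma**2, 0.0))
    sample_hat = [x + std * rng.gauss(0.0, 1.0) for x in sample]
    return sample_hat, sigma_hat
```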
scale_model_input
( sample: FloatTensor, timestep: typing.Optional[int] = None ) → torch.FloatTensor
Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.
set_timesteps
( num_inference_steps: int, device: typing.Union[str, torch.device] = None )
Sets the continuous timesteps used for the diffusion chain. Supporting function to be run before inference.
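A sketch of the schedule this sets up, assuming the geometric sigma interpolation from the VE column of Table 1 in Karras et al. (a standalone illustration, not the exact diffusers code):

```python
def set_timesteps(num_inference_steps: int,
                  sigma_min: float = 0.02, sigma_max: float = 100.0):
    # Geometric interpolation from sigma_max down to sigma_min; equivalently,
    # t_i is geometric in sigma^2 and sigma_i = sqrt(t_i) (Table 1, VE column).
    return [
        sigma_max * (sigma_min / sigma_max) ** (i / (num_inference_steps - 1))
        for i in range(num_inference_steps)
    ]


sigmas = set_timesteps(10)
# sigmas[0] == sigma_max, sigmas[-1] == sigma_min, strictly decreasing
```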
step
( model_output: FloatTensor, sigma_hat: float, sigma_prev: float, sample_hat: FloatTensor, return_dict: bool = True ) → KarrasVeOutput or tuple
Parameters

- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- sigma_hat (float) — the elevated noise level reached after the churn step, sigma_hat = sigma_i + gamma_i*sigma_i.
- sigma_prev (float) — the noise level of the previous (next-lower) step in the schedule.
- sample_hat (torch.FloatTensor) — the sample after noise has been added, at noise level sigma_hat.
- return_dict (bool) — option for returning a KarrasVeOutput class rather than a tuple.
Returns

KarrasVeOutput or tuple — KarrasVeOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.
Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion process from the learned model outputs (most often the predicted noise).
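This Euler step can be sketched as follows, assuming the convention that the denoised estimate is sample_hat + sigma_hat * model_output (the parameterization is an assumption; lists stand in for tensors):

```python
def step(model_output, sigma_hat, sigma_prev, sample_hat):
    # Denoised estimate under the assumed parameterization.
    pred_original = [x + sigma_hat * e for x, e in zip(sample_hat, model_output)]
    # Slope d_i of the probability-flow ODE at (sample_hat, sigma_hat).
    derivative = [(x - x0) / sigma_hat for x, x0 in zip(sample_hat, pred_original)]
    # Euler step from noise level sigma_hat down to sigma_prev.
    sample_prev = [x + (sigma_prev - sigma_hat) * d
                   for x, d in zip(sample_hat, derivative)]
    return sample_prev, derivative
```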
step_correct
( model_output: FloatTensor, sigma_hat: float, sigma_prev: float, sample_hat: FloatTensor, sample_prev: FloatTensor, derivative: FloatTensor, return_dict: bool = True ) → KarrasVeOutput or tuple
Parameters

- model_output (torch.FloatTensor) — direct output from the learned diffusion model, re-evaluated at sample_prev.
- sigma_hat (float) — the elevated noise level reached after the churn step.
- sigma_prev (float) — the noise level of the previous (next-lower) step in the schedule.
- sample_hat (torch.FloatTensor) — the sample after noise has been added, at noise level sigma_hat.
- sample_prev (torch.FloatTensor) — the sample predicted by the first-order step().
- derivative (torch.FloatTensor) — the derivative computed in the first-order step().
- return_dict (bool) — option for returning a KarrasVeOutput class rather than a tuple.
Returns

KarrasVeOutput or tuple — prev_sample, the updated sample in the diffusion chain, along with the derivative used for the correction.
Correct the predicted sample based on the model_output of the network, applying the second-order correction from Algorithm 2 of Karras et al. to the first-order prediction produced by step().
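A sketch of this second-order (Heun) correction, under the same assumed parameterization (denoised estimate = sample + sigma * model_output; here model_output is the network re-evaluated at sample_prev, and lists stand in for tensors):

```python
def step_correct(model_output, sigma_hat, sigma_prev,
                 sample_hat, sample_prev, derivative):
    # Slope re-evaluated at the Euler prediction (sample_prev, sigma_prev).
    pred_original = [x + sigma_prev * e
                     for x, e in zip(sample_prev, model_output)]
    derivative_corr = [(x - x0) / sigma_prev
                       for x, x0 in zip(sample_prev, pred_original)]
    # Average the two slopes (Heun's method) and redo the step from sample_hat.
    return [
        x + (sigma_prev - sigma_hat) * 0.5 * (d1 + d2)
        for x, d1, d2 in zip(sample_hat, derivative, derivative_corr)
    ]
```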