Bitsandbytes documentation

RMSprop

This page documents bitsandbytes v0.43.3. A newer version, v0.44.1, is available.

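RMSprop is an adaptive learning rate optimizer closely related to Adagrad. For each parameter, it keeps an exponentially decaying average of past squared gradients and divides that parameter's step by the square root of this average. The effective learning rate is therefore smaller for parameters with large recent gradients and larger for parameters with small ones. Because old gradients are gradually forgotten rather than accumulated indefinitely, as they are in Adagrad, the effective learning rate does not shrink toward zero over the course of training.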
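The update described above can be sketched for a single scalar parameter as follows. This is a minimal illustration of the textbook RMSprop rule, not the bitsandbytes implementation: the function name `rmsprop_step` and the toy objective are assumptions made for the example, and momentum, weight decay, centering, and quantized optimizer state are all left out.

```python
import math


def rmsprop_step(param, grad, square_avg, lr=0.01, alpha=0.99, eps=1e-8):
    """Apply one textbook RMSprop update to a scalar parameter.

    rmsprop_step is a hypothetical helper for illustration only.
    Returns the updated parameter and the updated running average.
    """
    # Exponentially decaying average of squared gradients.
    square_avg = alpha * square_avg + (1.0 - alpha) * grad * grad
    # Divide the step by the root of that average; eps avoids division by zero.
    param = param - lr * grad / (math.sqrt(square_avg) + eps)
    return param, square_avg


# Toy problem (an assumption for the example): minimize f(x) = x**2 from x = 5.
x, v = 5.0, 0.0
for _ in range(1000):
    g = 2.0 * x  # gradient of x**2
    x, v = rmsprop_step(x, g, v)
```

The defaults `lr=0.01`, `alpha=0.99`, and `eps=1e-8` mirror the values in the class signatures on this page. Because the step is normalized by the recent gradient magnitude, its size depends mostly on `lr` rather than on the raw scale of the gradient.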

RMSprop

class bitsandbytes.optim.RMSprop


(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0, centered=False, optim_bits=32, args=None, min_8bit_size=4096, percentile_clipping=100, block_wise=True)

RMSprop8bit

class bitsandbytes.optim.RMSprop8bit


(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0, centered=False, args=None, min_8bit_size=4096, percentile_clipping=100, block_wise=True)

RMSprop32bit

class bitsandbytes.optim.RMSprop32bit


(params, lr=0.01, alpha=0.99, eps=1e-08, weight_decay=0, momentum=0, centered=False, args=None, min_8bit_size=4096, percentile_clipping=100, block_wise=True)
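As a rough sketch of how the classes above might be selected and constructed, the helper below picks `RMSprop8bit` or `RMSprop32bit` by state precision and passes the keyword arguments listed in the signatures. The names `RMSPROP_DEFAULTS`, `rmsprop_class_name`, `rmsprop_kwargs`, and `make_rmsprop` are hypothetical and not part of bitsandbytes; only the class names and parameter names come from this page. bitsandbytes is imported lazily, so the selection logic can be read and run without it installed.

```python
# Defaults copied from the class signatures on this page (args=None omitted).
RMSPROP_DEFAULTS = {
    "lr": 0.01,
    "alpha": 0.99,
    "eps": 1e-8,
    "weight_decay": 0,
    "momentum": 0,
    "centered": False,
    "min_8bit_size": 4096,
    "percentile_clipping": 100,
    "block_wise": True,
}


def rmsprop_class_name(bits):
    """Map a state precision to a class name (hypothetical helper)."""
    names = {8: "RMSprop8bit", 32: "RMSprop32bit"}
    if bits not in names:
        raise ValueError(f"bits must be 8 or 32, got {bits!r}")
    return names[bits]


def rmsprop_kwargs(**overrides):
    """Merge user overrides into the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(RMSPROP_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    return {**RMSPROP_DEFAULTS, **overrides}


def make_rmsprop(params, bits=8, **overrides):
    """Construct a bitsandbytes RMSprop optimizer (hypothetical helper)."""
    import bitsandbytes as bnb  # imported lazily; needs bitsandbytes and PyTorch

    optimizer_cls = getattr(bnb.optim, rmsprop_class_name(bits))
    return optimizer_cls(params, **rmsprop_kwargs(**overrides))


# Example call, assuming a PyTorch model on hardware that bitsandbytes supports:
#   optimizer = make_rmsprop(model.parameters(), bits=8, lr=1e-3)
```

The generic `RMSprop` class additionally accepts `optim_bits`, so passing `optim_bits=8` to it should be an alternative to naming the 8-bit class directly.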
