torch.optim.lr_scheduler.ExponentialLR
class ExponentialLR extends LRScheduler

new ExponentialLR(optimizer: Optimizer, options: {
/** Multiplicative factor of learning rate decay */
gamma: number;
/** The index of last epoch (default: -1) */
last_epoch?: number;
/** Whether to print a message for each update (default: false) */
verbose?: boolean;
})
Constructor Parameters
optimizer (Optimizer) – Wrapped optimizer
options (object) – Scheduler options:
gamma (number) – Multiplicative factor of learning rate decay
last_epoch (number, optional) – The index of last epoch (default: -1)
verbose (boolean, optional) – Whether to print a message for each update (default: false)
ExponentialLR scheduler: Exponential decay of learning rate every epoch.
ExponentialLR multiplies the learning rate by gamma at every epoch, giving exponential decay: η_t = η_0 * γ^t. The learning rate decreases smoothly every epoch (unlike StepLR's abrupt drops), but the decay rate is fixed and does not adapt to the total training length (unlike CosineAnnealingLR, which is parameterized by T_max).
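The closed form can be checked in a few lines of plain TypeScript (a standalone sketch of the math, not the scheduler API itself):

```typescript
// Closed form of the ExponentialLR schedule: lr at epoch t is lr0 * gamma^t.
function expLr(lr0: number, gamma: number, epoch: number): number {
  return lr0 * Math.pow(gamma, epoch);
}

// With lr0 = 0.1 and gamma = 0.95, the lr shrinks by 5% each epoch:
// epoch 0 -> 0.1, epoch 1 -> 0.095, epoch 2 -> 0.09025, ...
for (let t = 0; t < 3; t++) {
  console.log(t, expLr(0.1, 0.95, t));
}
```

Each call to `scheduler.step()` advances `t` by one, so the value the optimizer sees after t steps matches `expLr(lr0, gamma, t)`.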
Comparison to alternatives:
- StepLR: Decays at fixed intervals (step-wise, abrupt)
- ExponentialLR: Decays every epoch by constant factor (smooth, continuous)
- CosineAnnealingLR: Smooth decay with cosine curve (better empirically)
When to use ExponentialLR:
- Simpler alternative to CosineAnnealingLR when you want smooth decay
- When you know good exponential decay rate for your problem
- Theoretically motivated for certain convex problems
- Generally inferior to CosineAnnealingLR in practice (prefer cosine)
Trade-offs:
- Smooth decay every epoch (not step-wise)
- Decays continuously toward zero with no lower bound, so the learning rate can become vanishingly small in long runs
- Doesn't have the theoretical motivation of cosine annealing
- Less commonly used in modern deep learning (CosineAnnealingLR preferred)
Algorithm: Multiplies learning rate by gamma every epoch:
- η_t = η_0 * γ^t
- Example: η_0 = 0.1, γ = 0.95 → decays by 5% per epoch
- Smooth decay: Unlike StepLR, decays smoothly at every epoch (continuous).
- Exponential vs cosine: CosineAnnealingLR usually better empirically, prefer it.
- Decay rate critical: gamma value strongly affects final lr. Test different values.
- Continuous decay: Gradually approaches zero, no minimum lr control.
- Simple formula: lr *= gamma each epoch, easy to reason about.
- Parameter groups: Works with different learning rates per parameter group.
- Less common: Modern practice favors CosineAnnealingLR or ReduceLROnPlateau.
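To see how critical the decay rate is (per the note above), a quick standalone computation of the final lr after 100 epochs, using only the closed-form math:

```typescript
// How gamma affects the final lr after 100 epochs (lr0 = 0.1).
// Small changes in gamma compound into orders-of-magnitude differences.
const lr0 = 0.1;
const epochs = 100;
for (const gamma of [0.99, 0.95, 0.9]) {
  const finalLr = lr0 * Math.pow(gamma, epochs);
  console.log(`gamma=${gamma}: final lr ~ ${finalLr.toExponential(2)}`);
}
// gamma=0.99 -> ~3.7e-2, gamma=0.95 -> ~5.9e-4, gamma=0.9 -> ~2.7e-6
```

A 0.04 difference in gamma changes the final lr by roughly two orders of magnitude, which is why the value should be tuned per problem.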
Examples
// Standard ExponentialLR: decay by 5% per epoch
const scheduler = new torch.optim.ExponentialLR(optimizer, { gamma: 0.95 });
for (let epoch = 0; epoch < 100; epoch++) {
train();
validate();
scheduler.step();
}

// Different decay rates
const slow_decay = new torch.optim.ExponentialLR(optimizer, { gamma: 0.99 }); // 1% per epoch
const fast_decay = new torch.optim.ExponentialLR(optimizer, { gamma: 0.90 }); // 10% per epoch
// Slower decay (γ closer to 1) → more gradual decrease
// Faster decay (γ closer to 0) → steeper decrease

// Resume from checkpoint
const checkpoint = load_checkpoint('model.pth');
const scheduler = new torch.optim.ExponentialLR(optimizer, {
gamma: 0.95,
last_epoch: checkpoint.epoch - 1 // Resume at correct epoch
});

// Comparison: ExponentialLR vs CosineAnnealingLR
const exp_scheduler = new torch.optim.ExponentialLR(optimizer, { gamma: 0.95 });
const cos_scheduler = new torch.optim.CosineAnnealingLR(optimizer, { T_max: 100 });
// ExponentialLR: simple but no control over final lr, continuous decay
// CosineAnnealingLR: better empirical results, explicit minimum lr control
// For modern DL: CosineAnnealingLR is generally recommended
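The shape difference can be illustrated numerically with the two closed forms (a standalone sketch; the cosine values assume the standard η_min + ½(η_max − η_min)(1 + cos(πt/T_max)) annealing formula with η_min = 0):

```typescript
// Closed forms for both schedules (lr0 = 0.1, T_max = 100, eta_min = 0).
const lr0 = 0.1;
const T = 100;
const expLrAt = (t: number) => lr0 * Math.pow(0.95, t);
const cosLrAt = (t: number) => 0.5 * lr0 * (1 + Math.cos((Math.PI * t) / T));

for (const t of [0, 50, 100]) {
  console.log(t, expLrAt(t).toExponential(2), cosLrAt(t).toExponential(2));
}
// Exponential drops fast early (~7.7e-3 at t=50) and never reaches a floor;
// cosine stays higher mid-training (5.0e-2 at t=50) and lands exactly at eta_min.
```

The exponential curve spends most of training at a tiny lr, while the cosine curve keeps a usable lr through mid-training and guarantees a known final value, which is one reason it tends to perform better empirically.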