torch.optim.lr_scheduler.ReduceLROnPlateau
class ReduceLROnPlateau
new ReduceLROnPlateau(optimizer: Optimizer, options: {
/** One of 'min', 'max' (default: 'min') */
mode?: PlateauMode;
/** Factor by which the learning rate will be reduced (default: 0.1) */
factor?: number;
/** Number of epochs with no improvement after which LR will be reduced (default: 10) */
patience?: number;
/** Threshold for measuring the new optimum (default: 1e-4) */
threshold?: number;
/** One of 'rel', 'abs' (default: 'rel') */
threshold_mode?: 'rel' | 'abs';
/** Number of epochs to wait before resuming normal operation (default: 0) */
cooldown?: number;
/** Lower bound on the learning rate (default: 0) */
min_lr?: number | number[];
/** Minimal decay applied to lr (default: 1e-8) */
eps?: number;
/** Whether to print a message for each update (default: false) */
verbose?: boolean;
} = {})
Constructor Parameters
optimizer (Optimizer) - Wrapped optimizer
options (object, optional) - Scheduler options; see the fields below
optimizer (Optimizer) – The optimizer being scheduled
mode (PlateauMode) – Mode: 'min' or 'max'
factor (number) – Factor by which the learning rate will be reduced
patience (number) – Number of epochs with no improvement after which LR will be reduced
threshold (number) – Threshold for measuring the new optimum
threshold_mode ('rel' | 'abs') – Mode for threshold comparison
cooldown (number) – Number of epochs to wait before resuming normal operation
min_lr (number | number[]) – Lower bound on the learning rate
eps (number) – Minimal decay applied to lr; if the change in lr would be smaller than eps, the update is skipped
verbose (boolean) – Whether to print LR changes
ReduceLROnPlateau scheduler: reduce the learning rate when a monitored metric plateaus.
ReduceLROnPlateau is a metric-based learning rate scheduler. Unlike epoch-based schedulers (StepLR, CosineAnnealingLR), it monitors a metric (e.g., validation loss) and reduces the learning rate when the metric stops improving. This is more adaptive and doesn't require knowing training duration in advance.
Key advantages:
- Adaptive: Responds to actual training progress, not fixed epochs
- No duration knowledge: Works without knowing total training steps
- Metric-aware: Uses validation performance to decide when to decay
- Flexible: Can be combined with any optimizer and other schedules
When to use ReduceLROnPlateau:
- When you don't know good decay epochs in advance
- Want learning rate to adapt to actual training progress
- Metric-based adaptation is acceptable (not for critical timing needs)
- Fine-tuning or transfer learning (where epochs vary)
- Avoiding rigid schedules that may decay at suboptimal times
Trade-offs:
- Requires computing a monitored metric (e.g., validation loss) and passing it to step() manually each epoch
- Less predictable than fixed-epoch schedules
- Can be slower to adapt if patience is large
- Different from epoch-based schedulers in semantics
Algorithm: Monitors metric and reduces learning rate when plateau detected:
- Track best metric value seen so far
- If metric doesn't improve for 'patience' checks, reduce lr by 'factor'
- Reset patience counter after reduction
- Improvement threshold specified by 'threshold' parameter
- Metric-driven: Reduces lr based on actual performance, not fixed epochs.
- Manual passing: Must explicitly call step(metric) with validation metric.
- Best practice: Use validation loss/accuracy, not training metrics.
- Patience critical: Too small → aggressive decay, too large → slow adaptation.
- Cooldown useful: Prevents rapid successive reductions from metric noise.
- min_lr important: Prevents learning rate from becoming zero.
- Comparison: StepLR decays at fixed epochs; ReduceLROnPlateau adapts to the metric.
- Combos: Can be used with warmup (LinearLR) in SequentialLR.
- Not epoch-based: step(metric) not step(), semantically different.
- Popular choice: Standard for transfer learning and fine-tuning.
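The 'rel'/'abs' threshold comparison described in the notes above can be sketched as a small stand-alone function. This is an illustrative helper (`isBetter` is not part of the library API), assuming 'min' mode means "lower is better":

```typescript
type Mode = "min" | "max";

// Decide whether `metric` counts as an improvement over `best`.
// 'rel' requires improving by a fraction of best; 'abs' by a fixed margin.
function isBetter(
  metric: number,
  best: number,
  mode: Mode = "min",
  thresholdMode: "rel" | "abs" = "rel",
  threshold: number = 1e-4,
): boolean {
  if (mode === "min") {
    return thresholdMode === "rel"
      ? metric < best * (1 - threshold) // improve by a fraction of best
      : metric < best - threshold;      // improve by an absolute margin
  }
  return thresholdMode === "rel"
    ? metric > best * (1 + threshold)
    : metric > best + threshold;
}
```

With the default rel threshold of 1e-4, a loss of 0.99995 against a best of 1.0 does not count as an improvement, which is exactly what keeps tiny metric noise from resetting the patience counter.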
Examples
// Reduce lr when validation loss stops improving
const scheduler = new torch.optim.ReduceLROnPlateau(optimizer, {
mode: 'min', // Minimize loss
factor: 0.1, // Multiply lr by 0.1 (divide by 10)
patience: 10, // Wait 10 epochs
threshold: 1e-4
});
for (let epoch = 0; epoch < 100; epoch++) {
train();
const val_loss = validate();
scheduler.step(val_loss); // Pass metric to scheduler
}

// Maximize accuracy (e.g., for classification)
const scheduler = new torch.optim.ReduceLROnPlateau(optimizer, {
mode: 'max', // Maximize accuracy
factor: 0.5, // Reduce by 50%
patience: 5, // Less patient, decay sooner
threshold: 0.0001 // Threshold for improvement
});

// Conservative: large patience, small decay
const scheduler = new torch.optim.ReduceLROnPlateau(optimizer, {
mode: 'min',
factor: 0.5, // Reduce by 50% (not as aggressive)
patience: 20, // Wait 20 epochs before reducing
min_lr: 1e-6 // Don't go below 1e-6
});

// Aggressive: small patience, large decay
const scheduler = new torch.optim.ReduceLROnPlateau(optimizer, {
mode: 'min',
factor: 0.1, // Reduce by 90% (very aggressive)
patience: 3, // Only wait 3 epochs
cooldown: 1 // Wait 1 epoch after reducing before monitoring again
});
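The patience/cooldown loop behind these examples can be sketched as a stand-alone detector. This is an illustrative re-implementation for 'min' mode with a 'rel' threshold, not the library's class; all names here are assumptions:

```typescript
// Tracks the best metric, counts epochs without improvement, multiplies
// lr by `factor` after `patience` misses, honors cooldown and min_lr.
class PlateauDetector {
  best = Infinity;
  badEpochs = 0;
  cooldownCounter = 0;

  constructor(
    public lr: number,
    private factor = 0.1,
    private patience = 10,
    private threshold = 1e-4,
    private cooldown = 0,
    private minLr = 0,
  ) {}

  step(metric: number): void {
    if (metric < this.best * (1 - this.threshold)) {
      // Improvement: record new best, reset the patience counter
      this.best = metric;
      this.badEpochs = 0;
    } else if (this.cooldownCounter > 0) {
      // Just reduced: ignore misses until cooldown expires
      this.cooldownCounter--;
      this.badEpochs = 0;
    } else if (++this.badEpochs > this.patience) {
      // Plateau detected: reduce lr, clamp at min_lr, start cooldown
      this.lr = Math.max(this.lr * this.factor, this.minLr);
      this.cooldownCounter = this.cooldown;
      this.badEpochs = 0;
    }
  }
}

// A metric that stays flat at 1.0 for 12 epochs with patience=10
// triggers exactly one reduction (0.1 → 0.05 with factor=0.5).
const det = new PlateauDetector(0.1, 0.5, 10);
for (let epoch = 0; epoch < 12; epoch++) det.step(1.0);
```

Note how the first flat epoch still updates `best` (anything beats the initial Infinity), so the patience countdown only starts from the second epoch onward.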