torch.optim.lr_scheduler.SequentialLR
class SequentialLR
new SequentialLR(optimizer: Optimizer, options: {
/** List of chained schedulers */
schedulers: LRScheduler[];
/** List of epoch indices when to switch schedulers */
milestones: number[];
/** The index of last epoch (default: -1) */
last_epoch?: number;
})
Constructor Parameters
optimizer (Optimizer) – The optimizer being scheduled
options – Scheduler options:
  schedulers (LRScheduler[]) – List of chained schedulers
  milestones (number[]) – List of epoch indices at which to switch schedulers
  last_epoch (number, optional) – The index of the last epoch (default: -1)
SequentialLR scheduler: Switch between schedulers at specified milestones.
SequentialLR composes multiple schedulers, switching from one to the next at specified epoch milestones. Perfect for combining independent schedule phases (e.g., warmup then cosine annealing, or constant then exponential decay).
Common patterns:
- Warmup + Decay: LinearLR warmup → CosineAnnealingLR main training
- Multi-phase: Different schedules for different training phases
- Constant + Decay: ConstantLR warmup → StepLR decay
Key differences from ChainedScheduler:
- SequentialLR: Switches at milestones (one active scheduler per epoch)
- ChainedScheduler: All schedulers run simultaneously each step
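This difference can be illustrated with a toy simulation in plain TypeScript (not the real torch API; `warmup`, `decay`, and the two combinator functions are hypothetical stand-ins, where each "scheduler" is reduced to a per-epoch multiplicative LR factor):

```typescript
// Toy per-epoch LR factors standing in for real schedulers (illustration only).
type Factor = (epoch: number) => number;

const warmup: Factor = (e) => Math.min(1, (e + 1) / 5); // linear ramp over 5 epochs
const decay: Factor = (e) => Math.pow(0.9, e);          // exponential decay

// SequentialLR-style: exactly one factor is active, switching at the milestone.
function sequentialLr(base: number, epoch: number, milestone: number): number {
  return epoch < milestone
    ? base * warmup(epoch)
    : base * decay(epoch - milestone); // decay restarts at the switch point
}

// ChainedScheduler-style: every factor is applied at every epoch.
function chainedLr(base: number, epoch: number): number {
  return base * warmup(epoch) * decay(epoch);
}
```

At epoch 0, both give base × 0.2 (only warmup matters), but by epoch 5 the sequential version is back at the full base LR while the chained version has already decayed by 0.9⁵.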
Use cases:
- Warmup then main training schedule (most common)
- Multi-stage training with different schedules per stage
- Combining incompatible schedules that can't run simultaneously
Algorithm:
- Each epoch, determine which scheduler is active based on milestones
- Call step() on the active scheduler only
- milestones[i] is the first epoch at which schedulers[i + 1] becomes active
- One scheduler active: Only one scheduler applies per epoch (unlike ChainedScheduler).
- Milestones required: Must specify exactly schedulers.length - 1 milestones.
- Strictly increasing: Milestones must be in increasing order.
- Common use: Warmup + main schedule is the most frequent pattern.
- Resume training: The last_epoch parameter supports resuming from a checkpoint.
- Flexible composition: Mix any scheduler types, no restrictions.
- Recommended: Preferred over ChainedScheduler for most use cases.
- Total epochs: Plan total epochs carefully to align with phase boundaries.
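The milestone rule and the validation constraints above can be sketched in plain TypeScript (hypothetical helpers for illustration, not part of the actual API):

```typescript
// Hypothetical helper: milestones[i] is the first epoch at which
// schedulers[i + 1] becomes active.
function activeSchedulerIndex(epoch: number, milestones: number[]): number {
  let idx = 0;
  for (const m of milestones) {
    if (epoch >= m) idx++; // past this milestone: advance to the next scheduler
    else break;
  }
  return idx;
}

// Hypothetical check mirroring the constructor's expected constraints:
// exactly schedulers.length - 1 milestones, strictly increasing.
function validateMilestones(numSchedulers: number, milestones: number[]): void {
  if (milestones.length !== numSchedulers - 1) {
    throw new Error("expected exactly numSchedulers - 1 milestones");
  }
  for (let i = 1; i < milestones.length; i++) {
    if (milestones[i] <= milestones[i - 1]) {
      throw new Error("milestones must be strictly increasing");
    }
  }
}
```

With milestones [10, 80], epochs 0-9 select scheduler 0, epochs 10-79 select scheduler 1, and epoch 80 onward selects scheduler 2.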
Examples
// Standard: Warmup then cosine annealing
const warmup = new torch.optim.LinearLR(optimizer, { total_iters: 10 });
const cosine = new torch.optim.CosineAnnealingLR(optimizer, { T_max: 90 });
const scheduler = new torch.optim.SequentialLR(optimizer, {
schedulers: [warmup, cosine],
milestones: [10] // Switch to cosine at epoch 10
});
for (let epoch = 0; epoch < 100; epoch++) {
train();
scheduler.step();
}
// Epochs 0-9: LinearLR warmup
// Epochs 10-99: CosineAnnealingLR

// Constant warmup then step decay
const warmup = new torch.optim.ConstantLR(optimizer, { factor: 0.1, total_iters: 5 });
const decay = new torch.optim.StepLR(optimizer, { step_size: 10, gamma: 0.1 });
const scheduler = new torch.optim.SequentialLR(optimizer, {
schedulers: [warmup, decay],
milestones: [5]
});

// Three phases: constant warmup, cosine main, final decay
const phase1 = new torch.optim.ConstantLR(optimizer, { factor: 0.1, total_iters: 10 });
const phase2 = new torch.optim.CosineAnnealingLR(optimizer, { T_max: 70 });
const phase3 = new torch.optim.StepLR(optimizer, { step_size: 5, gamma: 0.5 });
const scheduler = new torch.optim.SequentialLR(optimizer, {
schedulers: [phase1, phase2, phase3],
  milestones: [10, 80] // Switch at epochs 10 and 80
});