Skip to main content
torch.js has not been released yet.
torch.js logotorch.js logotorch.js
PlaygroundContact
Login
Documentation
IntroductionType SafetyTensor ExpressionsTensor IndexingEinsumEinopsAutogradTraining a ModelProfiling & MemoryPyTorch MigrationBest PracticesRuntimesPerformancePyTorch CompatibilityBenchmarksDType Coverage
AnnealStrategyChainedSchedulerConstantLRCosineAnnealingLRCosineAnnealingWarmRestartsCyclicLRExponentialLRget_last_lrget_last_lrget_last_lrget_last_lrget_last_lrget_last_lrget_lrget_lrget_lrLambdaLRLinearLRload_state_dictload_state_dictload_state_dictload_state_dictload_state_dictload_state_dictLRLambdaLRSchedulerLRSchedulerOptionsMultiplicativeLRMultiStepLROneCycleLRPlateauModePolynomialLRprint_lrPrintLrOptionsReduceLROnPlateauScaleFnScaleModeSchedulerStateDictSequentialLRstate_dictstate_dictstate_dictstate_dictstate_dictstate_dictstepstepstepstepstepstepstepStepLRStepOptions
AdadeltaAdadeltaOptionsAdafactorAdafactorOptionsAdagradAdagradOptionsAdamAdamaxAdamaxOptionsAdamOptionsAdamWAdamWOptionsadd_param_groupASGDASGDOptionsget_averaged_paramLBFGSLBFGSOptionsLBFGSStepOptionsload_state_dictLoadStateDictPostHookLoadStateDictPreHookMuonMuonOptionsNAdamNAdamOptionsOptimizerOptimizerStateOptimizerStateDictParamGroupRAdamRAdamOptionsregister_load_state_dict_post_hookregister_load_state_dict_pre_hookregister_state_dict_post_hookregister_state_dict_pre_hookregister_step_post_hookregister_step_pre_hookRemovableHandleRMSpropRMSpropOptionsRpropRpropOptionsSGDSGDOptionsSparseAdamSparseAdamOptionsstate_dictstate_dict_asyncStateDictPostHookStateDictPreHookstepstepStepPostHookStepPreHookstepWithClosurezero_grad
absacosacoshAdaptivePool1dShapeAdaptivePool2dShapeaddaddbmmAddbmmOptionsaddcdivAddcdivOptionsaddcmulAddcmulOptionsaddmmAddmmOptionsaddmvAddmvOptionsaddrAddrOptionsadjointallallcloseAllcloseOptionsAlphaBetaOptionsamaxaminaminmaxAminmaxOptionsangleanyapplyOutarangeare_deterministic_algorithms_enabledargmaxargminargsortargwhereas_stridedas_tensorasinasinhAssertNoShapeErrorAssertNotErrorAsStridedOptionsAtat_error_index_out_of_boundsatanatan2atanhatleast_1datleast_2datleast_3dAtShapeautocast_decrement_nestingautocast_increment_nestingautograd_gradient_mismatch_errorautograd_not_registered_errorAutogradConfigAutogradDeviceAutogradDTypeAutogradEntryAutogradHandleAutogradHandleImplAxesRecordBackwardFnbaddbmmBaddbmmOptionsbartlett_windowBaseKernelConfigbatch_dimensions_do_not_match_errorbernoulliBernoulliOptionsBinaryBackwardFnBinaryBroadcastResultBinaryDTypeBinaryKernelConfigCPUBinaryKernelCPUBinaryOpConfigBinaryOpNamesBinaryOpSchemaBinaryOptionsbincountBincountOptionsbitwise_andbitwise_left_shiftbitwise_notbitwise_orbitwise_right_shiftbitwise_xorblackman_windowblock_diagbmmBooleanDTypeRulebroadcast_error_incompatible_dimensionsbroadcast_shapesbroadcast_tensorsbroadcast_toBroadcastShapeBroadcastShapeRulebroadcastShapesbucketizeBucketizeOptionsBufferUsagebuildEinopsErrorbuildErrorMessagecanBroadcastTocartesian_prodcatCatOptionsCatShapeCauchyOptionscdistCdistOptionsceilceluCeluFunctionalOptionschain_matmulCheckShapeErrorCholeskyShapechunkchunk_error_dim_out_of_rangeChunkOptionsclampClampOptionsclear_autocast_cacheclearEinopsCacheclearEinsumCacheclonecolumn_stackcombinationsCombinationsOptionscompiled_with_cxx11_abicomplexconjconj_physicalcontiguousConv1dShapeConv2dShapeConv3dShapeConvTranspose2dShapecopysigncorrcoefcoscoshcount_nonzeroCountNonzeroOptionscovcoverage_reportcoverageReportCoverageReportCovOptionsCPUForwardFnCPUKernelConfigCPUKernelEntryCPUOnlyResultCPUTensorDatacreateCumExtremeResultcreateTorchCreationOpSchemaCumExtremeResultcummaxcummincumprodCumShapecumsumcumulative_trapezoidCumulativeOptionsCumulativeOptionsWithDimdeg2raddetachDeterministicOptionsDetShapeDevicedevice_error_requiresDeviceBufferDeviceCapabilitiesDeviceCheckedResultDeviceConfigDeviceContextDeviceEntryDeviceHandleDeviceInputDeviceOptionsDeviceRegistryDeviceTypediagdiag_embedDiagEmbedOptionsdiagflatDiagflatOptionsDiagFlatOptionsdiagonal_scatterDiagonalOptionsDiagonalScatterOptionsDiagOptionsDiagShapediffDiffOptionsdigammadimension_error_out_of_rangeDispatchConfigdistDistOptionsdivdotDotShapeRuleDoubleDoubleDimdropoutDropoutFunctionalOptionsdsplitdstackDTypedtype_already_registered_errordtype_components_mismatch_errordtype_not_found_errorDTypeComponentsDTypeConfigDTypeCoverageReportDTypeDisplayConfigDTypeEntryDTypeHandleDTypeHandleImplDTypeInfoDTypeRegistryDTypeRuleDTypeSerializationConfigDynamicShapeEigShapeeinops_error_ambiguous_decompositioneinops_error_anonymous_in_outputeinops_error_dimension_mismatcheinops_error_invalid_patterneinops_error_reduce_undefined_outputeinops_error_repeat_missing_sizeeinops_error_undefined_axiseinsumeinsum_error_dimension_mismatcheinsum_error_index_out_of_rangeeinsum_error_invalid_equationeinsum_error_invalid_sublist_elementeinsum_error_operand_count_mismatcheinsum_error_subscript_rank_mismatcheinsum_error_unknown_output_indexEinsumOptionsEinsumOutputShapeEllipsiseluelu_EluFunctionalOptionsembedding_bag_error_requires_2d_inputemptyempty_cacheempty_likeeqequalerferfcerfinvexpexp2expandexpand_asexpand_error_incompatibleExpandShapeexpm1ExponentialOptionseyeEyeOptionsfftFFTOptionsfindKernelWithPredicatefindSimilarPatternsflattenFlattenOptionsFlattenShapeflipflip_error_dim_out_of_rangefliplrFlipShapeflipudfloat_powerFloatDTypeRulefloorfloor_dividefmaxfminfmodformatEquationErrorformatShapefracfrexpfrombufferfullfull_likefunction_already_registered_errorFunctionConfigFunctionEntryFunctionHandlegathergather_error_dim_out_of_rangeGatherShapegcdgegeluGeometricOptionsget_autocast_cpu_dtypeget_autocast_gpu_dtypeget_autocast_ipu_dtypeget_autocast_xla_dtypeget_default_deviceget_default_dtypeget_deterministic_debug_modeget_device_configget_device_contextget_device_moduleget_dtype_infoget_file_pathget_float32_matmul_precisionget_num_interop_threadsget_num_threadsget_op_infoget_printoptionsget_real_dtypeget_rng_stategetAutogradgetDTypegetEinopsCacheSizegetEinsumCacheSizegetFunctiongetKernelgetMethodgetOpInfoGetOpKindGetOpSchemagetScalarKernelgluGluFunctionalOptionsGradContextGradFnGradientsForgtHalfHalfDimhamming_windowhann_windowhardshrinkhardsigmoidhardswishhardtanhhardtanh_HardtanhFunctionalOptionshas_autogradhas_devicehas_dtypehas_kernelhasAutogradhasDTypehasFunctionhasKernelhasMethodhasScalarKernelHasShapeErrorheavisidehistcHistcOptionshistogramHistogramOptionsHistogramResulthsplithstackhypoti0IdentityShapeifftimagindex_addindex_copyindex_fillindex_putindex_reduceindex_selectindex_select_error_dim_out_of_rangeIndexPutOptionsIndexSelectShapeIndexSpecIndicesOptionsIndicesSpecinitialize_deviceInputsForInsertDiminvalid_config_errorinverseInverseShapeirfftis_anomaly_check_nan_enabledis_anomaly_enabledis_autocast_cache_enabledis_autocast_cpu_enabledis_autocast_ipu_enabledis_autocast_xla_enabledis_complexis_complex_dtypeis_cpu_only_modeis_deterministic_algorithms_warn_only_enabledis_floating_pointis_floating_point_dtypeis_inference_mode_enabledis_nonzerois_tensoris_warn_always_enabledis_webgpu_availableIs2DIsAtLeast1DIsBinaryOpIsBinaryOpNameiscloseIscloseOptionsisfiniteisinisinfisnanisneginfisposinfisrealIsReductionOpIsReductionOpNameIsRegistryErrorIsShapeErroristftISTFTOptionsIsUnaryOpIsUnaryOpNameitem_error_not_scalarItemResultkaiser_windowKaiserWindowOptionskernel_not_registered_errorkernel_signature_mismatch_errorKernelConfigKernelConfigWebGPUKernelEntryKernelHandleKernelInfoKernelPredicateKernelRegistryKernelWebGPUkronkthvalueKthvalueOptionslcmldexpleleaky_reluleaky_relu_LeakyReluFunctionalOptionslerplevenshteinDistancelgammalinalg_error_not_square_matrixlinalg_error_requires_2dlinalg_error_requires_at_least_2dlinearlinspacelist_custom_deviceslist_custom_dtypeslist_deviceslist_dtypeslist_functionslist_kernelslist_methodslist_opslistCustomDTypeslistDTypeslistFunctionslistKernelsListKernelsOptionslistMethodslistOpsListOpsOptionsloglog_softmaxlog10log1plog2logaddexplogaddexp2logcumsumexplogical_andlogical_notlogical_orlogical_xorLogitOptionsLogNormalOptionsLogOptionslogsigmoidlogspacelogsumexpLogsumexpOptionsltLUShapeLuSolveOptionsmasked_fillmasked_selectmasked_select_asyncMaskSpecmatmulmatmul_error_inner_dimensions_do_not_matchMatmul2DShapeMatmulShapeMatmulShapeRuleMatrixTransposeShapemaxmaximummeanmedianmemory_statsmemory_summarymeshgridmethod_already_registered_errormethod_dtype_not_supported_errorMethodConfigMethodEntryMethodHandleminminimummishmmMMShapeRulemodemovedimmsortmulmultinomialmultinomial_asyncMultinomialAsyncOptionsMultinomialOptionsMultiplyBymvMVShapeRulenan_to_numnanmeannanmediannanquantileNanReductionOptionsnansumNanToNumOptionsnarrownarrow_copynarrow_error_length_exceeds_boundsnarrow_error_start_out_of_boundsNarrowShapeneneedsBroadcastnegNegativeDimnextafternonzeroNonzeroOptionsnormnormalNormalOptionsNormOptionsnumelonesones_likeop_kind_mismatch_errorop_not_found_errorOpCoverageEntryOpInfoOpKindOpNameOpSchemaOpSchemasouterOuterShapepackPackShapepermutepermute_error_dimension_count_mismatchPermuteShapepoissonpolarPool1dShapePool2dShapePool3dShapepositivepowpreluPrintOptionsprodprofiler_allow_cudagraph_cupti_lazy_reinit_cuda12promote_typesPromoteDTypeRulePutOptionsquantileQuantileOptionsrad2degrandrand_likerandintrandint_likeRandintLikeOptionsRandintOptionsrandnrandn_likeRandomLikeOptionsRandomOptionsrandpermRangeSpecRankravelrealrearrangeRearrangeOptionsRearrangeShapereciprocalreduceReduceOperationReduceOptionsReduceShapeReductionKernelConfigCPUReductionKernelCPUReductionOpNamesReductionOpSchemaReductionOptionsReductionShapeRuleregister_backwardregister_deviceregister_dtyperegister_forwardregister_functionregister_methodregister_scalar_forwardregisterAutogradRegisterBackwardOptionsregisterBinaryOpregisterDTypeRegisterDTypeOptionsRegisteredDTyperegisterFunctionRegisterFunctionOptionsregisterKernelRegisterKernelOptionsregisterMethodRegisterMethodOptionsregisterScalarKernelregisterUnaryOpregistration_failed_errorrelurelu_relu6ReluFunctionalOptionsremainderRemoveDimrepeatrepeat_interleaveRepeatInterleaveOptionsRepeatOptionsRepeatShapeReplaceDimrequireWebGPUreset_peak_memory_statsreshapeReshapeShaperesult_typerfftrollRollOptionsrot90Rot90Optionsroundrrelurrelu_RreluFunctionalOptionsrsqrtSafeExpandShapeSameDTypeRuleSameShapeRuleSaveForBackwardScalarCPUForwardFnScalarCPUKernelConfigScalarKernelEntryScalarKernelHandleScalarWebGPUKernelConfigScaleDimscatterscatter_addscatter_add_scatter_error_dim_out_of_rangescatter_reducescatter_reduce_ScatterReduceOptionsScatterShapesearchsortedSearchSortedOptionsselectselect_error_index_out_of_boundsselect_scatterSelectShapeseluset_default_deviceset_default_tensor_typeset_deterministic_debug_modeset_float32_matmul_precisionset_printoptionsset_warn_alwaysSetupContextFnShapeShapeCheckedResultShapedTensorShapeErrorMessageShapeOpSchemaShapeRulesigmoidsignsignbitsilusinsincsinhSizeOptionsslice_error_out_of_boundsslice_scatterSliceOptionsSliceScatterOptionsSliceShapeSliceSpecsoftmaxsoftmax_error_dim_out_of_rangeSoftmaxShapesoftminSoftminFunctionalOptionssoftplusSoftplusFunctionalOptionssoftshrinksoftsignsortSortOptionssplitsplit_error_dim_out_of_rangeSplitOptionssqrtsquaresqueezeSqueezeOptionsSqueezeShapestackStackOptionsStackShapestdstd_meanStdVarMeanOptionsStdVarOptionsstftSTFTOptionsStrideOptionssubSublistSublistElementSubscriptIndexsumSVDShapeswapaxessym_floatsym_intsym_notttaketake_along_dimTakeAlongDimOptionstantanhtanhshrinktensortensor_splitTensorCreatorTensorDatatensordotTensordotOptionsTensorLikeTensorMetaTensorOptionsTensorStoragethresholdthreshold_tileTileShapeToOptionstopkTopkOptionsTorchtraceTraceShapetransposetranspose_dims_error_out_of_rangetranspose_error_requires_2d_tensorTransposeDimsShapeTransposeDimsShapeCheckedTransposeShapetrapezoidTrapezoidOptionsTriangularOptionstriltril_indicesTriOptionsTripletriutriu_indicestrue_dividetruncTupleOfLengthTypedArrayTypedArrayForTypedStorageTypeOptionsUnaryBackwardFnUnaryDTypeUnaryKernelConfigCPUUnaryKernelCPUUnaryOpConfigUnaryOpFnUnaryOpNamesUnaryOpParamsUnaryOpSchemaUnaryOptionsunbindunbind_error_dim_out_of_rangeUnbindOptionsunflattenUniformOptionsuniqueunique_consecutiveUniqueConsecutiveOptionsUniqueOptionsunpackUnpackShapeunravel_indexunregister_deviceunsqueezeUnsqueezeOptionsUnsqueezeShapeuse_deterministic_algorithmsValidateBatchedSquareMatrixValidateChunkDimValidatedEinsumShapevalidateDeviceValidateDeviceValidatedRearrangeShapeValidatedReduceShapeValidatedRepeatShapevalidateDTypeValidateEinsumValidateOperandCountValidateRanksValidateScalarValidateSplitDimValidateSquareMatrixValidateUnbindDimValueOptionsvar_var_meanvdotviewview_as_complexview_as_realvmapvsplitvstackWebGPUKernelConfigWebGPUOnlyResultWebGPUTensorDatawhereWindowOptionsxlogyzeroszeros_like
torch.js· 2026
LegalTerms of UsePrivacy Policy
/
/
  1. docs
  2. torch.js
  3. torch
  4. optim
  5. lr_scheduler
  6. MultiStepLR

torch.optim.lr_scheduler.MultiStepLR

class MultiStepLR extends LRScheduler
new MultiStepLR(optimizer: Optimizer, options: { /** List of epoch indices. Must be increasing. */ milestones: number[]; /** Multiplicative factor of learning rate decay (default: 0.1) */ gamma?: number; /** The index of last epoch (default: -1) */ last_epoch?: number; /** Whether to print a message for each update (default: false) */ verbose?: boolean; })

Constructor Parameters

optimizerOptimizer
Wrapped optimizer
options{ /** List of epoch indices. Must be increasing. */ milestones: number[]; /** Multiplicative factor of learning rate decay (default: 0.1) */ gamma?: number; /** The index of last epoch (default: -1) */ last_epoch?: number; /** Whether to print a message for each update (default: false) */ verbose?: boolean; }
Scheduler options
milestones(Set<number>)
– List of epoch indices to decay LR
gamma(number)
– Multiplicative factor of learning rate decay

MultiStepLR scheduler: Decay learning rate by gamma at specified milestone epochs.

MultiStepLR is a more flexible version of StepLR. Instead of decaying every N epochs, you specify exactly which epochs to decay at. This allows for fine-grained control over the learning rate schedule, perfect for scenarios where you know exactly when to reduce the learning rate.

Key differences from StepLR:

  • StepLR: Regular intervals (every step_size epochs)
  • MultiStepLR: Custom intervals (at specific milestone epochs)
  • MultiStepLR provides more control at cost of manual specification

When to use MultiStepLR:

  • When decay epochs are not evenly spaced
  • Fine-grained control over learning rate decay schedule
  • Multi-phase training with different decay schedules per phase
  • When you know good decay points from prior experiments
  • Classic vision models: often trained with fixed milestones like [30, 60, 90]

Trade-offs:

  • More manual work to specify milestones vs StepLR (automatic intervals)
  • Rigid schedule (like StepLR) - doesn't adapt to training progress
  • Cumulative decay: decays stack (η decreases by gamma each time)

Algorithm: Multiplies learning rate by gamma at each milestone epoch:

  • If epoch is in milestones: η_t *= gamma
  • Example: milestones=[30, 80], gamma=0.1
    • Epochs 0-29: lr = η_0
    • Epoch 30: lr *= 0.1 = 0.1 * η_0
    • Epochs 31-79: lr = 0.1 * η_0
    • Epoch 80: lr *= 0.1 = 0.01 * η_0
    • Epochs 81+: lr = 0.01 * η_0
ηt={ηt−1⋅γif t∈milestonesηt−1otherwiseηt=η0⋅γk where k=number of milestones reached by epoch t\begin{aligned} \eta_t = \begin{cases} \eta_{t-1} \cdot \gamma & \text{if } t \in \text{milestones} \\ \eta_{t-1} & \text{otherwise} \end{cases} \\ \eta_t = \eta_0 \cdot \gamma^k \text{ where } k = \text{number of milestones reached by epoch } t \end{aligned}ηt​={ηt−1​⋅γηt−1​​if t∈milestonesotherwise​ηt​=η0​⋅γk where k=number of milestones reached by epoch t​
  • Explicit milestones: Clear control over exactly when decay happens.
  • Order matters: Milestones must be in increasing order.
  • Cumulative decay: Multiple milestones compound (e.g., [30, 60] → 0.1 × 0.1 = 0.01 at epoch 60).
  • Common pattern: [30, 60, 90] or [30, 80] work well for many vision tasks.
  • Comparison: MultiStepLR is StepLR with custom intervals instead of regular spacing.
  • Rigid schedule: Doesn't adapt to actual training progress.
  • Validation agnostic: Doesn't use validation metrics (see ReduceOnPlateau).
  • Popular in CV: Standard choice for computer vision before cosine annealing became common.

Examples

// Decay at specific epochs: 30, 80
const scheduler = new torch.optim.MultiStepLR(optimizer, {
  milestones: [30, 80],
  gamma: 0.1
});
// Decay schedule: 0.1 at epoch 30, 0.01 at epoch 80
// Different decay pattern: decay more frequently early, less later
const scheduler = new torch.optim.MultiStepLR(optimizer, {
  milestones: [10, 20, 40, 80],
  gamma: 0.5  // Less aggressive decay
});
// Decreases: 0.5, 0.25, 0.125, 0.0625 at each milestone
// Resume from checkpoint with correct milestones
const checkpoint = load_checkpoint('model.pth');
const scheduler = new torch.optim.MultiStepLR(optimizer, {
  milestones: [30, 60, 90],
  gamma: 0.1,
  last_epoch: checkpoint.epoch - 1
});
// Multi-phase training with different milestones
// Phase 1: Quick training with frequent decay
let scheduler = new torch.optim.MultiStepLR(optimizer, {
  milestones: [5, 10, 15],
  gamma: 0.5,
  last_epoch: -1
});
for (let epoch = 0; epoch < 20; epoch++) { train(); scheduler.step(); }

// Phase 2: Fine-tuning with sparser decay
scheduler = new torch.optim.MultiStepLR(optimizer, {
  milestones: [35, 55],  // Actual epochs would be 20+35=55, 20+55=75
  gamma: 0.5,
  last_epoch: 19  // Continue from where we left off
});
for (let epoch = 20; epoch < 60; epoch++) { train(); scheduler.step(); }

See Also

  • PyTorch torch.optim.lr_scheduler.MultiStepLR
Previous
MultiplicativeLR
Next
OneCycleLR