torch.optim.lr_scheduler.OneCycleLR

class OneCycleLR
new OneCycleLR(optimizer: Optimizer, options: {
  /** Maximum learning rate (or array for each param group) */
  max_lr: number | number[];
  /** Total number of training steps */
  total_steps?: number;
  /** Total number of epochs (alternative to total_steps) */
  epochs?: number;
  /** Number of steps per epoch (required if using epochs) */
  steps_per_epoch?: number;
  /** Percentage of cycle spent increasing LR (default: 0.3) */
  pct_start?: number;
  /** Anneal strategy: 'cos' or 'linear' (default: 'cos') */
  anneal_strategy?: AnnealStrategy;
  /** Whether to cycle momentum inversely to LR (default: true) */
  cycle_momentum?: boolean;
  /** Initial learning rate = max_lr / div_factor (default: 25) */
  div_factor?: number;
  /** Min LR = initial_lr / final_div_factor (default: 1e4) */
  final_div_factor?: number;
  /** Run 3-phase schedule (default: false) */
  three_phase?: boolean;
  /** The index of last epoch (default: -1) */
  last_epoch?: number;
  /** Whether to print a message for each update (default: false) */
  verbose?: boolean;
})

Constructor Parameters

optimizer (Optimizer)
Wrapped optimizer
options (object)
Scheduler options (the option fields are listed in the signature above)
optimizer (Optimizer) – The optimizer being scheduled
max_lrs (number[]) – Maximum learning rates
total_steps (number) – Total number of training steps
pct_start (number) – Percentage of cycle spent increasing LR
anneal_strategy (AnnealStrategy) – Anneal strategy
div_factor (number) – Determines initial LR as max_lr / div_factor
final_div_factor (number) – Determines minimum LR as initial_lr / final_div_factor
three_phase (boolean) – Run 3-phase schedule if true
last_epoch (number) – Last epoch

OneCycleLR scheduler: the 1cycle learning rate policy for super-convergence.

OneCycleLR implements the "1cycle" learning rate policy (Smith & Topin, 2019). Instead of monotonically decaying the learning rate, it cycles the lr from a low value up to a maximum and then back down to a very low value. This single cycle over the full training duration often produces better accuracy and faster convergence than traditional schedules.

Key insight:

  • Start with a low lr and ramp up to max_lr (first ~30% of training by default)
  • Then ramp down to a very low lr (remaining ~70% of training)
  • Forcing the optimizer through different lr regimes helps it escape local minima
  • Often achieves better final accuracy with shorter training time (see the sketch after this list)
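
A minimal sketch of the resulting lr trajectory, assuming an existing optimizer (e.g. torch.optim.SGD) and that get_last_lr() returns the current lr per param group, as in PyTorch:

const totalSteps = 100;
const scheduler = new torch.optim.OneCycleLR(optimizer, {
  max_lr: 0.1,
  total_steps: totalSteps
});

const trace = [];
for (let step = 0; step < totalSteps; step++) {
  trace.push(scheduler.get_last_lr()[0]);  // lr used for this batch
  // ...forward / backward / optimizer.step() would go here...
  scheduler.step();                        // advance the schedule
}
// With the defaults, the lr ramps from max_lr / 25 = 0.004 up to 0.1 over the
// first ~30% of steps, then anneals back down toward 0.004 / 1e4 = 4e-7.
console.log(trace[0], Math.max(...trace), trace[trace.length - 1]);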

When to use OneCycleLR:

  • When you know total training steps/epochs in advance
  • Want faster convergence with potential accuracy gains
  • Fixed training duration (like fastai implementations)
  • Research or competitive scenarios (Kaggle, etc)
  • Models that benefit from "aggressive" learning rate schedules

Trade-offs:

  • Requires knowing total_steps in advance, like CosineAnnealingLR (see the sketch after this list)
  • More complex than simple fixed-schedule approaches
  • Momentum scheduling may affect reproducibility
  • Requires tuning max_lr, pct_start, and anneal_strategy
  • Step-based (per-batch) rather than epoch-based, so its step() semantics differ from epoch-level schedulers
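
In practice total_steps is simply the number of optimizer updates over the whole run: epochs × batches per epoch. A quick sketch of computing it, assuming a hypothetical dataset size and batch size:

// One scheduler.step() per batch, so count batches over the whole run.
const epochs = 10;
const batchSize = 32;
const datasetSize = 3200;                                   // hypothetical
const stepsPerEpoch = Math.ceil(datasetSize / batchSize);   // 100
const totalSteps = epochs * stepsPerEpoch;                  // 1000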

Algorithm: Two-phase learning rate cycling:

  1. Ascent phase (first pct_start fraction of steps): Linear or cosine increase from initial_lr (max_lr / div_factor) to max_lr
  2. Descent phase (remaining steps): Linear or cosine decrease from max_lr to min_lr (initial_lr / final_div_factor)

Momentum also cycles (inverse of lr) for better optimization dynamics.

$$
\begin{aligned}
p &= \frac{\text{step}}{\text{total\_steps}} \\[4pt]
\text{Ascent } (p \le p_{\text{start}}):\quad
\eta &= \eta_{\text{initial}} + (\eta_{\max} - \eta_{\text{initial}}) \cdot
\begin{cases}
\dfrac{1 - \cos(\pi\, p / p_{\text{start}})}{2} & \text{cosine} \\[6pt]
\dfrac{p}{p_{\text{start}}} & \text{linear}
\end{cases} \\[4pt]
\text{Descent } (p > p_{\text{start}}):\quad
\eta &= \eta_{\max} + (\eta_{\min} - \eta_{\max}) \cdot
\begin{cases}
\dfrac{1 - \cos\!\big(\pi\,(p - p_{\text{start}}) / (1 - p_{\text{start}})\big)}{2} & \text{cosine} \\[6pt]
\dfrac{p - p_{\text{start}}}{1 - p_{\text{start}}} & \text{linear}
\end{cases} \\[4pt]
\eta_{\text{initial}} &= \frac{\eta_{\max}}{\text{div\_factor}}, \qquad
\eta_{\min} = \frac{\eta_{\text{initial}}}{\text{final\_div\_factor}}
\end{aligned}
$$
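
A minimal standalone sketch of this formula (cosine strategy, two-phase, default div_factor = 25 and final_div_factor = 1e4); it mirrors the math above rather than the library's internal implementation:

function oneCycleLr(step, totalSteps, maxLr, pctStart = 0.3,
                    divFactor = 25, finalDivFactor = 1e4) {
  const initialLr = maxLr / divFactor;        // lr at step 0
  const minLr = initialLr / finalDivFactor;   // lr at the final step
  const p = step / totalSteps;
  const anneal = (x) => (1 - Math.cos(Math.PI * x)) / 2;   // goes 0 -> 1 smoothly
  if (p <= pctStart) {
    return initialLr + (maxLr - initialLr) * anneal(p / pctStart);           // ascent
  }
  return maxLr + (minLr - maxLr) * anneal((p - pctStart) / (1 - pctStart));  // descent
}

// Worked values for max_lr = 0.1: initial_lr = 0.1 / 25 = 0.004,
// min_lr = 0.004 / 1e4 = 4e-7.
console.log(oneCycleLr(0, 1000, 0.1));     // 0.004
console.log(oneCycleLr(300, 1000, 0.1));   // 0.1  (peak at pct_start)
console.log(oneCycleLr(1000, 1000, 0.1));  // 4e-7
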
  • Step-based: Call step() once per batch, not once per epoch (unlike CosineAnnealingLR).
  • Total steps critical: Must specify exact total_steps (epochs × batches_per_epoch).
  • Faster convergence: Often achieves better results with shorter training time.
  • Empirically strong: Proposed in "Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates" (Smith & Topin, 2019).
  • Momentum cycling: Default cycles momentum (inverse of lr) for better dynamics.
  • pct_start controls shape: Higher pct_start = longer ascent phase and shorter descent phase.
  • Anneal strategy: 'cos' is usually slightly better than 'linear', but the difference is small.
  • div_factor controls start: initial_lr = max_lr / div_factor; adjust it to control warmup speed.
  • final_div_factor controls end: min_lr = initial_lr / final_div_factor (default: 1e4).
  • Not for early stopping: Designed for fixed-duration training (can't easily early stop).

Examples

// Basic OneCycleLR for 1000 steps, max lr 0.1
const scheduler = new torch.optim.OneCycleLR(optimizer, {
  max_lr: 0.1,
  total_steps: 1000  // e.g., 10 epochs × 100 batches
});

for (const batch of dataloader) {
  // train step
  scheduler.step();
}
// Cosine annealing with custom pct_start
const scheduler = new torch.optim.OneCycleLR(optimizer, {
  max_lr: 0.1,
  total_steps: 1000,
  pct_start: 0.4,          // Spend 40% ascending, 60% descending
  anneal_strategy: 'cos'   // Cosine annealing
});
// Linear annealing with longer ascent phase
const scheduler = new torch.optim.OneCycleLR(optimizer, {
  max_lr: 0.1,
  total_steps: 5000,
  pct_start: 0.5,              // Spend half the time ascending
  anneal_strategy: 'linear',  // Linear annealing
  div_factor: 10              // Start at 0.01 (0.1/10)
});
// Without momentum cycling (just lr cycling)
const scheduler = new torch.optim.OneCycleLR(optimizer, {
  max_lr: 0.1,
  total_steps: 1000,
  cycle_momentum: false  // Only cycle lr, not momentum
});
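
The duration can also be given as epochs plus steps_per_epoch instead of total_steps; a brief sketch using only the documented constructor options:

// Equivalent duration specified as epochs × steps_per_epoch
const scheduler = new torch.optim.OneCycleLR(optimizer, {
  max_lr: 0.1,
  epochs: 10,
  steps_per_epoch: 100   // total_steps = 10 × 100 = 1000
});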

See Also

  • PyTorch torch.optim.lr_scheduler.OneCycleLR