torch.js has not been released yet.
torch.js· 2026

torch.optim.lr_scheduler.LinearLR

class LinearLR extends LRScheduler
new LinearLR(
  optimizer: Optimizer,
  options: {
    /** The number to multiply LR at the start (default: 1/3) */
    start_factor?: number;
    /** The number to multiply LR at the end (default: 1.0) */
    end_factor?: number;
    /** Number of iterations for linear change (default: 5) */
    total_iters?: number;
    /** The index of last epoch (default: -1) */
    last_epoch?: number;
    /** Whether to print a message for each update (default: false) */
    verbose?: boolean;
  } = {}
)

Constructor Parameters

optimizer (Optimizer)
– Wrapped optimizer
options (object, optional)
– Scheduler options
start_factor (number)
– Starting multiplicative factor (default: 1/3)
end_factor (number)
– Ending multiplicative factor (default: 1.0)
total_iters (number)
– Number of iterations for linear change (default: 5)
last_epoch (number)
– The index of last epoch (default: -1)
verbose (boolean)
– Whether to print a message for each update (default: false)

LinearLR scheduler: Linear interpolation of learning rate multiplier.

LinearLR linearly interpolates the learning rate multiplier from start_factor to end_factor over total_iters iterations. Its most common use is as a warmup phase at the beginning of training, ramping the learning rate up from a small value (e.g., 1/3 of base_lr) to its full value.

Primary use cases:

  • Warmup phase: Ramp up from small lr to full lr over first N epochs
  • Linear decay: Linearly decay from full lr to zero
  • Chaining: Often combined with CosineAnnealingLR (warmup then cosine)

Why use warmup?

  • Prevents optimization instability at the start of training
  • Gives the model time to move away from its random initialization before full-size updates are applied
  • Often improves final convergence and generalization
  • Especially important for transformers (standard practice)

When to use LinearLR:

  • First phase of training (before main schedule)
  • Transformer models (standard: warmup for ~10% of training)
  • When main schedule is CosineAnnealingLR or StepLR
  • Learning rate scheduling for supervised learning

Trade-offs:

  • Simple linear interpolation (no curve fitting)
  • Typically used as one phase in composite schedule
  • Used alone (without chaining), linear decay is less common than step or cosine decay
  • As a warmup, works best when total_iters is small relative to the total number of training epochs

Algorithm: Linearly interpolates multiplier from start_factor to end_factor:

  • factor_t = start_factor + (end_factor - start_factor) * (t / total_iters)
  • η_t = base_lr * factor_t
  • After total_iters, learning rate remains at end_factor * base_lr
$$
\begin{aligned}
f_t &= f_{\text{start}} + (f_{\text{end}} - f_{\text{start}}) \cdot \frac{t}{T} \\
\eta_t &= \eta_{\text{base}} \cdot f_t \\
\text{After } T \text{ iterations: } \eta_t &= \eta_{\text{base}} \cdot f_{\text{end}} \text{ (constant)}
\end{aligned}
$$
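
The interpolation rule can be checked by hand with a few lines of plain TypeScript. The sketch below is not the torch.js implementation: linearFactor and linearLr are hypothetical helper names, the defaults are copied from the constructor signature, and the clamp on t is an assumption that encodes the "constant after total_iters" rule. It assumes total_iters is greater than zero.

```typescript
// Sketch only: mirrors the closed-form formula, not torch.js internals.
interface LinearFactorOptions {
  start_factor?: number; // default 1/3, as in the constructor signature
  end_factor?: number;   // default 1.0
  total_iters?: number;  // default 5 (assumed > 0 here)
}

// Multiplier applied to base_lr after `t` calls to step() (hypothetical helper).
function linearFactor(t: number, options: LinearFactorOptions = {}): number {
  const { start_factor = 1 / 3, end_factor = 1.0, total_iters = 5 } = options;
  // Clamp progress to [0, 1] so the factor stays at end_factor once t >= total_iters.
  const progress = Math.min(Math.max(t, 0), total_iters) / total_iters;
  return start_factor + (end_factor - start_factor) * progress;
}

// Learning rate for one parameter group with the given base_lr (hypothetical helper).
function linearLr(baseLr: number, t: number, options: LinearFactorOptions = {}): number {
  return baseLr * linearFactor(t, options);
}

// Schedule for base_lr = 0.1 with the default options, steps 0 through 6.
const schedule: number[] = [];
for (let t = 0; t <= 6; t++) {
  schedule.push(linearLr(0.1, t));
}
console.log(schedule.map((lr) => lr.toFixed(4)).join(", "));
// prints "0.0333, 0.0467, 0.0600, 0.0733, 0.0867, 0.1000, 0.1000"
```

The last two values are equal because the factor is held at end_factor once t reaches total_iters.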
  • Warmup standard: LinearLR for warmup is standard in modern transformer training.
  • Warmup benefits: Stabilizes early training, often improves final accuracy by 1-2%.
  • Typical warmup: 10% of total training epochs works well empirically.
  • Chaining: Usually chained with another scheduler (CosineAnnealingLR) for full schedule.
  • Alone uncommon: Pure linear decay is less common than step or cosine decay.
  • Warmup amount: start_factor=0.1 for aggressive warmup, 1/3 for moderate warmup.
  • Parameter groups: Works with different learning rates per parameter group.
  • Composable: Designed to be first phase in SequentialLR or ChainedScheduler.
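
To make the chaining and parameter-group notes concrete, here is a minimal closed-form sketch of a linear warmup followed by cosine annealing, applied to two base learning rates. It is a stand-in for the shape of the combined schedule, not the torch.js SequentialLR implementation: the function and field names are hypothetical, etaMin = 0 is an assumed floor, and the cosine phase uses the standard annealing formula with its clock restarted at the milestone. Values at the phase boundary may differ slightly from what an actual scheduler chain produces.

```typescript
// Sketch only: closed-form "LinearLR then CosineAnnealingLR", not torch.js internals.
interface WarmupCosineConfig {
  warmupEpochs: number; // hypothetical name: length of the linear phase (> 0)
  totalEpochs: number;  // hypothetical name: warmup plus cosine epochs (> warmupEpochs)
  startFactor: number;  // multiplier at epoch 0
  etaMin: number;       // assumed floor for the cosine phase
}

function warmupCosineLr(baseLr: number, epoch: number, cfg: WarmupCosineConfig): number {
  const { warmupEpochs, totalEpochs, startFactor, etaMin } = cfg;
  if (epoch < warmupEpochs) {
    // Linear phase: factor moves from startFactor toward 1.0.
    const factor = startFactor + (1.0 - startFactor) * (epoch / warmupEpochs);
    return baseLr * factor;
  }
  // Cosine phase: count epochs from the milestone and stop at tMax.
  const tMax = totalEpochs - warmupEpochs;
  const t = Math.min(epoch - warmupEpochs, tMax);
  return etaMin + 0.5 * (baseLr - etaMin) * (1 + Math.cos((Math.PI * t) / tMax));
}

const cfg: WarmupCosineConfig = { warmupEpochs: 10, totalEpochs: 100, startFactor: 0.1, etaMin: 0 };

// Two parameter groups with different base learning rates follow the same shape.
const baseLrs = [1e-3, 1e-4];
for (const epoch of [0, 5, 10, 55, 100]) {
  const lrs = baseLrs.map((lr) => warmupCosineLr(lr, epoch, cfg));
  console.log(epoch, lrs.map((lr) => lr.toExponential(2)).join(" "));
}
```

Each group's learning rate is its own base rate multiplied by the same epoch-dependent factor, which is why one scheduler can serve groups with different base learning rates.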

Examples

// Warmup: linear increase from 1/3 to 1.0 over 5 epochs
const scheduler = new torch.optim.LinearLR(optimizer, {
  start_factor: 1/3,
  end_factor: 1.0,
  total_iters: 5
});
for (let epoch = 0; epoch < 100; epoch++) {
  train();
  validate();
  scheduler.step();
}
// After epoch 5, learning rate stays at base_lr

// Standard warmup for transformers: 10% of total epochs
const total_epochs = 100;
const warmup_epochs = Math.floor(total_epochs * 0.1);
const warmup = new torch.optim.LinearLR(optimizer, {
  start_factor: 0.1,        // Start at 10% of base_lr
  end_factor: 1.0,           // Reach full base_lr
  total_iters: warmup_epochs // e.g., 10 epochs
});

// Warmup + Cosine annealing (common for transformers)
const warmup = new torch.optim.LinearLR(optimizer, {
  start_factor: 0.1,
  total_iters: 10
});

const cosine = new torch.optim.CosineAnnealingLR(optimizer, {
  T_max: 90  // Remaining 90 epochs
});

const scheduler = new torch.optim.SequentialLR(
  optimizer,
  [warmup, cosine],
  [10]  // Switch to cosine after 10 epochs
);

// Linear decay (opposite of warmup)
const scheduler = new torch.optim.LinearLR(optimizer, {
  start_factor: 1.0,  // Start at full lr
  end_factor: 0.0,    // Decay to zero
  total_iters: 50     // Over 50 epochs
});

See Also

  • PyTorch torch.optim.lr_scheduler.LinearLR