torch.nn.init.kaiming_normal_
function kaiming_normal_(tensor: Tensor, options?: KaimingOptions): Tensor
function kaiming_normal_(tensor: Tensor, a: number, mode: FanMode, nonlinearity: Nonlinearity, options?: KaimingOptions): Tensor

Fill the tensor with Kaiming (He) normal initialization for ReLU-based networks.
Normal-distribution variant of Kaiming initialization. Samples from N(0, std²), where std scales with the tensor's fan and the activation function. Matches the variance of kaiming_uniform_, but draws from a Gaussian rather than a uniform distribution. Well suited for:
- Deep ReLU networks preferring normal distribution
- Networks trained with batch normalization (works well together)
- Theoretical analysis of initialization scales
- When normal distribution is explicitly required or preferred
Also called He initialization (normal variant).
The method is described in "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification" - He, K. et al. (2015).
- Normal vs Uniform: Kaiming normal and kaiming_uniform_ have the same variance but different distributions
- Batch Normalization: Kaiming normal works especially well with batch norm
- Distribution shape: Normal distribution has heavier tails than uniform
- Leaky ReLU slope: The a parameter must match the negative slope α used during the forward pass
- In-place operation: Modifies tensor in-place; returns the same tensor
- Comparison to uniform: Both equally valid; choose based on downstream assumptions
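The variance equivalence noted above can be checked numerically. A minimal sketch (the variable names are illustrative, not library API): kaiming_uniform_ samples U(-bound, bound) with bound = gain·√(3/fan), and Var(U(-b, b)) = b²/3, which equals the normal variant's std² = gain²/fan.

```typescript
// Sketch: both variants share the variance gain²/fan (names illustrative).
const gain = Math.sqrt(2);               // gain for relu
const fan = 512;                         // e.g. fan_in of a Linear(512, 256)
const std = gain / Math.sqrt(fan);       // kaiming_normal_: N(0, std²)
const bound = gain * Math.sqrt(3 / fan); // kaiming_uniform_: U(-bound, bound)

// Var(U(-b, b)) = b²/3, so the two variances coincide
console.log(std * std);           // ≈ 0.00390625 (gain²/fan)
console.log((bound * bound) / 3); // ≈ 0.00390625 (same value)
```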
Parameters
tensor (Tensor) – An n-dimensional Tensor (typically a weight matrix from a layer)
options (KaimingOptions, optional) – Optional settings for Kaiming initialization
Returns
Tensor – The input tensor, filled in-place with Kaiming normal initialization

Algorithm:
- Values are sampled from the normal distribution N(0, std²)
- std = gain / √fan
- gain = √(2 / (1 + α²)) for leaky_relu with negative slope α
- gain = √2 for relu
- fan = fan_in or fan_out (chosen by the mode parameter)

Examples
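The std described above can also be computed by hand. A minimal sketch, assuming a hypothetical helper kaimingStd (not part of the library API):

```typescript
// Illustrative helper mirroring the std computation (not library API).
function kaimingStd(fan: number, a: number, nonlinearity: "relu" | "leaky_relu"): number {
  // gain = √2 for relu; √(2 / (1 + α²)) for leaky_relu with negative slope α
  const gain = nonlinearity === "relu" ? Math.sqrt(2) : Math.sqrt(2 / (1 + a * a));
  // std = gain / √fan
  return gain / Math.sqrt(fan);
}

// Linear(512, 256) with mode 'fan_in' → fan = 512
console.log(kaimingStd(512, 0, "relu")); // ≈ 0.0625
```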
// Basic He initialization with normal distribution
const layer = torch.nn.Linear(512, 256);
torch.nn.init.kaiming_normal_(layer.weight, { a: 0, mode: 'fan_in', nonlinearity: 'relu' });
torch.nn.init.zeros_(layer.bias);

// With batch normalization (common combination)
const conv = torch.nn.Conv2d(3, 64, 3, { padding: 1 });
const bn = torch.nn.BatchNorm2d(64);
torch.nn.init.kaiming_normal_(conv.weight, { a: 0, mode: 'fan_out', nonlinearity: 'relu' });
torch.nn.init.zeros_(conv.bias);
torch.nn.init.ones_(bn.weight);
torch.nn.init.zeros_(bn.bias);

// Leaky ReLU with custom negative slope
const layer = torch.nn.Linear(1024, 512);
const alpha = 0.2;
torch.nn.init.kaiming_normal_(layer.weight, { a: alpha, mode: 'fan_in', nonlinearity: 'leaky_relu' });
torch.nn.init.zeros_(layer.bias);

See Also
- PyTorch torch.nn.init.kaiming_normal_()
- torch.nn.init.kaiming_uniform_ - Kaiming with uniform distribution
- torch.nn.init.xavier_normal_ - Xavier initialization with normal distribution
- torch.nn.init.calculate_gain - Get gain for specific activation function