torch.Tensor.leaky_relu
Tensor.leaky_relu(): Tensor<S, D, Dev>
Tensor.leaky_relu(negative_slope: number, inplace: boolean, options: LeakyReluFunctionalOptions): Tensor<S, D, Dev>
Leaky ReLU activation function.
Fixes the "dying ReLU" problem by producing small negative outputs instead of clamping to zero. For positive inputs it returns x; for negative inputs it returns alpha * x, where alpha (the negative_slope) is a small positive constant, typically 0.01. This lets gradients flow even for negative inputs.
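The piecewise rule can be sketched as a standalone helper over a plain number array (illustrative only, not part of the library API):

```typescript
// Piecewise rule: x for x >= 0, negativeSlope * x otherwise
function leakyRelu(xs: number[], negativeSlope = 0.01): number[] {
  return xs.map((x) => (x >= 0 ? x : negativeSlope * x));
}

leakyRelu([-2, -1, 0, 1, 2]); // [-0.02, -0.01, 0, 1, 2]
```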
Motivation:
- Standard ReLU returns exactly 0 for x < 0 (zero gradient, dead neurons)
- Leaky ReLU returns alpha * x for x < 0 (non-zero gradient)
- Prevents dead neuron problem while maintaining ReLU's simplicity
- Simple parameter alpha controls the "leak" amount
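The gradient argument above follows directly from the derivative; a minimal sketch (a hypothetical helper, not library code):

```typescript
// Derivative of leaky ReLU: 1 for positive inputs, the slope for negative ones.
// (At x = 0 the function is not differentiable; this sketch uses the slope branch.)
function leakyReluGrad(x: number, negativeSlope = 0.01): number {
  return x > 0 ? 1 : negativeSlope;
}

leakyReluGrad(2);  // 1
leakyReluGrad(-2); // 0.01 (non-zero, so the neuron can keep learning)
```

Because the negative-side gradient never reaches zero, a unit that drifts into the negative regime can still receive weight updates and recover.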
When to use vs ReLU:
- Use Leaky ReLU if you observe dead neurons (many zero activations)
- Use standard ReLU for cleaner/sparser representations
- Leaky ReLU is safer for training stability
Key properties:
- Default slope: 0.01 (a 1% leak) is the most common choice in practice.
- Gradient flow: Non-zero gradient everywhere, prevents dead neurons.
- Sparsity: Less sparse than ReLU but still fairly sparse.
- Computation: Slightly more expensive than ReLU (extra multiplication).
- Parameter: negative_slope is a hyperparameter, typically 0.001 to 0.3.
- Symmetry: Not symmetric (different behavior for positive/negative).
- Large slopes (> 0.5) may amplify negative values too much.
- Slope = 1.0 makes it linear (the identity function).
- Very small slopes (< 0.001) are almost equivalent to standard ReLU.
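The two limiting cases in the list can be checked with a scalar sketch (a standalone helper, not the library API):

```typescript
function leakyReluScalar(x: number, slope: number): number {
  return x >= 0 ? x : slope * x;
}

leakyReluScalar(-3, 1.0); // -3: slope 1.0 is the identity function
leakyReluScalar(-3, 0);   // slope 0 zeroes negatives like standard ReLU (returns -0, which === 0)
```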
Returns
Tensor<S, D, Dev> – Tensor with the same shape and dtype as the input
Examples
// Default slope (0.01)
const x = torch.tensor([-2, -1, 0, 1, 2]);
x.leaky_relu(); // [-0.02, -0.01, 0, 1, 2]
// Custom slope - larger leak
const steep = x.leaky_relu(0.2); // [-0.4, -0.2, 0, 1, 2] (20% leak)
// Addressing the dying ReLU problem
const hidden = dense_layer(input);
// A very low mean activation after ReLU can indicate many dead (zeroed) neurons
const mean_activation = hidden.relu().mean();
if (mean_activation < 0.5) {
// Try leaky relu so gradients still flow for negative inputs
const better = hidden.leaky_relu(0.1);
}
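One concrete way to quantify "dead" units, sketched on a plain array (the helper name and threshold are illustrative, not library API):

```typescript
// Fraction of pre-activations that ReLU would zero out (x <= 0)
function deadFraction(preActivations: number[]): number {
  const dead = preActivations.filter((x) => x <= 0).length;
  return dead / preActivations.length;
}

deadFraction([-1, 0.5, -2, 3]); // 0.5: half the units would be dead under ReLU
```

If this fraction stays high across batches, switching the layer to leaky_relu keeps those units trainable.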
// Hyperparameter tuning
const slopes = [0.001, 0.01, 0.1, 0.3];
for (const slope of slopes) {
const model = new Model(input_size, { activation: (x) => x.leaky_relu(slope) });
// Train and evaluate
}
// In neural network layer
class DenseLayer {
forward(x: Tensor) {
const z = this.weight.mm(x).add(this.bias);
return z.leaky_relu(0.01); // Standard slope
}
}
See Also
- PyTorch torch.nn.functional.leaky_relu()
- relu - Standard version without leak
- elu - Smooth exponential alternative
- prelu - Parametric ReLU with learnable slope
- gelu - Smoother modern alternative