torch.Tensor.sigmoid
Sigmoid activation function.
Computes σ(x) = 1 / (1 + e^(-x)). Output is bounded to (0, 1), making it ideal for probability outputs. One of the oldest and most fundamental activation functions.
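The formula can be sketched in plain TypeScript, independent of any tensor library, to see the bounded (0, 1) behavior directly:

```typescript
// Scalar sigmoid: σ(x) = 1 / (1 + e^(-x))
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

sigmoid(0);   // 0.5: the midpoint of the curve
sigmoid(10);  // ≈ 0.99995, approaching but never reaching 1
sigmoid(-10); // ≈ 0.00005, approaching but never reaching 0
```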
Use Cases:
- Binary classification output layer
- Probability modeling
- Gating mechanisms in neural networks (LSTM gates)
- Output normalization to (0, 1) range
Properties:
- Range: Output is always in (0, 1)
- Gradient: σ'(x) = σ(x)(1 − σ(x)); highest at x = 0 (where it equals 0.25), decreasing toward 0 for extreme values
- Saturation: For large |x|, the gradient → 0 (can cause vanishing gradients)
- Smooth: Continuously differentiable everywhere
- Vanishing gradient: For |x| > 5, the gradient becomes very small
- Not zero-centered: Output is centered around 0.5, not 0 (can slow convergence)
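The gradient and saturation properties above can be checked with a small plain-TypeScript sketch (no tensor library assumed):

```typescript
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// Derivative of sigmoid: σ'(x) = σ(x) * (1 - σ(x))
function sigmoidGrad(x: number): number {
  const s = sigmoid(x);
  return s * (1 - s);
}

sigmoidGrad(0); // 0.25: the maximum of the gradient
sigmoidGrad(6); // ≈ 0.0025: already nearly saturated, gradients vanish
```

This is why deep networks trained with sigmoid activations in hidden layers can suffer from vanishing gradients: each saturated layer scales the backpropagated gradient by a factor well below 0.25.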
Returns
Tensor<S, D, Dev> – New tensor with sigmoid applied element-wise
Examples
// Binary classification output
const logits = torch.randn(100);
const probabilities = logits.sigmoid(); // (0, 1) range
// LSTM gate (sigmoid gate)
const hidden = torch.randn(32, 128);
const weight = torch.randn(128, 128);
const forget_gate = hidden.matmul(weight).sigmoid(); // (0, 1) gate
// Probability thresholding
const scores = torch.randn(1000);
const threshold = 0.5;
const predictions = scores.sigmoid().gt(threshold).long();
See Also
- PyTorch tensor.sigmoid()
- tanh - Similar but zero-centered, range (-1, 1)
- relu - Modern alternative: faster and no saturation
- softmax - Multi-class version (sum to 1)