torch.Tensor.tanhshrink
Tanh shrink activation function.
Element-wise operation: TanhShrink(x) = x - tanh(x). Subtracts the tanh activation from the original value. Produces small outputs near zero (shrinkage effect) and identity-like behavior for large inputs.
Interesting Properties:
- Near zero: Strong shrinkage (f(0) = 0 - 0 = 0, f'(0) = 0)
- Large positive: near-identity slope, offset by ≈ 1 (e.g., f(10) ≈ 10 - 1 = 9)
- Acts as a regularizer: penalizes non-zero values
- Less common than tanh, but useful for specific architectures
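The formula above can be sketched directly in plain TypeScript with Math.tanh, without the tensor API; tanhshrinkArray is a hypothetical helper for illustration only:

```typescript
// Scalar tanhshrink: f(x) = x - tanh(x)
function tanhshrink(x: number): number {
  return x - Math.tanh(x);
}

// Element-wise over a plain array (hypothetical helper, not part of the library)
function tanhshrinkArray(xs: number[]): number[] {
  return xs.map(tanhshrink);
}

// f(0) = 0 exactly; f(10) ≈ 9 because tanh(10) ≈ 1
```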
Use Cases:
- Regularization that penalizes magnitude while preserving direction
- Sparse activations in autoencoders
- Architectural variations in recurrent networks
- When you want identity for large values but shrinkage for small
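The "shrinkage for small, identity for large" trade-off can be seen numerically. This sketch (plain numbers, not the tensor API) measures how much of each input survives:

```typescript
// f(x) = x - tanh(x)
function tanhshrink(x: number): number {
  return x - Math.tanh(x);
}

// Fraction of the input magnitude that survives: |f(x) / x|
function shrinkRatio(x: number): number {
  return Math.abs(tanhshrink(x) / x);
}

// Small inputs are almost entirely suppressed, large inputs mostly pass through:
// shrinkRatio(0.1) ≈ 0.003, shrinkRatio(4) ≈ 0.75
```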
- Zero shrinking: At x=0, output is exactly 0 with zero gradient.
- Magnitude shrinkage: |output| < |input| for all nonzero inputs; the relative shrinkage vanishes for large |input|.
- Gradient: f'(x) = tanh²(x), which is zero at x=0 and approaches 1 (identity gradient) for large |x|.
- Sparse effect: Encourages sparsity in activations.
- Unusual activation: Less common than tanh, mostly for special cases.
- Gradient is zero at x=0 (potential dead neuron issue).
- Output is unbounded: as |x| → ∞, f(x) → x − sign(x).
- Not recommended for standard hidden layers (use tanh/relu instead).
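The zero-gradient caveat above can be checked numerically. This sketch (plain TypeScript, not the tensor API) compares the analytic derivative tanh²(x) against a central finite difference:

```typescript
// f(x) = x - tanh(x)
function tanhshrink(x: number): number {
  return x - Math.tanh(x);
}

// Analytic derivative: d/dx [x - tanh(x)] = 1 - sech²(x) = tanh²(x)
function tanhshrinkGrad(x: number): number {
  const t = Math.tanh(x);
  return t * t;
}

// Central finite difference as a cross-check
function numGrad(f: (x: number) => number, x: number, h = 1e-5): number {
  return (f(x + h) - f(x - h)) / (2 * h);
}

// At x=0 the gradient is exactly zero (the potential dead-neuron issue);
// away from zero it quickly approaches 1.
```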
Returns
Tensor<S, D, Dev> – Tensor with the same shape as the input.
Examples
// Basic usage - shrinkage for small values
const x = torch.tensor([-2, -1, 0, 1, 2]);
x.tanhshrink(); // Approximately [-1.036, -0.238, 0, 0.238, 1.036]
// Sparse activation in autoencoder
const encoder_output = torch.randn(32, 256);
const sparse = encoder_output.tanhshrink(); // Shrinks to encourage sparsity
// Compare with identity and tanh
const identity = x; // [x values unchanged]
const tanh_act = x.tanh(); // [Squashed to (-1, 1)]
const shrink = x.tanhshrink(); // [Shrunk for small, identity for large]
// Magnitude penalty - regularization effect
const weights = torch.randn(128, 128);
const regularized = weights.tanhshrink(); // Smaller magnitude overall
// Visualization of the effect
const range = torch.linspace(-4, 4, 100);
const output = range.tanhshrink();
// At x=0: output=0 (strong shrinkage)
// At x=±1: output≈±0.238 (moderate shrinkage)
// At x=±4: output≈±3.00 (identity slope, offset by ≈1)
See Also
- PyTorch torch.nn.functional.tanhshrink()
- tanh - Plain tanh without the shrinkage
- sinh - Hyperbolic sine
- softsign - Bounded activation similar to tanh
- threshold - Hard thresholding alternative