torch.nn.Hardtanh
new Hardtanh(options?: HardtanhOptions)
- min_val (number) - readonly
- max_val (number) - readonly
Hardtanh activation function.
Hardtanh applies a hard clipping operation that bounds outputs to a fixed range [min_val, max_val]. It is the hard (piecewise-linear) approximation of tanh and is very cheap to compute. Hardtanh is widely used in mobile and quantization-friendly models (e.g. MobileNets, whose ReLU6 activation is Hardtanh(0, 6)), where bounded activations help with low-precision quantization and reduce memory bandwidth. While uncommon in modern large models, it remains a standard choice for mobile neural networks.
Core idea: Hardtanh(x) clamps x to [min_val, max_val]. Unlike smooth activations (Tanh, GELU), this has sharp corners at the boundaries but is much faster to compute. The default range [-1, 1] approximates tanh's output range while using simple clamping.
When to use Hardtanh:
- Mobile networks: MobileNetV1/V2 use ReLU6, which is exactly Hardtanh(0, 6)
- Quantization: Bounded activations aid low-precision fixed-point arithmetic
- Embedded devices: Minimal computation (just clipping, no exp/div like tanh)
- Resource constraints: Low latency, low power consumption, small memory footprint
- NOT for large models: Use GELU/SiLU for modern transformers and vision models
Relationship to other bounded activations:
- Hardtanh(0, 6): Exactly equivalent to ReLU6 (mobile standard)
- Hardtanh(-1, 1): Bounded version of Tanh, simpler to compute
- Sigmoid/Tanh: Smooth versions (slower, less quantization-friendly)
- ReLU: Unbounded (no upper limit), one-sided clipping only
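The ReLU6 equivalence above can be checked directly. A minimal sketch in plain TypeScript (deliberately library-free, not the `torch.nn` API): both functions clamp to [0, 6] and agree on every input.

```typescript
// Hardtanh as a plain scalar function: clamp x into [minVal, maxVal].
const hardtanh = (x: number, minVal: number, maxVal: number): number =>
  Math.min(Math.max(x, minVal), maxVal);

// ReLU6 as usually defined: min(max(x, 0), 6).
const relu6 = (x: number): number => Math.min(Math.max(x, 0), 6);

// Hardtanh(0, 6) and ReLU6 produce identical outputs.
for (const x of [-2, 0, 3, 6, 9]) {
  console.log(x, hardtanh(x, 0, 6), relu6(x));
}
```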
Algorithm: Forward: Hardtanh(x) = min(max(x, min_val), max_val)
- Clamps input to the range [min_val, max_val]
- Very fast: just two comparisons per element
- Default range [-1, 1] approximates tanh's natural range
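The forward pass is just the element-wise clamp described above. A minimal sketch in plain TypeScript over a number array (the library's Tensor type is not assumed):

```typescript
// Element-wise Hardtanh forward: clamp each value into [minVal, maxVal].
function hardtanhForward(xs: number[], minVal = -1, maxVal = 1): number[] {
  return xs.map((x) => Math.min(Math.max(x, minVal), maxVal));
}

console.log(hardtanhForward([-2, -0.5, 0.5, 2])); // [-1, -0.5, 0.5, 1]
```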
Backward: ∂Hardtanh(x)/∂x = 1 if min_val < x < max_val, else 0
- Zero gradient outside the open interval (min_val, max_val)
- This means saturated values (those outside the range) receive no gradient and contribute no learning signal
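The backward rule can be sketched the same way: the upstream gradient passes through unchanged where the input was strictly inside the range, and is zeroed where the input saturated (plain TypeScript, not the library API):

```typescript
// Element-wise Hardtanh backward: gradient is gradOut[i] inside the open
// interval (minVal, maxVal), and 0 where the input saturated.
function hardtanhBackward(
  xs: number[],
  gradOut: number[],
  minVal = -1,
  maxVal = 1,
): number[] {
  return xs.map((x, i) => (x > minVal && x < maxVal ? gradOut[i] : 0));
}

console.log(hardtanhBackward([-2, 0, 2], [1, 1, 1])); // [0, 1, 0]
```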
- Mobile standard: Hardtanh(0, 6) is ReLU6, the standard in MobileNets.
- Fast computation: Just clipping, no transcendental functions (exp, div).
- Quantization-friendly: Bounded range aids low-precision fixed-point arithmetic.
- Sharp boundaries: Non-smooth corners at min_val and max_val (unlike Tanh).
- Flat gradients outside: Values outside [min_val, max_val] don't get gradients (dead zone).
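Why the bounded range helps quantization can be shown with a hypothetical symmetric int8 scheme (a sketch, not any library's quantizer): because Hardtanh guarantees outputs in [-1, 1], the quantization scale can be fixed up front instead of calibrated from data, and the worst-case rounding error is bounded by one quantization step.

```typescript
// Hypothetical symmetric int8 quantization for a known [-1, 1] range.
const QMIN = -127;
const QMAX = 127;
const scale = 1 / 127; // fixed ahead of time: covers exactly [-1, 1]

const quantize = (x: number): number =>
  Math.max(QMIN, Math.min(QMAX, Math.round(x / scale)));
const dequantize = (q: number): number => q * scale;

// Round-trip error is bounded by one quantization step.
const err = Math.abs(dequantize(quantize(0.5)) - 0.5);
console.log(err < scale); // true
```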
Examples
// Mobile network: Hardtanh(0, 6) is ReLU6
class MobileNetBlock extends torch.nn.Module {
  private conv: torch.nn.Conv2d;
  private hardtanh: torch.nn.Hardtanh; // Or use ReLU6 directly

  constructor(in_ch: number, out_ch: number) {
    super();
    this.conv = new torch.nn.Conv2d(in_ch, out_ch, 3, { padding: 1 });
    this.hardtanh = new torch.nn.Hardtanh(0, 6); // Standard mobile activation
  }

  forward(x: torch.Tensor): torch.Tensor {
    x = this.conv.forward(x);
    return this.hardtanh.forward(x); // Bounded to [0, 6]
  }
}

// Quantization-friendly network: Bounded activations aid low-precision math
const x = torch.randn([batch_size, channels, height, width]);
// With Hardtanh: values confined to [-1, 1], easier to quantize to int8
const hardtanh = new torch.nn.Hardtanh(-1, 1);
const bounded = hardtanh.forward(x); // All values in [-1, 1]
// With ReLU: values in [0, +∞), harder to represent with fixed-point int8
const relu = new torch.nn.ReLU();
const unbounded = relu.forward(x); // Can be arbitrarily large

// Comparison: Hardtanh vs Tanh (similar but different)
const x = torch.linspace(-3, 3, 100);
const hardtanh = new torch.nn.Hardtanh(-1, 1); // Sharp corners at ±1
const tanh = new torch.nn.Tanh(); // Smooth curve
const hard_output = hardtanh.forward(x); // Step-like: -1 for x<-1, x for -1<x<1, 1 for x>1
const tanh_output = tanh.forward(x); // Smooth: approaches ±1 asymptotically
// Hardtanh is significantly faster: simple clamping vs. exponential evaluation