torch.Tensor.hardswish
Hard Swish activation function.
An efficient approximation of SiLU (Swish) that replaces the smooth sigmoid with a hard (piecewise-linear) sigmoid, trading the expensive exp(x) for simple clipping. Originally introduced in MobileNetV3 to speed up inference on mobile devices.
Definition: HardSwish(x) = x * HardSigmoid(x), where HardSigmoid(x) = clamp(x/6 + 1/2, 0, 1); equivalently, HardSwish(x) = x * ReLU6(x + 3) / 6. This keeps SiLU's self-gating behavior while costing only a piecewise-linear clip.
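The definition above can be sketched in plain TypeScript — a hand-rolled scalar version for illustration, not part of the library's tensor API:

```typescript
// hardswish(x) = x * hardsigmoid(x), with hardsigmoid(x) = clamp(x/6 + 1/2, 0, 1)
function hardswish(x: number): number {
  const gate = Math.min(Math.max(x / 6 + 0.5, 0), 1); // piecewise-linear "hard sigmoid"
  return x * gate; // self-gating like SiLU, but with no exp()
}

console.log([-4, -2, 0, 2, 4].map(hardswish));
// approximately [0, -0.333, 0, 1.667, 4]
```

Note how inputs at or below -3 are clamped to exactly 0, while inputs at or above 3 pass through unchanged.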
Use Cases:
- Mobile inference (MobileNetV3, EfficientNet)
- Edge devices requiring sub-millisecond latency
- Real-time inference in resource-constrained environments
- On-device ML (phones, IoT, embedded systems)
- Gating mechanisms when speed is critical
Characteristics:
- Roughly an order of magnitude faster than SiLU (no exponential computation).
- Self-gating: Like Swish, gates input by activation value.
- Output range: [-0.375, +∞) - unbounded above like SiLU; the minimum of -3/8 occurs at x = -1.5.
- Negative values: Can output negative values, unlike hardsigmoid (whose range is [0, 1]).
- MobileNet: Official activation for MobileNetV3 and derivatives.
- Piecewise smooth: Quadratic (x(x + 3)/6) inside (-3, 3), with kinks at x = ±3 where the function is not differentiable.
- Gradient is zero for x ≤ -3 (dead neurons possible).
- Large positive values not saturated (unlike sigmoid).
- Not recommended for server-side deep models (use SiLU instead).
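The zero-gradient caveat above is easiest to see from the derivative, written out by hand (illustrative plain TypeScript, not a library call):

```typescript
// Derivative of hardswish(x) = x * clamp(x/6 + 1/2, 0, 1), derived by hand:
//   x <= -3:      0            (gate is 0 -> "dead" region, no gradient flows)
//   -3 < x < 3:   (2x + 3)/6   (from d/dx of x * (x + 3) / 6)
//   x >= 3:       1            (identity region, never saturates)
function hardswishGrad(x: number): number {
  if (x <= -3) return 0;
  if (x >= 3) return 1;
  return (2 * x + 3) / 6; // zero at x = -1.5, where hardswish hits its minimum of -3/8
}
```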
Returns
Tensor<S, D, Dev> – Tensor with the same shape as the input
Examples
// Basic usage - efficient SiLU approximation
const x = torch.tensor([-4, -2, 0, 2, 4]);
x.hardswish(); // [0, ~-0.333, 0, ~1.667, 4]
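Since hardswish only approximates SiLU, it can help to see how far the two actually drift apart. A quick hand-rolled scan in plain TypeScript (scalar math, not the tensor API above):

```typescript
const silu = (x: number) => x / (1 + Math.exp(-x));                      // smooth original
const hswish = (x: number) => x * Math.min(Math.max(x / 6 + 0.5, 0), 1); // hard version

// Scan [-6, 6] for the worst-case absolute gap between the two activations
let maxErr = 0;
for (let x = -6; x <= 6; x += 0.01) {
  maxErr = Math.max(maxErr, Math.abs(silu(x) - hswish(x)));
}
console.log(maxErr.toFixed(3)); // ~0.142, concentrated near the clip points x = ±3
```

The two agree exactly at 0 and converge again well beyond |x| = 3; the residual error around the clipping points is usually negligible next to the speedup on hardware without fast exp.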
// MobileNetV3 backbone - ultra-fast inference
const mobilenet_v3 = MobileNetV3(); // Uses hardswish internally
const image = torch.randn(1, 224, 224, 3);
const features = mobilenet_v3.forward(image);
// Edge-device, latency-critical inference (must respond in < 50ms)
const edge_model = torch.load('model.pt'); // hardswish-based checkpoint for on-device use
// Gating with speed requirements
const input = torch.randn(16, 512);
const gate = input.hardswish(); // Fast gating without exp
// Speed comparison
const silu_time = benchmark(() => input.silu()); // ~100µs
const hardswish_time = benchmark(() => input.hardswish()); // ~10µs (~10x faster)
See Also
- PyTorch torch.nn.functional.hardswish()
- silu - Smooth version (Swish): x * sigmoid(x)
- hardsigmoid - Hard sigmoid (used internally)
- relu - Simpler alternative
- gelu - Better quality (slower)