torch.nn.functional.max_pool2d
function max_pool2d(input: Tensor, kernel_size: number | [number, number], options?: MaxPool2dFunctionalOptions): Tensor
function max_pool2d(input: Tensor, kernel_size: number | [number, number], stride: number | [number, number], padding: number | [number, number], dilation: number | [number, number], ceil_mode: boolean, options?: MaxPool2dFunctionalOptions): Tensor

2D Max Pooling: downsamples feature maps by taking maximum values.
Applies max pooling over 2D spatial dimensions (height, width) using sliding windows. Takes the maximum value in each window, useful for:
- Feature extraction: preserves strongest activations (edges, textures)
- Dimensionality reduction: reduces spatial dimensions, fewer parameters
- Translation invariance: small translations don't affect max value
- CNNs and computer vision: standard layer in all image recognition networks
- Hierarchical feature learning: coarse features from fine details
When to use Max Pooling:
- CNNs for image classification, detection, segmentation
- Whenever you need spatial downsampling in vision models
- To extract most salient features in local regions
- For reducing computation in deeper layers
- Default choice over average pooling in modern networks
Trade-offs vs Average Pooling:
- Feature selection: Max selects strongest signal (good for detection)
- Robustness: Max more robust to noise (ignores weak activations)
- Information loss: Max discards more information
- Empirical: Max pooling generally better for classification
- Computation: Similar cost; max uses comparisons where average uses additions
Notes
- Standard in CNNs: Default pooling in AlexNet, VGG, ResNet, and most modern architectures (ImageNet is the benchmark dataset, not an architecture)
- Preserves edges: Keeps strongest responses (good for edge/texture detection)
- Reduces computation: 2x2 pooling with stride 2 halves each spatial dimension, so each feature map is ~4x smaller
- Differentiability: Max is not smooth at ties, but in practice the gradient is simply routed to the max element and training works well
- Output size: Depends only on kernel_size, stride, padding, dilation, and ceil_mode (not on input values)
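The output-size dependence noted above follows the standard pooling formula, out = floor((in + 2·padding − dilation·(kernel − 1) − 1) / stride) + 1, with ceil in place of floor when ceil_mode is set. A small sketch (the helper `poolOutputSize` is hypothetical, not a library function):

```typescript
// Compute one spatial output dimension of max_pool2d.
// Assumptions: padding defaults to 0, dilation to 1, ceil_mode to false.
function poolOutputSize(
  inSize: number,
  kernel: number,
  stride: number,
  padding = 0,
  dilation = 1,
  ceilMode = false
): number {
  const numer = inSize + 2 * padding - dilation * (kernel - 1) - 1;
  const raw = numer / stride + 1;
  return ceilMode ? Math.ceil(raw) : Math.floor(raw);
}

// 32x32 input, 2x2 kernel, stride 2 — the "spatial dims halved" case:
poolOutputSize(32, 2, 2); // → 16
// 224 input, 7x7 kernel, stride 1, padding 3 — the same-size case:
poolOutputSize(224, 7, 1, 3); // → 224
```

Both results match the shapes shown in the Examples section below.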
Parameters
- input: Tensor - 4D input tensor [batch, channels, height, width]
- kernel_size: number | [number, number] - Size of the pooling window (scalar or [height, width])
- stride, padding, dilation: number | [number, number] - Positional parameters of the second overload (scalar or [height, width])
- ceil_mode: boolean - Second overload only; use ceil instead of floor when computing the output size
- options: MaxPool2dFunctionalOptions - optional
Returns
Tensor – Pooled output tensor [batch, channels, out_height, out_width]
Examples
// Standard CNN max pooling
const x = torch.randn([batch_size, 64, 32, 32]); // After conv layer
const pooled = torch.nn.functional.max_pool2d(x, 2); // 2x2 pooling
// Output: [batch_size, 64, 16, 16] - spatial dims halved

// Same-size max pooling with a 7x7 kernel
const x = torch.randn([1, 3, 224, 224]);
const pooled = torch.nn.functional.max_pool2d(x, 7, 1, 3); // 7x7 kernel, stride=1, pad=3
// Output: [1, 3, 224, 224] - same spatial size

// VGG-style architecture
class VGGBlock extends torch.nn.Module {
  private conv1: torch.nn.Conv2d;
  private conv2: torch.nn.Conv2d;

  forward(x: torch.Tensor): torch.Tensor {
    x = torch.nn.functional.relu(this.conv1.forward(x));
    x = torch.nn.functional.relu(this.conv2.forward(x));
    return torch.nn.functional.max_pool2d(x, 2); // 2x2 pooling
  }
}

See Also
- PyTorch torch.nn.functional.max_pool2d
- avg_pool2d - Average pooling alternative (smoother but weaker features)
- adaptive_max_pool2d - Adaptive pooling to fixed output size
- max_pool1d - 1D variant for sequences
- max_pool3d - 3D variant for volumetric data