torch.nn.AvgPool2d
new AvgPool2d(kernel_size: number | [number, number], options?: AvgPool2dOptions)
kernel_size(number | [number, number]) - readonly
stride(number | [number, number]) - readonly
padding(number | [number, number]) - readonly
count_include_pad(boolean)
2D average pooling: reduces spatial dimensions by taking mean over sliding window.
Applies average pooling over 2D spatial data (images): slides a kernel over height and width, returning the arithmetic mean of the values within each window. Reduces spatial dimensionality with smooth downsampling. Essential for:
- Smooth image downsampling (reduces aliasing compared to max pooling)
- Feature map compression (spatial reduction with signal preservation)
- Noise reduction in feature maps (averaging smooths noise)
- Computational efficiency (reducing spatial dimensions efficiently)
- Global average pooling (special case: kernel covers the whole feature map, used in classification heads)
Average pooling smooths spatial information by including all values in the window. Unlike max pooling which keeps only peaks, average pooling preserves overall spatial structure. More conservative downsampling that retains more information.
When to use AvgPool2d:
- Images where smooth downsampling is preferred (less aliasing than MaxPool)
- Feature aggregation (global pooling before classification)
- When all spatial values matter (not just peaks)
- Noise reduction in feature maps
- Global average pooling heads (kernel_size equal to the feature map's spatial size)
Trade-offs:
- vs MaxPool2d: AvgPool2d smooths; MaxPool2d preserves peaks
- vs adaptive pooling: AvgPool2d fixed stride/kernel; adaptive auto-adjusts output size
- Smoothness: Averaging reduces sharp features
- Information loss: Less than MaxPool (all values included)
- Gradient flow: All spatial elements contribute to gradients
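The avg-vs-max trade-off above can be made concrete with a plain-array sketch (no torch dependency; `pool2x2` is an illustrative helper, not part of the library): average pooling blends every value in the window, while max pooling discards all but the peak.

```typescript
// Plain-array sketch: 2x2 non-overlapping pooling on a 4x4 matrix.
function pool2x2(
  input: number[][],
  reduce: (window: number[]) => number
): number[][] {
  const out: number[][] = [];
  for (let i = 0; i < input.length; i += 2) {
    const row: number[] = [];
    for (let j = 0; j < input[0].length; j += 2) {
      // Gather the 2x2 window whose top-left corner is (i, j)
      row.push(reduce([
        input[i][j], input[i][j + 1],
        input[i + 1][j], input[i + 1][j + 1],
      ]));
    }
    out.push(row);
  }
  return out;
}

const x = [
  [1, 2, 5, 6],
  [3, 4, 7, 8],
  [0, 0, 9, 1],
  [0, 0, 1, 1],
];
const avg = pool2x2(x, (w) => w.reduce((a, b) => a + b, 0) / w.length);
const max = pool2x2(x, (w) => Math.max(...w));
// avg blends all values:  [[2.5, 6.5], [0, 3]]
// max keeps only peaks:   [[4, 8], [0, 9]]
```

Note how the bottom-right window ([9, 1, 1, 1]) averages to 3 but maxes to 9: averaging dilutes an isolated spike, which is exactly the smoothing/noise-reduction behavior described above.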
Pooling mechanics: For a 2D image [B, C, H, W] (batch, channels, height, width):
- For each channel independently:
- Slide kernel_size × kernel_size window over spatial dimensions
- Step by stride in both H and W directions (default: kernel_size for non-overlapping)
- Compute mean value in each window
- Output: [B, C, H_out, W_out] where:
- H_out = floor((H + 2*padding_h - kernel_h) / stride_h) + 1
- W_out = floor((W + 2*padding_w - kernel_w) / stride_w) + 1
- Default stride: stride=kernel_size gives non-overlapping pooling
- Stride < kernel_size: creates overlapping windows (smoother spatial filtering)
- Global pooling: kernel=H×W produces [B, C, 1, 1] (commonly used before classification)
- Gradient: All spatial elements get gradient (distributed equally)
- count_include_pad: true = padded zeros included in the average; false = average over real values only
- Information preservation: More data retained than MaxPool (smoother signal)
- Smoothing effect: Reduces spatial noise but also blurs sharp features
- Channel independence: Each channel pooled independently
- Blur: Spatial averaging can blur sharp feature boundaries
- Padding effects: With count_include_pad=true, edge values affected by padding
- Output size: Calculate using formula to predict output dimensions
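The output-size formula above can be sketched as a small standalone helper (plain TypeScript, not a library function; the name `avgPool2dOutputSize` is illustrative):

```typescript
// Implements: out = floor((in + 2*padding - kernel) / stride) + 1, per dimension.
function avgPool2dOutputSize(
  [h, w]: [number, number],
  kernel: [number, number],
  stride: [number, number] = kernel, // default stride = kernel_size (non-overlapping)
  padding: [number, number] = [0, 0]
): [number, number] {
  const hOut = Math.floor((h + 2 * padding[0] - kernel[0]) / stride[0]) + 1;
  const wOut = Math.floor((w + 2 * padding[1] - kernel[1]) / stride[1]) + 1;
  return [hOut, wOut];
}

avgPool2dOutputSize([224, 224], [2, 2]);               // [112, 112] - non-overlapping 2x2
avgPool2dOutputSize([56, 56], [3, 3], [1, 1], [1, 1]); // [56, 56]   - "same" spatial size
avgPool2dOutputSize([7, 7], [7, 7]);                   // [1, 1]     - global average pooling
```

The three calls mirror the shapes used in the Examples section below: halving with kernel 2, size-preserving 3x3/stride-1/padding-1, and global pooling with kernel equal to the feature map.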
Examples
// Standard 2x2 spatial averaging
const pool = new torch.nn.AvgPool2d(2); // kernel=2x2, stride=2 (non-overlapping)
const x = torch.randn([32, 64, 224, 224]); // [batch, channels, height, width]
const y = pool.forward(x); // [32, 64, 112, 112] - spatial dims halved, smoothed
// Overlapping spatial averaging
const pool = new torch.nn.AvgPool2d([3, 3], { stride: 1, padding: 1 }); // kernel=3x3, stride=1, padding=1
const x = torch.randn([32, 128, 56, 56]);
const y = pool.forward(x); // [32, 128, 56, 56] - smooth spatial filtering
// Global average pooling (common classification head)
const pool = new torch.nn.AvgPool2d([7, 7]); // Match feature map size
const x = torch.randn([32, 512, 7, 7]); // Feature maps from conv layers
const y = pool.forward(x); // [32, 512, 1, 1] - global average per channel
// MobileNet-style global average pooling
const avgpool = new torch.nn.AvgPool2d(7); // Square kernel matching the 7x7 feature map
const conv_out = torch.randn([16, 1280, 7, 7]); // Output of feature extraction
const pooled = avgpool.forward(conv_out); // [16, 1280, 1, 1]
const flattened = pooled.view(pooled.shape[0], -1); // [16, 1280]
// count_include_pad=false (exclude padding from average)
const pool = new torch.nn.AvgPool2d(2, { stride: 2, padding: 1, count_include_pad: false }); // Exclude padded zeros from avg
const x = torch.randn([16, 64, 32, 32]);
const y = pool.forward(x); // Averages only over real values
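To see numerically what count_include_pad changes, consider a corner window under padding=1 with kernel 2x2: it covers one real value and three padded zeros. A plain-number sketch (illustrative, not library code):

```typescript
// Top-left corner window with padding=1, kernel=2: one real value, three padded zeros.
const windowValues = [0, 0, 0, 4]; // three padded zeros + the real corner value 4.0
const realValues = [4];            // only the in-bounds value

// count_include_pad=true: divide by the full window size (4 elements)
const includePad = windowValues.reduce((a, b) => a + b, 0) / windowValues.length; // 1.0
// count_include_pad=false: divide only by the number of real values
const excludePad = realValues.reduce((a, b) => a + b, 0) / realValues.length;     // 4.0
```

With count_include_pad=true the padded zeros drag edge averages toward zero (darkened borders); with false, edge magnitudes are preserved at the cost of a varying divisor near the boundary.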