torch.nn.AvgPool2d
new AvgPool2d(kernel_size: number | [number, number], options?: AvgPool2dOptions)
kernel_size(number | [number, number]) - readonly
stride(number | [number, number]) - readonly
padding(number | [number, number]) - readonly
count_include_pad(boolean)
2D average pooling: reduces spatial dimensions by taking mean over sliding window.
Applies average pooling over 2D spatial data (images): slides a kernel over height and width, returning the arithmetic mean of the values within each window. Reduces spatial dimensionality with smooth downsampling. Essential for:
- Smooth image downsampling (reduces aliasing compared to max pooling)
- Feature map compression (spatial reduction with signal preservation)
- Noise reduction in feature maps (averaging smooths noise)
- Computational efficiency (reducing spatial dimensions efficiently)
- Global average pooling (special case: kernel covers the whole feature map, used in classification heads)
Average pooling smooths spatial information by including all values in the window. Unlike max pooling which keeps only peaks, average pooling preserves overall spatial structure. More conservative downsampling that retains more information.
When to use AvgPool2d:
- Images where smooth downsampling is preferred (less aliasing than MaxPool)
- Feature aggregation (global pooling before classification)
- When all spatial values matter (not just peaks)
- Noise reduction in feature maps
- Global average pooling heads (kernel_size equal to the feature map's spatial size)
Trade-offs:
- vs MaxPool2d: AvgPool2d smooths; MaxPool2d preserves peaks
- vs adaptive pooling: AvgPool2d fixed stride/kernel; adaptive auto-adjusts output size
- Smoothness: Averaging reduces sharp features
- Information loss: Less than MaxPool (all values included)
- Gradient flow: All spatial elements contribute to gradients
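The avg-vs-max trade-off above can be made concrete with a plain-array sketch (no torch dependency; `pool2x2` is an illustrative helper, not part of the library): average pooling blends every value in the window, while max pooling discards all but the peak.

```typescript
// Plain-array sketch: 2x2 non-overlapping pooling on a 4x4 matrix.
function pool2x2(
  input: number[][],
  reduce: (window: number[]) => number
): number[][] {
  const out: number[][] = [];
  for (let i = 0; i < input.length; i += 2) {
    const row: number[] = [];
    for (let j = 0; j < input[0].length; j += 2) {
      // Gather the 2x2 window whose top-left corner is (i, j)
      row.push(reduce([
        input[i][j], input[i][j + 1],
        input[i + 1][j], input[i + 1][j + 1],
      ]));
    }
    out.push(row);
  }
  return out;
}

const x = [
  [1, 2, 5, 6],
  [3, 4, 7, 8],
  [0, 0, 9, 1],
  [0, 0, 1, 1],
];
const avg = pool2x2(x, (w) => w.reduce((a, b) => a + b, 0) / w.length);
const max = pool2x2(x, (w) => Math.max(...w));
// avg blends all values:  [[2.5, 6.5], [0, 3]]
// max keeps only peaks:   [[4, 8], [0, 9]]
```

Note how the bottom-right window ([9, 1, 1, 1]) averages to 3 but maxes to 9: averaging dilutes an isolated spike, which is exactly the smoothing/noise-reduction behavior described above.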
Pooling mechanics: For a 2D image [B, C, H, W] (batch, channels, height, width):
- For each channel independently:
- Slide kernel_size × kernel_size window over spatial dimensions
- Step by stride in both H and W directions (default: kernel_size for non-overlapping)
- Compute mean value in each window
- Output: [B, C, H_out, W_out] where:
- H_out = floor((H + 2*padding_h - kernel_h) / stride_h) + 1
- W_out = floor((W + 2*padding_w - kernel_w) / stride_w) + 1
- Default stride: stride=kernel_size gives non-overlapping pooling
- Stride < kernel_size: creates overlapping windows (smoother spatial filtering)
- Global pooling: kernel=H×W produces [B, C, 1, 1] (commonly used before classification)
- Gradient: All spatial elements get gradient (distributed equally)
- count_include_pad: true = padded zeros included in the average; false = average over real values only
- Information preservation: More data retained than MaxPool (smoother signal)
- Smoothing effect: Reduces spatial noise but also blurs sharp features
- Channel independence: Each channel pooled independently
- Blur: Spatial averaging can blur sharp feature boundaries
- Padding effects: With count_include_pad=true, edge values affected by padding
- Output size: Calculate using formula to predict output dimensions
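The output-size formula above can be sketched as a small standalone helper (plain TypeScript, not a library function; the name `avgPool2dOutputSize` is illustrative):

```typescript
// Implements: out = floor((in + 2*padding - kernel) / stride) + 1, per dimension.
function avgPool2dOutputSize(
  [h, w]: [number, number],
  kernel: [number, number],
  stride: [number, number] = kernel, // default stride = kernel_size (non-overlapping)
  padding: [number, number] = [0, 0]
): [number, number] {
  const hOut = Math.floor((h + 2 * padding[0] - kernel[0]) / stride[0]) + 1;
  const wOut = Math.floor((w + 2 * padding[1] - kernel[1]) / stride[1]) + 1;
  return [hOut, wOut];
}

avgPool2dOutputSize([224, 224], [2, 2]);               // [112, 112] - non-overlapping 2x2
avgPool2dOutputSize([56, 56], [3, 3], [1, 1], [1, 1]); // [56, 56]   - "same" spatial size
avgPool2dOutputSize([7, 7], [7, 7]);                   // [1, 1]     - global average pooling
```

The three calls mirror the shapes used in the Examples section below: halving with kernel 2, size-preserving 3x3/stride-1/padding-1, and global pooling with kernel equal to the feature map.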
Examples
// Standard 2x2 spatial averaging
const pool = new torch.nn.AvgPool2d(2); // kernel=2x2, stride=2 (non-overlapping)
const x = torch.randn([32, 64, 224, 224]); // [batch, channels, height, width]
const y = pool.forward(x); // [32, 64, 112, 112] - spatial dims halved, smoothed
// Overlapping spatial averaging
const pool = new torch.nn.AvgPool2d([3, 3], { stride: 1, padding: 1 }); // kernel=3x3, stride=1, padding=1
const x = torch.randn([32, 128, 56, 56]);
const y = pool.forward(x); // [32, 128, 56, 56] - smooth spatial filtering
// Global average pooling (common classification head)
const pool = new torch.nn.AvgPool2d([7, 7]); // Match feature map size
const x = torch.randn([32, 512, 7, 7]); // Feature maps from conv layers
const y = pool.forward(x); // [32, 512, 1, 1] - global average per channel
// MobileNet-style global average pooling
const avgpool = new torch.nn.AvgPool2d(7); // Square kernel matching the 7x7 feature map
const conv_out = torch.randn([16, 1280, 7, 7]); // Output of feature extraction
const pooled = avgpool.forward(conv_out); // [16, 1280, 1, 1]
const flattened = pooled.view(pooled.shape[0], -1); // [16, 1280]
// count_include_pad=false (exclude padding from average)
const pool = new torch.nn.AvgPool2d(2, { stride: 2, padding: 1, count_include_pad: false }); // Exclude padded zeros from avg
const x = torch.randn([16, 64, 32, 32]);
const y = pool.forward(x); // Averages only over real values
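To see numerically what count_include_pad changes, consider a corner window under padding=1 with kernel 2x2: it covers one real value and three padded zeros. A plain-number sketch (illustrative, not library code):

```typescript
// Top-left corner window with padding=1, kernel=2: one real value, three padded zeros.
const windowValues = [0, 0, 0, 4]; // three padded zeros + the real corner value 4.0
const realValues = [4];            // only the in-bounds value

// count_include_pad=true: divide by the full window size (4 elements)
const includePad = windowValues.reduce((a, b) => a + b, 0) / windowValues.length; // 1.0
// count_include_pad=false: divide only by the number of real values
const excludePad = realValues.reduce((a, b) => a + b, 0) / realValues.length;     // 4.0
```

With count_include_pad=true the padded zeros drag edge averages toward zero (darkened borders); with false, edge magnitudes are preserved at the cost of a varying divisor near the boundary.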