torch.nn.functional.avg_pool2d
function avg_pool2d(input: Tensor, kernel_size: number | [number, number], options?: AvgPool2dFunctionalOptions): Tensor
function avg_pool2d(input: Tensor, kernel_size: number | [number, number], stride: number | [number, number] | null, padding: number | [number, number], ceil_mode: boolean, count_include_pad: boolean, divisor_override: number | undefined, options?: AvgPool2dFunctionalOptions): Tensor
2D Average Pooling: downsamples feature maps by averaging the values in each window.
Applies average pooling over 2D spatial dimensions (height, width) using sliding windows. Computes the mean value in each window, useful for:
- Smoother downsampling: preserves overall spatial information
- Global context: average pooling captures mean features across regions
- Dense predictions: less aggressive than max pooling
- Some modern architectures: alternative when edge-preservation not critical
- Anti-aliasing: averaging acts as a low-pass filter before downsampling
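The sliding-window mean can be illustrated with a plain TypeScript sketch, independent of this library (single channel, no padding, stride equal to kernel size for simplicity; `avgPool2dNaive` is a hypothetical helper, not part of the API):

```typescript
// Naive single-channel average pooling: slide a k x k window over the
// input and emit the mean of each window.
function avgPool2dNaive(x: number[][], k: number, stride: number): number[][] {
  const outH = Math.floor((x.length - k) / stride) + 1;
  const outW = Math.floor((x[0].length - k) / stride) + 1;
  const out: number[][] = [];
  for (let i = 0; i < outH; i++) {
    const row: number[] = [];
    for (let j = 0; j < outW; j++) {
      let sum = 0;
      for (let di = 0; di < k; di++)
        for (let dj = 0; dj < k; dj++)
          sum += x[i * stride + di][j * stride + dj];
      row.push(sum / (k * k)); // mean over the window
    }
    out.push(row);
  }
  return out;
}

// 2x2 pooling of a 4x4 input halves each spatial dimension.
const x = [
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  [9, 10, 11, 12],
  [13, 14, 15, 16],
];
const y = avgPool2dNaive(x, 2, 2);
// y is [[3.5, 5.5], [11.5, 13.5]]
```

Each output value is the mean of a 2x2 window, e.g. (1 + 2 + 5 + 6) / 4 = 3.5.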
When to use Average Pooling:
- When you need smoother, more global downsampling
- For dense prediction tasks (segmentation, depth estimation)
- Global average pooling for classification (channel-wise averaging)
- When max pooling is too aggressive
- Some modern architectures as alternative to max
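The parameter saving from global average pooling is easy to verify with arithmetic. Assuming a 512-channel 7x7 feature map and a 1000-class linear head (illustrative numbers, not from any specific model):

```typescript
// Classifier-head parameter count (weights only, bias ignored).
const channels = 512, h = 7, w = 7, classes = 1000;

// flatten + linear: every spatial position gets its own weight per class
const flattenParams = channels * h * w * classes; // 25,088,000

// global average pooling + linear: one weight per channel per class
const gapParams = channels * classes; // 512,000
```

Here global average pooling shrinks the head by a factor of h * w = 49.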
Trade-offs vs Max Pooling:
- Global context: average captures the mean statistics of a region; max keeps only the strongest activation
- Feature preservation: averaging retains weaker signals (less selective than max), but can also blur out strong, localized features
- Empirical: max pooling is usually the better default for classification and remains more common in modern CNNs
- Global average pooling: replaces flatten + linear heads and sharply reduces parameter count
- Padding sensitivity: count_include_pad changes how windows at image boundaries are averaged
- Gradients: the mean spreads gradient evenly across the window, while max routes it to a single element
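The output spatial size follows the standard pooling formula, sketched here as a small standalone helper (`poolOutSize` is hypothetical, not part of this API):

```typescript
// Output size per dimension: floor((size + 2*padding - kernel) / stride) + 1,
// using ceil instead of floor when ceil_mode is true.
function poolOutSize(size: number, kernel: number, stride: number,
                     padding: number, ceilMode: boolean): number {
  const v = (size + 2 * padding - kernel) / stride;
  return (ceilMode ? Math.ceil(v) : Math.floor(v)) + 1;
}

poolOutSize(32, 2, 2, 0, false); // 16: a 2x2/stride-2 pool halves 32
poolOutSize(10, 3, 2, 0, false); // 4: floor((10 - 3) / 2) + 1
poolOutSize(10, 3, 2, 0, true);  // 5: ceil_mode keeps the partial last window
```

ceil_mode=true adds a trailing window when the stride does not divide evenly, so the last columns/rows are not dropped.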
Parameters
input: Tensor - 4D input tensor [batch, channels, height, width]
kernel_size: number | [number, number] - Size of the pooling window (scalar or [height, width])
options?: AvgPool2dFunctionalOptions - Optional configuration object
Returns
Tensor - Pooled output tensor [batch, channels, out_height, out_width]
Examples
// Standard average pooling for downsampling
const x = torch.randn([batch_size, 64, 32, 32]);
const pooled = torch.nn.functional.avg_pool2d(x, 2); // 2x2 pooling
// Output: [batch_size, 64, 16, 16] - spatial dims halved, values averaged

// Global average pooling for classification
const features = torch.randn([batch_size, 512, 7, 7]); // After conv layers
const global_avg = torch.nn.functional.avg_pool2d(features, [7, 7]);
// Output: [batch_size, 512, 1, 1] - one average per channel
const flattened = global_avg.reshape([batch_size, 512]);

// Exclude padding from average
const x = torch.randn([1, 3, 10, 10]);
const pooled = torch.nn.functional.avg_pool2d(x, 3, 1, 1, false, false);
// count_include_pad=false (the sixth positional argument, after ceil_mode):
// border windows average only the non-padded elements
See Also
- PyTorch torch.nn.functional.avg_pool2d
- max_pool2d - Max pooling alternative (more selective)
- adaptive_avg_pool2d - Adaptive average pooling to fixed output size
- avg_pool1d - 1D variant for sequences
- avg_pool3d - 3D variant for volumetric data