torch.nn.functional.avg_pool3d
function avg_pool3d(input: Tensor, kernel_size: number | [number, number, number], options?: AvgPool3dFunctionalOptions): Tensor
function avg_pool3d(input: Tensor, kernel_size: number | [number, number, number], stride: number | [number, number, number] | null, padding: number | [number, number, number], ceil_mode: boolean, count_include_pad: boolean, divisor_override: number | undefined, options?: AvgPool3dFunctionalOptions): Tensor

3D Average Pooling: downsamples volumetric data by averaging values.
Applies average pooling over 3D spatial dimensions (depth, height, width) using sliding windows. Computes the mean value in each window, useful for:
- Medical imaging: smoothing CT/MRI scans while reducing resolution
- Video processing: temporal-spatial averaging for feature aggregation
- 3D feature aggregation: combining neighboring activations in volumetric networks
- Noise reduction: averaging reduces noise more effectively than max pooling
- Global feature extraction: combining 3D features before classification
- Smoothing volumetric data: low-pass filtering effect in 3D space
Unlike max pooling, which preserves peaks, average pooling smooths volumetric data.
Operates on 5D inputs: (batch, channels, depth, height, width). The count_include_pad
parameter controls whether padding is counted in the averaging.
- Smoothing effect: Average pooling acts like low-pass filtering in 3D
- count_include_pad impact: Affects boundary values near padding
- Gradient distribution: Gradients spread equally to all elements in window
- Noise reduction: Better than max for noise filtering applications
- Signal preservation: Average better preserves overall signal magnitude than max
- Boundary handling: With count_include_pad=true, padded regions reduce averages
- Computational cost: 3D averaging is expensive for large volumes
- Information loss: Averaging may blur fine details in volumetric data
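To make the count_include_pad behavior concrete, here is a minimal pure-TypeScript reference (no torch dependency) for a single-channel volume with a cubic kernel. The function name and array-of-arrays layout are illustrative, not part of the library API:

```typescript
// Reference 3D average pooling on a number[][][] volume (depth, height, width).
// Demonstrates how count_include_pad changes boundary averages with zero padding.
function avgPool3dRef(
  x: number[][][],
  k: number,          // cubic kernel size
  stride: number,
  pad: number,        // symmetric zero padding on every side
  countIncludePad: boolean
): number[][][] {
  const [D, H, W] = [x.length, x[0].length, x[0][0].length];
  const od = Math.floor((D + 2 * pad - k) / stride) + 1;
  const oh = Math.floor((H + 2 * pad - k) / stride) + 1;
  const ow = Math.floor((W + 2 * pad - k) / stride) + 1;
  const out: number[][][] = [];
  for (let zd = 0; zd < od; zd++) {
    const plane: number[][] = [];
    for (let zh = 0; zh < oh; zh++) {
      const row: number[] = [];
      for (let zw = 0; zw < ow; zw++) {
        let sum = 0;
        let count = 0;
        for (let dd = 0; dd < k; dd++)
          for (let hh = 0; hh < k; hh++)
            for (let ww = 0; ww < k; ww++) {
              const d = zd * stride - pad + dd;
              const h = zh * stride - pad + hh;
              const w = zw * stride - pad + ww;
              if (d >= 0 && d < D && h >= 0 && h < H && w >= 0 && w < W) {
                sum += x[d][h][w];
                count++;
              }
            }
        // count_include_pad=true divides by the full kernel volume (k^3),
        // so zero padding pulls boundary averages down; false divides by
        // the number of real (non-padded) elements only.
        row.push(sum / (countIncludePad ? k * k * k : count));
      }
      plane.push(row);
    }
    out.push(plane);
  }
  return out;
}

// A 2x2x2 volume of ones with kernel 2, stride 2, padding 1: each output
// window covers exactly one real voxel and seven padded slots.
const ones: number[][][] = [
  [[1, 1], [1, 1]],
  [[1, 1], [1, 1]],
];
const withPad = avgPool3dRef(ones, 2, 2, 1, true);  // corner value: 1/8 = 0.125
const noPad = avgPool3dRef(ones, 2, 2, 1, false);   // corner value: 1/1 = 1
console.log(withPad[0][0][0], noPad[0][0][0]);
```

The eightfold difference at the corners shows why count_include_pad matters whenever padding is nonzero: boundary statistics can be substantially biased toward zero.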
Parameters
- input: Tensor - 5D input tensor of shape (batch, channels, depth, height, width)
- kernel_size: number | [number, number, number] - Size of pooling window: single value or [depth, height, width]
- options: AvgPool3dFunctionalOptions - optional
Returns
Tensor - Tensor with shape (batch, channels, out_depth, out_height, out_width) where:
  out_depth  = floor((depth  + 2*pad_d - kernel_d) / stride_d) + 1
  out_height = floor((height + 2*pad_h - kernel_h) / stride_h) + 1
  out_width  = floor((width  + 2*pad_w - kernel_w) / stride_w) + 1
Examples
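The return-shape formula can be checked with a small standalone helper (the function name is illustrative, not part of the library API):

```typescript
// Compute avg_pool3d output spatial dimensions from the floor formula.
function avgPool3dOutShape(
  [depth, height, width]: [number, number, number],
  kernel: [number, number, number],
  stride: [number, number, number],
  pad: [number, number, number] = [0, 0, 0]
): [number, number, number] {
  const dim = (size: number, k: number, s: number, p: number) =>
    Math.floor((size + 2 * p - k) / s) + 1;
  return [
    dim(depth, kernel[0], stride[0], pad[0]),
    dim(height, kernel[1], stride[1], pad[1]),
    dim(width, kernel[2], stride[2], pad[2]),
  ];
}

// 128x256x256 volume, kernel 2, stride 2, no padding:
console.log(avgPool3dOutShape([128, 256, 256], [2, 2, 2], [2, 2, 2])); // [64, 128, 128]
```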
// Medical imaging: smooth and downsample MRI volume
const mri = torch.randn(1, 32, 128, 256, 256); // 1 scan, 32 filters, 128x256x256 volume
const smoothed = torch.nn.functional.avg_pool3d(mri, 2); // 2x2x2 averaging
// Output: (1, 32, 64, 128, 128) - reduced noise and resolution
// Video feature aggregation: temporal and spatial averaging
const features = torch.randn(8, 256, 8, 14, 14); // 8 videos, 256 features, 8 frames, 14x14 spatial
const aggregated = torch.nn.functional.avg_pool3d(features, [2, 2, 2], [2, 2, 2]);
// Output: (8, 256, 4, 7, 7) - combined over 2-frame temporal windows
// Global average pooling to output: reduce to single feature vector
const final_features = torch.randn(16, 512, 4, 4, 4); // 16 batches, 512 channels, 4³ spatial
const global_avg = torch.nn.functional.avg_pool3d(final_features, [4, 4, 4]);
// Output: (16, 512, 1, 1, 1) - global average of each feature map
// Asymmetric pooling: preserve temporal info, reduce spatial
const temporal_data = torch.randn(4, 128, 16, 32, 32); // temporal depth=16
const spatial_only = torch.nn.functional.avg_pool3d(temporal_data, [1, 2, 2], [1, 2, 2]);
// Output: (4, 128, 16, 16, 16) - no temporal averaging, 2x spatial reduction
See Also
- PyTorch torch.nn.functional.avg_pool3d
- max_pool3d - Max variant preserving peaks instead of smoothing
- avg_pool2d - 2D spatial average pooling
- adaptive_avg_pool3d - Adaptive averaging with automatic kernel/stride