torch.nn.functional.avg_pool1d
function avg_pool1d(input: Tensor, kernel_size: number | [number], options?: AvgPool1dFunctionalOptions): Tensor
function avg_pool1d(input: Tensor, kernel_size: number | [number], stride: number | null, padding: number, ceil_mode: boolean, count_include_pad: boolean, options?: AvgPool1dFunctionalOptions): Tensor
1D Average Pooling: downsamples sequences by averaging values.
Applies average pooling over the length dimension of a 1D sequence using sliding windows. Computes the mean value within each window, which is useful for:
- Smoothing temporal data: removes noise while preserving trends
- Dimensionality reduction: reduces sequence length with averaging
- Signal conditioning: low-pass filtering effect in time series
- Audio feature aggregation: averaging frame-level features to segment-level
- Text representation: pooling word embeddings for sequence representation
- Noise reduction: averaging helps dampen outliers compared to max pooling
Unlike max pooling which preserves peaks, average pooling smooths the signal by taking mean values. Operates on 3D inputs: (batch, channels, length).
- Smoothing effect: Average pooling acts like a low-pass filter (smooths sharp peaks)
- Handles noise: Better at noise reduction than max pooling
- Gradient flow: Gradients are distributed equally to all elements in window
- Preserves trends: Unlike max, average preserves overall magnitude and direction
- Comparison with max: Use average for noise reduction, max for feature saliency
- Information loss: Averaging discards fine-grained details
- Edge effects: Padding affects boundary behavior (often zero-padded)
- Reduces peak magnitude: the mean of a window never exceeds its maximum, so peaks shrink relative to max pooling
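The smoothing-versus-saliency contrast above can be seen with a minimal standalone sketch (plain TypeScript, no torch binding; `avgPool1d`/`maxPool1d` here are hypothetical helpers, not this library's API), using stride equal to kernel size and no padding:

```typescript
// Average pooling: mean of each window (smooths spikes).
function avgPool1d(x: number[], kernel: number, stride: number = kernel): number[] {
  const out: number[] = [];
  for (let i = 0; i + kernel <= x.length; i += stride) {
    const win = x.slice(i, i + kernel);
    out.push(win.reduce((a, b) => a + b, 0) / kernel);
  }
  return out;
}

// Max pooling: maximum of each window (preserves spikes).
function maxPool1d(x: number[], kernel: number, stride: number = kernel): number[] {
  const out: number[] = [];
  for (let i = 0; i + kernel <= x.length; i += stride) {
    out.push(Math.max(...x.slice(i, i + kernel)));
  }
  return out;
}

const signal = [1, 9, 2, 2, 3, 3];
avgPool1d(signal, 2); // [5, 2, 3] - the spike at index 1 is damped
maxPool1d(signal, 2); // [9, 2, 3] - the spike is kept
```

The same window sees an outlier of 9: averaging halves it, while max pooling passes it through unchanged.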
Parameters
input: Tensor - 3D input tensor of shape (batch, channels, length)
kernel_size: number | [number] - Size of the pooling window (single value for 1D)
options: AvgPool1dFunctionalOptions (optional)
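The interaction of padding with count_include_pad (see the second signature) can be sketched in plain TypeScript. This is an illustration under the assumption that the binding follows PyTorch semantics: zero-pad both ends, then divide each window sum either by the full kernel size (count_include_pad = true) or by the number of non-pad elements (false). `avgPool1dPadded` is a hypothetical helper, not part of this API:

```typescript
// Assumed PyTorch-style semantics: zero-pad, then choose the divisor.
function avgPool1dPadded(
  x: number[], kernel: number, stride: number,
  padding: number, countIncludePad: boolean
): number[] {
  const padded = [...Array(padding).fill(0), ...x, ...Array(padding).fill(0)];
  const out: number[] = [];
  for (let i = 0; i + kernel <= padded.length; i += stride) {
    const win = padded.slice(i, i + kernel);
    const sum = win.reduce((a, b) => a + b, 0);
    // count of window elements drawn from the real (unpadded) signal
    const real = Math.min(i + kernel, padding + x.length) - Math.max(i, padding);
    out.push(sum / (countIncludePad ? kernel : real));
  }
  return out;
}

avgPool1dPadded([2, 4, 6, 8], 2, 2, 1, true);  // [1, 5, 4] - pad zeros dilute boundary windows
avgPool1dPadded([2, 4, 6, 8], 2, 2, 1, false); // [2, 5, 8] - pad zeros excluded from the divisor
```

Excluding padding from the divisor keeps boundary averages comparable in magnitude to interior ones, which matters for the edge effects noted above.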
Returns
Tensor - Tensor with shape (batch, channels, out_length), where out_length = floor((length + 2*padding - kernel_size) / stride) + 1 (with ceil_mode = true, ceil replaces floor)
Examples
// Simple averaging: smooth and downsample sequence
const seq = torch.randn(8, 64, 100); // Batch of 8, 64 features, length 100
const smoothed = torch.nn.functional.avg_pool1d(seq, 2); // kernel=2, stride=2
// Output shape: (8, 64, 50) - averaged pairs
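The output length of 50 above follows the formula from Returns. A small standalone helper (hypothetical name `outLength`, not part of the API; assumes ceil_mode = false) makes the arithmetic explicit for each example in this section:

```typescript
// out_length = floor((length + 2*padding - kernel_size) / stride) + 1
function outLength(length: number, kernel: number, stride: number, padding: number = 0): number {
  return Math.floor((length + 2 * padding - kernel) / stride) + 1;
}

outLength(100, 2, 2);   // 50   - pairs averaged, length halved
outLength(300, 10, 10); // 30   - frame-to-phoneme example below
outLength(5000, 5, 1);  // 4996 - stride 1 barely shortens the signal
```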
// Speech processing: pool frame-level features to phoneme features
const frames = torch.randn(1, 80, 300); // 300 frames, 80 MFCC coefficients
const phonemes = torch.nn.functional.avg_pool1d(frames, 10, 10);
// Average 10 consecutive frames → single phoneme feature vector
// ECG signal denoising: smooth cardiac waveform while preserving trend
const ecg = torch.randn(32, 1, 5000); // 32 samples, 1 channel, 5000 points
const smoothed_ecg = torch.nn.functional.avg_pool1d(ecg, 5, 1);
// Small kernel=5, stride=1 gives smooth but detailed reduction
// Document representation: pool word embeddings
const embeddings = torch.randn(16, 300, 100); // 16 docs, 300-dim embeddings, 100 words
const doc_vectors = torch.nn.functional.avg_pool1d(embeddings, 5, 5);
// Group words into 5-word chunks, average each → document segments
See Also
- PyTorch torch.nn.functional.avg_pool1d
- max_pool1d - Alternative that preserves peaks instead of smoothing
- avg_pool2d - 2D variant for spatial data
- adaptive_avg_pool1d - Adaptive variant with automatic kernel/stride calculation