torch.nn.Dropout2d
new Dropout2d(options?: DropoutOptions)
- readonly p: number
Dropout2d: randomly zeros entire feature maps for 2D spatial data (images).
A specialized dropout for 2D spatial data (images, feature maps). Instead of dropping individual pixels, Dropout2d drops entire 2D feature maps (all spatial locations for a channel) together. This preserves spatial structure and feature coherence in convolutional networks. Essential for:
- 2D convolutional networks (CNNs for images)
- Preventing co-adaptation of feature maps
- Preserving spatial coherence within feature maps
- ImageNet-scale models (ResNet, VGG, etc.)
- Avoiding boundary artifacts from independent pixel dropout
Dropout2d treats each feature map (channel) as a unit: if a feature map is dropped, ALL spatial locations (all H × W pixels) for that channel are zeroed. This is appropriate for CNNs where feature maps represent learned spatial patterns.
When to use Dropout2d:
- 2D convolutional layers (drop entire feature maps)
- Image classification networks
- Object detection backbones
- Semantic segmentation feature extraction
- When spatial coherence within features is important
Why channel-wise dropout for images:
- Dropout: Drops each element x[n, c, h, w] independently
- Dropout2d: Drops x[n, c, :, :] together (the entire feature map for channel c)
- Result: Preserves spatial patterns within feature maps while regularizing feature selection
- Rationale: Conv filters create meaningful spatial patterns; dropping pixels breaks patterns
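The difference in masking granularity can be sketched in plain TypeScript (a standalone illustration, not part of the torch.nn API; `elementMask` and `channelMask` are hypothetical helpers): element-wise dropout draws one Bernoulli sample per pixel, while channel-wise dropout draws one sample per (sample, channel) pair and repeats it across all H × W positions.

```typescript
// Sketch: element-wise vs channel-wise dropout masks for an (N, C, H, W)
// tensor stored flat in row-major order. Hypothetical helpers for illustration.
function elementMask(n: number, c: number, h: number, w: number, p: number): number[] {
  // One independent Bernoulli(1 - p) draw per element (plain Dropout)
  return Array.from({ length: n * c * h * w }, () => (Math.random() < p ? 0 : 1));
}

function channelMask(n: number, c: number, h: number, w: number, p: number): number[] {
  const mask = new Array<number>(n * c * h * w);
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < c; j++) {
      const keep = Math.random() < p ? 0 : 1; // one draw per (sample, channel)
      for (let k = 0; k < h * w; k++) {
        mask[(i * c + j) * h * w + k] = keep; // repeated across all H × W positions
      }
    }
  }
  return mask;
}
```

Within each channel, `channelMask` is constant: a feature map is either entirely kept or entirely zeroed, which is exactly the granularity Dropout2d uses.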
Trade-offs:
- vs Dropout: Channel-wise preserves spatial structure, more effective for CNNs
- vs Dropout: Channel-wise stronger regularization (large spatial regions dropped)
- Spatial coherence: Assumes features are coherent across spatial dimensions
- Channel assumption: Treats channels as independent units
Input shape expectations:
- 4D tensor: (batch, channels, height, width) from Conv2d
- Standard format for convolutional networks
Dropout2d mechanics: For input shape (N, C, H, W) where C is channels, H/W are spatial dimensions:
- Create channel mask M ~ Bernoulli(1-p) of shape (N, C, 1, 1)
- Broadcast mask to (N, C, H, W): expand across spatial dimensions
- Apply: y = M ⊙ x / (1-p) (entire 2D feature maps zeroed or kept together)
- Channel-wise: Entire feature map (all H × W) dropped together
- Spatial coherence: Preserves spatial patterns within feature maps
- Conv2d designed: Works naturally with Conv2d output (N, C, H, W)
- Standard in practice: Nearly universal in modern CNNs (ImageNet, etc.)
- Feature selection: Acts as stochastic feature selection among learned filters
- Shape sensitive: Assumes input is (batch, channels, height, width)
- Spatial dimensions: Drops all pixels in feature maps, not partial regions
- Training/inference: Must call .train()/.eval() to control behavior
- Different masks per batch: Each sample in batch gets independent random mask
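The mechanics above can be sketched end-to-end in plain TypeScript (a hand-rolled illustration of training-mode behavior, not the library's actual implementation): draw one Bernoulli(1 - p) value per (sample, channel), broadcast it across H × W, and scale kept channels by 1/(1 - p).

```typescript
// Hand-rolled Dropout2d forward pass (training mode) for illustration only.
// Input is a flat Float64Array with row-major (N, C, H, W) layout.
function dropout2dForward(
  x: Float64Array,
  [n, c, h, w]: [number, number, number, number],
  p: number
): Float64Array {
  const out = new Float64Array(x.length);
  const scale = 1 / (1 - p); // inverted-dropout scaling keeps E[y] = x
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < c; j++) {
      // One Bernoulli(1 - p) draw per (sample, channel) — the (N, C, 1, 1) mask
      const keep = Math.random() >= p;
      const base = (i * c + j) * h * w;
      for (let k = 0; k < h * w; k++) {
        // Broadcast across all H × W positions: whole feature map kept or zeroed
        out[base + k] = keep ? x[base + k] * scale : 0;
      }
    }
  }
  return out;
}
```

Note that each (sample, channel) pair gets its own draw, so different samples in the batch drop different feature maps, matching the "different masks per batch" point above.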
Examples
// Dropout in 2D Conv network
const dropout = new torch.nn.Dropout2d(0.5);
const x = torch.randn([32, 64, 224, 224]); // Batch=32, channels=64, H=224, W=224
// During training: ~50% of the 64 feature maps are completely dropped
dropout.train();
const train_out = dropout.forward(x); // Shape [32, 64, 224, 224], some feature maps are zero
// During inference: no dropout
dropout.eval();
const test_out = dropout.forward(x); // No dropout, returns x

// CNN backbone with dropout
class ImageClassifier extends torch.nn.Module {
  conv1: torch.nn.Conv2d;
  dropout1: torch.nn.Dropout2d;
  conv2: torch.nn.Conv2d;
  dropout2: torch.nn.Dropout2d;
  pool: torch.nn.MaxPool2d;
  fc: torch.nn.Linear;

  constructor() {
    super();
    this.conv1 = new torch.nn.Conv2d(3, 64, 3, { padding: 1 });
    this.dropout1 = new torch.nn.Dropout2d(0.3); // 30% of feature maps dropped
    this.conv2 = new torch.nn.Conv2d(64, 128, 3, { padding: 1 });
    this.dropout2 = new torch.nn.Dropout2d(0.3);
    this.pool = new torch.nn.MaxPool2d(2);
    // Assumes 224×224 inputs: padding-1 convs preserve H/W, one 2× pool → 112×112
    this.fc = new torch.nn.Linear(128 * 112 * 112, 1000);
  }

  forward(x: torch.Tensor): torch.Tensor {
    x = torch.relu(this.conv1.forward(x));
    x = this.dropout1.forward(x); // Drop entire feature maps
    x = torch.relu(this.conv2.forward(x));
    x = this.dropout2.forward(x);
    x = this.pool.forward(x);
    x = x.view([x.shape[0], -1]);
    return this.fc.forward(x);
  }
}

// ResNet-style residual block with dropout
class ResidualBlock extends torch.nn.Module {
  conv1: torch.nn.Conv2d;
  dropout1: torch.nn.Dropout2d;
  conv2: torch.nn.Conv2d;
  dropout2: torch.nn.Dropout2d;

  constructor(channels: number) {
    super();
    this.conv1 = new torch.nn.Conv2d(channels, channels, 3, { padding: 1 });
    this.dropout1 = new torch.nn.Dropout2d(0.2);
    this.conv2 = new torch.nn.Conv2d(channels, channels, 3, { padding: 1 });
    this.dropout2 = new torch.nn.Dropout2d(0.2);
  }

  forward(x: torch.Tensor): torch.Tensor {
    const residual = x;
    let out = torch.relu(this.conv1.forward(x));
    out = this.dropout1.forward(out);
    out = this.conv2.forward(out);
    out = this.dropout2.forward(out);
    return torch.relu(out.add(residual)); // Residual connection
  }
}