torch.nn.Dropout1d
new Dropout1d(options?: DropoutOptions)
- readonly p: number
Dropout1d: randomly zeros entire channels for 1D sequences/features.
A specialized dropout for 1D sequential data (sequences, time series, features). Instead of dropping individual elements independently (like Dropout), Dropout1d drops entire channels (all timesteps/positions for a channel) together. This preserves correlations along the sequence within each channel. Useful for:
- Recurrent networks (preserves temporal structure)
- 1D convolutional networks (preserves feature channel coherence)
- Time series with temporal dependencies
- Preventing co-adaptation within channels
- Maintaining feature correlations along the sequence
Dropout1d treats each channel as a unit: if a channel is dropped, ALL values across that channel (all time steps) are zeroed. This is more appropriate for sequential data where correlations exist across time within each feature channel.
When to use Dropout1d:
- 1D Conv layers (drop entire feature maps)
- RNN hidden states (drop entire features across sequence)
- Time series predictions (preserve temporal structure)
- Feature extraction from sequences
- When features have meaning across time/sequence dimensions
Why channel-wise dropout:
- Dropout: Drops x[c, t] independently for each channel c and position t
- Dropout1d: Drops x[c, :] together (entire channel c across all positions)
- Result: Preserves within-channel temporal correlations while regularizing feature selection
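The contrast can be sketched in plain TypeScript on a toy (channels × length) matrix; this is an illustrative approximation using nested arrays, not the library's implementation:

```typescript
// Element-wise vs channel-wise dropout masks on a (C, L) matrix.
const p = 0.5;                       // drop probability
const C = 4, L = 6;
const x: number[][] = Array.from({ length: C }, () => Array(L).fill(1));

// Dropout: an independent Bernoulli(1 - p) draw per element x[c][t]
const elementWise = x.map(row =>
  row.map(v => (Math.random() >= p ? v / (1 - p) : 0))
);

// Dropout1d: one Bernoulli(1 - p) draw per channel c, shared by every position t
const channelWise = x.map(row => {
  const keep = Math.random() >= p;   // single draw for the whole channel
  return row.map(v => (keep ? v / (1 - p) : 0));
});

// Every row of channelWise is either all zeros or all 1/(1 - p);
// rows of elementWise typically mix zeros and scaled values.
```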
Trade-offs:
- vs Dropout: Channel-wise is more structured, preserves sequence correlations
- vs Dropout: Channel-wise is stronger regularization (more activations zeroed together)
- Channel structure: Assumes features have meaningful patterns within channels
- Sequence preservation: Keeps temporal structure intact
Input shape expectations:
- 3D tensor: (batch, channels, length), the layout produced by Conv1d
- The dropped dimension is always the channel dimension (dim 1); data stored as (batch, length, features) must be transposed to (batch, features, length) before applying Dropout1d
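Moving the feature axis into position 1 can be sketched with plain nested arrays standing in for tensors (an illustration of the layout change, not the library's transpose API):

```typescript
// (batch, length, features) -> (batch, features, length)
type Tensor3D = number[][][];

function toChannelsFirst(x: Tensor3D): Tensor3D {
  return x.map(sample =>
    // For each feature index f, gather that feature across all timesteps.
    sample[0].map((_, f) => sample.map(step => step[f]))
  );
}

const seq: Tensor3D = [[[1, 2], [3, 4], [5, 6]]]; // batch=1, length=3, features=2
const cf = toChannelsFirst(seq);                  // batch=1, channels=2, length=3
// cf[0] is [[1, 3, 5], [2, 4, 6]]
```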
Dropout1d mechanics: For input shape (N, C, L) where C is channels and L is sequence length:
- Create channel mask M ~ Bernoulli(1-p) of shape (N, C, 1)
- Broadcast mask to (N, C, L): expand single value across sequence
- Apply: y = M ⊙ x / (1-p) (entire channels zeroed or kept together)
- Channel-wise: Entire channel dropped together (not element-wise like Dropout)
- Structured sparsity: Creates block patterns, not random patterns
- Temporal preservation: Keeps patterns along sequence dimension intact
- Feature selection: Acts as stochastic feature selection at channel level
- Conv1d compatible: Designed for Conv1d which operates on channels
- Shape sensitive: Assumes input is (batch, channels, length); the channel dimension (dim 1) is always the one dropped
- Stronger regularization: Channel dropout stronger than element-wise
- Training/inference: Must call .train()/.eval() to control behavior
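The mechanics above can be sketched directly; the following is a plain-TypeScript approximation with nested arrays as tensors, not the library's actual implementation:

```typescript
// Dropout1d forward for input of shape (N, C, L): one Bernoulli(1 - p) draw
// per (sample, channel), broadcast along L, with 1/(1 - p) rescaling so
// expected activations match eval-mode behavior.
type Tensor3D = number[][][];

function dropout1d(x: Tensor3D, p: number, training = true): Tensor3D {
  if (!training || p === 0) return x;          // eval mode: identity
  const scale = 1 / (1 - p);
  return x.map(sample =>                       // over N
    sample.map(channel => {                    // over C
      const keep = Math.random() >= p;         // mask entry of shape (N, C, 1)
      return channel.map(v => (keep ? v * scale : 0)); // broadcast along L
    })
  );
}

const x: Tensor3D = [[[1, 1, 1], [1, 1, 1]]]; // N=1, C=2, L=3
const y = dropout1d(x, 0.5);
// Each channel of y is either [0, 0, 0] or [2, 2, 2]: all-or-nothing per channel.
```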
Examples
// Dropout in 1D Conv network
const dropout = new torch.nn.Dropout1d(0.5);
const x = torch.randn([32, 64, 128]); // Batch=32, channels=64, length=128
// During training: ~50% of the 64 channels are completely dropped
dropout.train();
const train_out = dropout.forward(x); // Shape [32, 64, 128], some channels are zero
// During inference: no dropout
dropout.eval();
const test_out = dropout.forward(x); // No dropout, returns x unchanged
// 1D conv network with channel dropout
class ConvNetWithDropout extends torch.nn.Module {
conv1: torch.nn.Conv1d;
dropout1: torch.nn.Dropout1d;
conv2: torch.nn.Conv1d;
dropout2: torch.nn.Dropout1d;
constructor() {
super();
this.conv1 = new torch.nn.Conv1d(1, 64, 3, { padding: 1 });
this.dropout1 = new torch.nn.Dropout1d(0.5);
this.conv2 = new torch.nn.Conv1d(64, 32, 3, { padding: 1 });
this.dropout2 = new torch.nn.Dropout1d(0.5);
}
forward(x: torch.Tensor): torch.Tensor {
x = torch.relu(this.conv1.forward(x));
x = this.dropout1.forward(x); // Drop entire feature maps
x = torch.relu(this.conv2.forward(x));
x = this.dropout2.forward(x);
return x;
}
}
// Time series with channel dropout
const ts_dropout = new torch.nn.Dropout1d(0.3); // 30% channel dropout
const time_series = torch.randn([16, 50, 200]); // Batch=16, channels (features)=50, length=200
ts_dropout.train();
const regularized = ts_dropout.forward(time_series);
// Preserves temporal structure within channels, drops entire feature channels