torch.nn.Dropout1d
new Dropout1d(options?: DropoutOptions)
- readonly p: number
Dropout1d: randomly zeros entire channels for 1D sequences/features.
A specialized dropout for 1D sequential data (sequences, time series, features). Instead of dropping individual elements independently (like Dropout), Dropout1d drops entire channels (all timesteps/positions for a channel) together. This preserves correlations along the sequence within each channel. Useful for:
- Recurrent networks (preserves temporal structure)
- 1D convolutional networks (preserves feature channel coherence)
- Time series with temporal dependencies
- Preventing co-adaptation within channels
- Maintaining feature correlations along the sequence
Dropout1d treats each channel as a unit: if a channel is dropped, ALL values across that channel (all time steps) are zeroed. This is more appropriate for sequential data where correlations exist across time within each feature channel.
When to use Dropout1d:
- 1D Conv layers (drop entire feature maps)
- RNN hidden states (drop entire features across sequence)
- Time series predictions (preserve temporal structure)
- Feature extraction from sequences
- When features have meaning across time/sequence dimensions
Why channel-wise dropout:
- Dropout: Drops x[c, t] independently for each channel c and position t
- Dropout1d: Drops x[c, :] together (entire channel c across all positions)
- Result: Preserves within-channel temporal correlations while regularizing feature selection
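The contrast can be sketched in plain TypeScript on a toy (channels × length) matrix; this is an illustrative approximation using nested arrays, not the library's implementation:

```typescript
// Element-wise vs channel-wise dropout masks on a (C, L) matrix.
const p = 0.5;                       // drop probability
const C = 4, L = 6;
const x: number[][] = Array.from({ length: C }, () => Array(L).fill(1));

// Dropout: an independent Bernoulli(1 - p) draw per element x[c][t]
const elementWise = x.map(row =>
  row.map(v => (Math.random() >= p ? v / (1 - p) : 0))
);

// Dropout1d: one Bernoulli(1 - p) draw per channel c, shared by every position t
const channelWise = x.map(row => {
  const keep = Math.random() >= p;   // single draw for the whole channel
  return row.map(v => (keep ? v / (1 - p) : 0));
});

// Every row of channelWise is either all zeros or all 1/(1 - p);
// rows of elementWise typically mix zeros and scaled values.
```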
Trade-offs:
- vs Dropout: Channel-wise is more structured, preserves sequence correlations
- vs Dropout: Channel-wise is stronger regularization (more activations zeroed together)
- Channel structure: Assumes features have meaningful patterns within channels
- Sequence preservation: Keeps temporal structure intact
Input shape expectations:
- 3D tensor: (batch, channels, length), the layout produced by Conv1d
- The dropped dimension is always the channel dimension (dim 1); data stored as (batch, length, features) must be transposed to (batch, features, length) before applying Dropout1d
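Moving the feature axis into position 1 can be sketched with plain nested arrays standing in for tensors (an illustration of the layout change, not the library's transpose API):

```typescript
// (batch, length, features) -> (batch, features, length)
type Tensor3D = number[][][];

function toChannelsFirst(x: Tensor3D): Tensor3D {
  return x.map(sample =>
    // For each feature index f, gather that feature across all timesteps.
    sample[0].map((_, f) => sample.map(step => step[f]))
  );
}

const seq: Tensor3D = [[[1, 2], [3, 4], [5, 6]]]; // batch=1, length=3, features=2
const cf = toChannelsFirst(seq);                  // batch=1, channels=2, length=3
// cf[0] is [[1, 3, 5], [2, 4, 6]]
```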
Dropout1d mechanics: For input shape (N, C, L) where C is channels and L is sequence length:
- Create channel mask M ~ Bernoulli(1-p) of shape (N, C, 1)
- Broadcast mask to (N, C, L): expand single value across sequence
- Apply: y = M ⊙ x / (1-p) (entire channels zeroed or kept together)
- Channel-wise: Entire channel dropped together (not element-wise like Dropout)
- Structured sparsity: Creates block patterns, not random patterns
- Temporal preservation: Keeps patterns along sequence dimension intact
- Feature selection: Acts as stochastic feature selection at channel level
- Conv1d compatible: Designed for Conv1d which operates on channels
- Shape sensitive: Assumes input is (batch, channels, length); the channel dimension (dim 1) is always the one dropped
- Stronger regularization: Channel dropout stronger than element-wise
- Training/inference: Must call .train()/.eval() to control behavior
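The mechanics above can be sketched directly; the following is a plain-TypeScript approximation with nested arrays as tensors, not the library's actual implementation:

```typescript
// Dropout1d forward for input of shape (N, C, L): one Bernoulli(1 - p) draw
// per (sample, channel), broadcast along L, with 1/(1 - p) rescaling so
// expected activations match eval-mode behavior.
type Tensor3D = number[][][];

function dropout1d(x: Tensor3D, p: number, training = true): Tensor3D {
  if (!training || p === 0) return x;          // eval mode: identity
  const scale = 1 / (1 - p);
  return x.map(sample =>                       // over N
    sample.map(channel => {                    // over C
      const keep = Math.random() >= p;         // mask entry of shape (N, C, 1)
      return channel.map(v => (keep ? v * scale : 0)); // broadcast along L
    })
  );
}

const x: Tensor3D = [[[1, 1, 1], [1, 1, 1]]]; // N=1, C=2, L=3
const y = dropout1d(x, 0.5);
// Each channel of y is either [0, 0, 0] or [2, 2, 2]: all-or-nothing per channel.
```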
Examples
// Dropout in 1D Conv network
const dropout = new torch.nn.Dropout1d(0.5);
const x = torch.randn([32, 64, 128]); // Batch=32, channels=64, length=128
// During training: ~50% of the 64 channels are completely dropped
dropout.train();
const train_out = dropout.forward(x); // Shape [32, 64, 128], some channels are zero
// During inference: no dropout
dropout.eval();
const test_out = dropout.forward(x); // No dropout, returns x unchanged
// 1D conv network with channel dropout
class ConvNetWithDropout extends torch.nn.Module {
conv1: torch.nn.Conv1d;
dropout1: torch.nn.Dropout1d;
conv2: torch.nn.Conv1d;
dropout2: torch.nn.Dropout1d;
constructor() {
super();
this.conv1 = new torch.nn.Conv1d(1, 64, 3, { padding: 1 });
this.dropout1 = new torch.nn.Dropout1d(0.5);
this.conv2 = new torch.nn.Conv1d(64, 32, 3, { padding: 1 });
this.dropout2 = new torch.nn.Dropout1d(0.5);
}
forward(x: torch.Tensor): torch.Tensor {
x = torch.relu(this.conv1.forward(x));
x = this.dropout1.forward(x); // Drop entire feature maps
x = torch.relu(this.conv2.forward(x));
x = this.dropout2.forward(x);
return x;
}
}
// Time series with channel dropout
const ts_dropout = new torch.nn.Dropout1d(0.3); // 30% channel dropout
const time_series = torch.randn([16, 50, 200]); // Batch=16, channels (features)=50, length=200
ts_dropout.train();
const regularized = ts_dropout.forward(time_series);
// Preserves temporal structure within channels, drops entire feature channels