torch.nn.InstanceNorm1d
class InstanceNorm1d extends _InstanceNorm
Instance Normalization for 1D inputs: normalizes each sample independently, per channel.
Normalizes across the spatial/temporal dimension within each sample, per channel, independent of the batch. Unlike BatchNorm, which computes statistics across the samples in a batch, InstanceNorm computes statistics for each sample on its own. Commonly used for:
- Style transfer networks (preserves content while removing style)
- Generative models (GANs, VAEs with instance-level normalization)
- Domain adaptation (DANN, ADDA)
- When batch statistics are unreliable or unwanted (small batches, single-sample inference)
- Online learning and streaming scenarios
When to use InstanceNorm1d:
- Style transfer where instance-specific statistics should be normalized
- Generative models sensitive to instance-level normalization
- Small batch sizes (batch size 1-4) where BatchNorm struggles
- Single-sample inference where batch statistics don't apply
- Online/streaming processing without batch accumulation
Difference from BatchNorm:
- BatchNorm: normalizes across batch dimension (uses batch statistics)
- InstanceNorm: normalizes within each sample independently (uses per-sample statistics)
- Result: No dependency on batch composition or batch size
Key properties:
- Per-sample normalization: each sample's channels are normalized independently
- Batch-size invariant: a sample's output does not depend on the rest of the batch
- Small-batch friendly: works with batch size 1 (unlike BatchNorm, which needs larger batches for stable statistics)
- Default no affine: affine=false is the default, unlike BatchNorm
- No running stats: by default does not track running mean/variance across batches
- Single-sample inference: well suited to online learning and streaming
Trade-offs:
- Loses batch information: does not use batch statistics, which may discard useful signal for some tasks
- Less stabilizing: does not provide the same training stabilization as BatchNorm
- May hurt accuracy: for some tasks BatchNorm outperforms InstanceNorm
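The per-sample, per-channel behavior described above can be sketched in plain JavaScript. This is a minimal illustration with nested arrays standing in for tensors, not the library implementation; the eps value mirrors the usual 1e-5 default:

```javascript
// Per-sample, per-channel normalization: each channel of each sample is
// normalized using only that channel's own mean and variance.
function instanceNorm1d(x, eps = 1e-5) {
  // x: [batch][channels][length]
  return x.map(sample =>
    sample.map(channel => {
      const n = channel.length;
      const mean = channel.reduce((acc, v) => acc + v, 0) / n;
      const variance = channel.reduce((acc, v) => acc + (v - mean) ** 2, 0) / n;
      const std = Math.sqrt(variance + eps);
      return channel.map(v => (v - mean) / std);
    })
  );
}

// Batch-size invariance: a sample's output depends only on that sample.
const a = [[1, 2, 3, 4]]; // one channel, length 4
const alone = instanceNorm1d([a])[0];
const inBatch = instanceNorm1d([a, [[100, 200, 300, 400]]])[0];
// alone and inBatch are element-wise identical
```

Because the statistics are computed within each sample, adding or removing other samples from the batch leaves the result unchanged, which is exactly why batch size 1 works.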
Examples
// Style transfer with instance normalization
const norm = new torch.nn.InstanceNorm1d(64); // 64 channels
const x = torch.randn([32, 64, 100]); // [batch, channels, time]
const y = norm.forward(x); // Each sample normalized independently

// Batch-size independent training (small batch or single sample)
const norm = new torch.nn.InstanceNorm1d(128);
// Works with any batch size including 1
const x_batch_1 = torch.randn([1, 128, 50]);
const y1 = norm.forward(x_batch_1); // Single sample works perfectly
const x_batch_32 = torch.randn([32, 128, 50]);
const y32 = norm.forward(x_batch_32); // Batch of 32 is normalized the same way, per sample

// With learnable affine transform
const norm = new torch.nn.InstanceNorm1d(256, 1e-5, 0.1, true); // numFeatures, eps, momentum, affine=true
const x = torch.randn([16, 256, 100]);
const y = norm.forward(x); // Normalized + affine transform
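When affine=true, the module also learns a per-channel scale (weight) and shift (bias) applied after normalization. A minimal sketch of that final step, using hypothetical plain-array values rather than the module's real parameters:

```javascript
// The affine step: y = weight[c] * normalized + bias[c], per channel c.
// In the module, weight is initialized to 1 and bias to 0, so training
// starts out equivalent to plain normalization.
function applyAffine(normalized, weight, bias) {
  // normalized: [channels][length]; weight, bias: [channels]
  return normalized.map((channel, c) =>
    channel.map(v => weight[c] * v + bias[c])
  );
}

const normalized = [[-1.0, 0.0, 1.0], [0.5, -0.5, 0.0]]; // 2 channels
const weight = [2.0, 1.0]; // learnable per-channel scale
const bias = [0.5, 0.0];   // learnable per-channel shift
const out = applyAffine(normalized, weight, bias);
// out[0] = [-1.5, 0.5, 2.5]; out[1] unchanged (scale 1, shift 0)
```

The affine parameters let the network undo or rescale the normalization per channel if that helps the task, at the cost of 2 * numFeatures extra parameters.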