torch.nn.InstanceNorm1d
class InstanceNorm1d extends _InstanceNorm
Instance Normalization for 1D inputs: normalizes each sample independently, per channel.
Normalizes across the spatial/temporal dimension within each sample, per channel, independent of the batch. Unlike BatchNorm, which computes statistics across the samples in a batch, InstanceNorm computes statistics for each sample on its own. Commonly used for:
- Style transfer networks (preserves content while removing style)
- Generative models (GANs, VAEs with instance-level normalization)
- Domain adaptation (DANN, ADDA)
- When batch statistics are unreliable or unwanted (small batches, single-sample inference)
- Online learning and streaming scenarios
When to use InstanceNorm1d:
- Style transfer where instance-specific statistics should be normalized
- Generative models sensitive to instance-level normalization
- Small batch sizes (batch size 1-4) where BatchNorm struggles
- Single-sample inference where batch statistics don't apply
- Online/streaming processing without batch accumulation
Difference from BatchNorm:
- BatchNorm: normalizes across batch dimension (uses batch statistics)
- InstanceNorm: normalizes within each sample independently (uses per-sample statistics)
- Result: No dependency on batch composition or batch size
Key properties:
- Per-sample normalization: each sample's channels are normalized independently
- Batch-size invariant: a sample's output does not depend on the rest of the batch
- Small-batch friendly: works with batch size 1 (unlike BatchNorm, which needs larger batches for stable statistics)
- Default no affine: affine=false is the default, unlike BatchNorm
- No running stats: by default does not track running mean/variance across batches
- Single-sample inference: well suited to online learning and streaming
Trade-offs:
- Loses batch information: does not use batch statistics, which may discard useful signal for some tasks
- Less stabilizing: does not provide the same training stabilization as BatchNorm
- May hurt accuracy: for some tasks BatchNorm outperforms InstanceNorm
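The per-sample, per-channel behavior described above can be sketched in plain JavaScript. This is a minimal illustration with nested arrays standing in for tensors, not the library implementation; the eps value mirrors the usual 1e-5 default:

```javascript
// Per-sample, per-channel normalization: each channel of each sample is
// normalized using only that channel's own mean and variance.
function instanceNorm1d(x, eps = 1e-5) {
  // x: [batch][channels][length]
  return x.map(sample =>
    sample.map(channel => {
      const n = channel.length;
      const mean = channel.reduce((acc, v) => acc + v, 0) / n;
      const variance = channel.reduce((acc, v) => acc + (v - mean) ** 2, 0) / n;
      const std = Math.sqrt(variance + eps);
      return channel.map(v => (v - mean) / std);
    })
  );
}

// Batch-size invariance: a sample's output depends only on that sample.
const a = [[1, 2, 3, 4]]; // one channel, length 4
const alone = instanceNorm1d([a])[0];
const inBatch = instanceNorm1d([a, [[100, 200, 300, 400]]])[0];
// alone and inBatch are element-wise identical
```

Because the statistics are computed within each sample, adding or removing other samples from the batch leaves the result unchanged, which is exactly why batch size 1 works.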
Examples
// Style transfer with instance normalization
const norm = new torch.nn.InstanceNorm1d(64); // 64 channels
const x = torch.randn([32, 64, 100]); // [batch, channels, time]
const y = norm.forward(x); // Each sample normalized independently

// Batch-size independent training (small batch or single sample)
const norm = new torch.nn.InstanceNorm1d(128);
// Works with any batch size including 1
const x_batch_1 = torch.randn([1, 128, 50]);
const y1 = norm.forward(x_batch_1); // Single sample works perfectly
const x_batch_32 = torch.randn([32, 128, 50]);
const y32 = norm.forward(x_batch_32); // Batch of 32 is normalized the same way, per sample

// With learnable affine transform
const norm = new torch.nn.InstanceNorm1d(256, 1e-5, 0.1, true); // numFeatures, eps, momentum, affine=true
const x = torch.randn([16, 256, 100]);
const y = norm.forward(x); // Normalized + affine transform
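When affine=true, the module also learns a per-channel scale (weight) and shift (bias) applied after normalization. A minimal sketch of that final step, using hypothetical plain-array values rather than the module's real parameters:

```javascript
// The affine step: y = weight[c] * normalized + bias[c], per channel c.
// In the module, weight is initialized to 1 and bias to 0, so training
// starts out equivalent to plain normalization.
function applyAffine(normalized, weight, bias) {
  // normalized: [channels][length]; weight, bias: [channels]
  return normalized.map((channel, c) =>
    channel.map(v => weight[c] * v + bias[c])
  );
}

const normalized = [[-1.0, 0.0, 1.0], [0.5, -0.5, 0.0]]; // 2 channels
const weight = [2.0, 1.0]; // learnable per-channel scale
const bias = [0.5, 0.0];   // learnable per-channel shift
const out = applyAffine(normalized, weight, bias);
// out[0] = [-1.5, 0.5, 2.5]; out[1] unchanged (scale 1, shift 0)
```

The affine parameters let the network undo or rescale the normalization per channel if that helps the task, at the cost of 2 * numFeatures extra parameters.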