torch.nn.functional.mse_loss
function mse_loss(input: Tensor, target: Tensor): Tensor
function mse_loss(input: Tensor, target: Tensor, size_average: boolean | null, reduce: boolean | null, reduction: 'none' | 'mean' | 'sum', options: MseLossFunctionalOptions): Tensor

Mean Squared Error (MSE) Loss: standard regression loss function.
Computes the average squared difference between predictions and targets. The quadratic penalty weights large errors heavily, making MSE sensitive to outliers but giving smooth, well-behaved gradients for optimization. Commonly used for:
- Regression tasks (predicting continuous values: prices, temperatures, distances)
- Reconstruction tasks (autoencoders, denoising)
- Pixel-level predictions (depth, segmentation, image-to-image)
- Bounding box regression in object detection
- Time series forecasting
- A convenient default loss in many optimization problems, thanks to its smoothness and convexity
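As a concrete illustration of the arithmetic, here is a plain-TypeScript sketch operating on number[] rather than the library's Tensor type (mseLossSketch is a hypothetical helper for illustration, not part of the library):

```typescript
// Sketch of mse_loss on plain arrays; the real function operates on Tensors.
type Reduction = "none" | "mean" | "sum";

function mseLossSketch(
  input: number[],
  target: number[],
  reduction: Reduction = "mean"
): number[] | number {
  if (input.length !== target.length) {
    throw new Error("input and target must have the same length");
  }
  // Element-wise squared errors: (input[i] - target[i])^2
  const perElement = input.map((x, i) => (x - target[i]) ** 2);
  if (reduction === "none") return perElement;
  const sum = perElement.reduce((a, b) => a + b, 0);
  return reduction === "sum" ? sum : sum / perElement.length;
}

// Errors of 0.1, 0.2, 0.2, 0.1 give squared errors
// [0.01, 0.04, 0.04, 0.01] and a mean of 0.025
const loss = mseLossSketch([1.0, 2.0, 3.0, 4.0], [1.1, 2.2, 2.8, 4.1]);
```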
When to use MSE Loss:
- Standard choice for regression (default if unsure)
- When large errors should be heavily penalized
- When a Gaussian noise model for the targets is appropriate (minimizing MSE corresponds to maximum likelihood under Gaussian noise)
- For optimization convenience (smooth, well-behaved gradients)
- When outliers are rare and acceptable to penalize heavily
Trade-offs vs L1 Loss:
- Robustness: MSE penalizes outliers quadratically (sensitive), L1 linearly (robust)
- Smoothness: MSE smooth everywhere (better for optimization), L1 has kink at 0
- Gradient magnitude: MSE gradients grow with error (can explode), L1 constant
- Empirical: MSE usually better if outliers rare; L1/Huber better with outliers
- Computational: Both similar cost, MSE slightly faster
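The difference in outlier sensitivity is easy to see numerically; a minimal plain-TypeScript sketch (helper names are illustrative, not library functions):

```typescript
// Per-element penalty under MSE (quadratic) vs L1 (linear).
function squaredError(pred: number, target: number): number {
  return (pred - target) ** 2;
}
function absoluteError(pred: number, target: number): number {
  return Math.abs(pred - target);
}

// For a residual of 10, the MSE penalty is 10x the L1 penalty:
const msePenalty = squaredError(10, 0); // 100 (quadratic)
const l1Penalty = absoluteError(10, 0); // 10 (linear)
```

A single such outlier in a batch of 32 contributes 100/32 ≈ 3 to the mean MSE, easily dominating dozens of small residuals; under L1 its contribution is only 10/32 ≈ 0.3.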
Key characteristics:
- Most common: Default choice for regression tasks
- Smooth optimization: Well-behaved gradients help convergence
- Outlier sensitive: Quadratic penalty heavily weights large errors and may cause numerical issues (e.g. exploding gradients)
- Gaussian assumption: Corresponds to a Gaussian noise model on targets
- Scale dependent: Sensitive to the magnitude of values, so normalize/standardize targets to a common scale
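One common mitigation for the scale issue is to standardize targets before training; a minimal sketch (the standardize helper is hypothetical, not a library function):

```typescript
// Shift to zero mean and scale to unit variance, so MSE magnitudes
// are comparable regardless of the raw units (meters vs millimeters, etc.).
function standardize(values: number[]): number[] {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance =
    values.reduce((a, v) => a + (v - mean) ** 2, 0) / values.length;
  const std = Math.sqrt(variance) || 1; // guard against constant inputs
  return values.map((v) => (v - mean) / std);
}

// Targets on a large raw scale end up on a unit scale:
const scaled = standardize([1000, 2000, 3000]); // ≈ [-1.22, 0, 1.22]
```

Remember to apply the inverse transform to model outputs at inference time, and to compute the standardization statistics on the training set only.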
Parameters
- input: Tensor – Predicted values.
- target: Tensor – Ground-truth values, same shape as input.
- size_average: boolean | null – Deprecated; use reduction instead.
- reduce: boolean | null – Deprecated; use reduction instead.
- reduction: 'none' | 'mean' | 'sum' – How to reduce the per-element losses: no reduction, mean, or sum (default: 'mean').
- options: MseLossFunctionalOptions – Options-object form of the arguments above.
Returns
Tensor – Scalar loss value (or a per-element tensor if reduction='none')

Examples
// Simple regression
const predictions = torch.tensor([1.0, 2.0, 3.0, 4.0]);
const targets = torch.tensor([1.1, 2.2, 2.8, 4.1]);
const loss = torch.nn.functional.mse_loss(predictions, targets);
// loss = mean([0.01, 0.04, 0.04, 0.01]) = 0.025

// Neural network regression
const model = new torch.nn.Linear(10, 1);
const optimizer = new torch.optim.SGD(model.parameters(), 0.01);
for (let epoch = 0; epoch < 100; epoch++) {
  const x = torch.randn([32, 10]);
  const y = torch.randn([32, 1]);
  const pred = model.forward(x);
  const loss = torch.nn.functional.mse_loss(pred, y);
  optimizer.zero_grad(); // clear gradients accumulated in the previous step
  loss.backward();
  optimizer.step();
}

// Autoencoder reconstruction loss
const reconstructed = autoencoder.forward(x);
const reconstruction_loss = torch.nn.functional.mse_loss(reconstructed, x);

See Also
- PyTorch torch.nn.functional.mse_loss
- l1_loss - Robust alternative with linear penalty
- smooth_l1_loss - Hybrid of L1 and MSE (best of both worlds)
- huber_loss - Alias for SmoothL1Loss