torch.nn.HuberLossOptions

Huber loss: smooth hybrid of L1 and MSE loss.

Combines the robustness of L1 loss (less sensitive to outliers) with the smoothness of MSE loss near zero (the gradient shrinks with the error instead of flipping sign abruptly). Uses an L2 (squared) penalty for small errors and an L1 (absolute) penalty for large errors. Often a better default than pure L1 or MSE when the data contain outliers.

When to use Huber Loss:

Regression with noisy data or outliers (better than MSE, smoother than L1)
Bounding box regression in object detection (Fast R-CNN and Faster R-CNN use the closely related Smooth L1 loss)
Any regression where occasional large errors should be de-emphasized
When you need both numerical stability and robustness
Robust statistical estimation problems

Trade-offs:

vs MSE: More robust to outliers while maintaining smoothness
vs L1: Smoother gradients (better for optimization), still robust
Hyperparameter: Requires tuning delta (controls switching point between L2 and L1)
Delta tuning: Small delta → more like L1 (robust but sharp); large delta → more like MSE (smooth but sensitive)
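
The effect of delta is easiest to see in the gradient. The following is a minimal sketch in plain TypeScript (numbers, not tensors); `huberGrad` is a hypothetical helper for illustration, not part of torch.nn. It assumes the per-element definition used in this page, whose derivative with respect to the error is the error itself clipped to [-delta, delta].

```typescript
// Derivative of the per-element Huber loss with respect to the error e.
// Inside the quadratic zone the gradient is e; outside it is clipped to ±delta.
function huberGrad(e: number, delta: number = 1.0): number {
  return Math.max(-delta, Math.min(delta, e));
}

// The same outlier (e = 10) under three choices of delta.
const outlier = 10.0;
const gradients = [0.5, 1.0, 10.0].map((delta) => huberGrad(outlier, delta));
console.log(gradients); // [ 0.5, 1, 10 ]
```

With a small delta the outlier contributes a small, bounded gradient (L1-like); with delta = 10 the error is still in the quadratic zone, so the gradient equals the error, as it would for 0.5 * e².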

Algorithm: For each error e_i = predicted_i - target_i:

If |e_i| ≤ delta: loss_i = 0.5 * e_i² (quadratic, like MSE)
If |e_i| > delta: loss_i = delta * (|e_i| - 0.5 * delta) (linear, like L1)

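The function is continuous and has a continuous first derivative everywhere: at |e_i| = delta both branches equal 0.5 * delta² and both have slope ±delta. Only the second derivative jumps at ±delta.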
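
As a minimal sketch of the piecewise rule above, here is the per-element loss in plain TypeScript on numbers rather than tensors. `huberElement` is a hypothetical name used only for illustration; it is not part of the library.

```typescript
// Per-element Huber loss: quadratic for |e| <= delta, linear beyond it.
function huberElement(e: number, delta: number = 1.0): number {
  const a = Math.abs(e);
  return a <= delta ? 0.5 * e * e : delta * (a - 0.5 * delta);
}

// At |e| = delta both branches evaluate to 0.5 * delta^2, so there is no jump.
const delta = 1.0;
const quadraticAtBoundary = 0.5 * delta * delta;        // 0.5
const linearAtBoundary = delta * (delta - 0.5 * delta); // 0.5

console.log(quadraticAtBoundary === linearAtBoundary); // true
console.log(huberElement(0.2));  // quadratic zone, approximately 0.02
console.log(huberElement(10.0)); // linear zone: 1.0 * (10 - 0.5) = 9.5
```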

Definition

export interface HuberLossOptions {
  /** How to reduce loss across batch (default: 'mean') */
  reduction?: Reduction;
  /** Threshold for switching between L1 and L2 behavior (default: 1.0) */
  delta?: number;
}

reduction (Reduction, optional): How to reduce loss across batch (default: 'mean')
delta (number, optional): Threshold for switching between L1 and L2 behavior (default: 1.0)
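
To illustrate how the two options interact, the sketch below applies the documented defaults and a reduction to plain number arrays. Several things here are assumptions rather than the library's API: the `Reduction` type is assumed to be 'none' | 'mean' | 'sum' (mirroring PyTorch), the interface is a local copy of the one above, and `huberLossRef` is a hypothetical reference helper.

```typescript
// Assumption: the library's Reduction type is this union (as in PyTorch).
type Reduction = 'none' | 'mean' | 'sum';

// Local copy of the documented interface so the sketch is self-contained.
interface HuberLossOptions {
  reduction?: Reduction;
  delta?: number;
}

// Hypothetical reference helper (not the library implementation).
function huberLossRef(
  predicted: number[],
  target: number[],
  options: HuberLossOptions = {},
): number | number[] {
  // Apply the documented defaults when a field is omitted.
  const { reduction = 'mean', delta = 1.0 } = options;
  if (predicted.length !== target.length) throw new Error('length mismatch');

  const perElement = predicted.map((p, i) => {
    const a = Math.abs(p - target[i]);
    return a <= delta ? 0.5 * a * a : delta * (a - 0.5 * delta);
  });

  if (reduction === 'none') return perElement;
  const total = perElement.reduce((sum, v) => sum + v, 0);
  return reduction === 'sum' ? total : total / perElement.length;
}

const pred = [0.0, 2.0, 5.0];
const tgt = [0.0, 1.0, 1.0]; // errors: 0, 1, 4

console.log(huberLossRef(pred, tgt));                                    // mean of [0, 0.5, 3.5]
console.log(huberLossRef(pred, tgt, { reduction: 'sum' }));              // 4
console.log(huberLossRef(pred, tgt, { reduction: 'none', delta: 2.0 })); // [ 0, 0.5, 6 ]
```

Note that changing delta changes the per-element values themselves (the error of 4 costs 3.5 at delta = 1.0 and 6 at delta = 2.0), while reduction only changes how those values are combined.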

Examples

// Robust regression in object detection (bounding box regression)
const huber_loss = new torch.nn.HuberLoss('mean', 1.0);  // delta = 1.0

const predicted_boxes = torch.randn([32, 4]);  // 32 boxes with [x, y, w, h]
const target_boxes = torch.randn([32, 4]);

const loss = huber_loss.forward(predicted_boxes, target_boxes);
// Huber-style losses are common for box regression (tolerant of occasional bbox outliers)

// Comparing Huber vs L1 vs MSE on regression with outliers
const errors = torch.tensor([
  0.1,   // Small error
  0.2,   // Small error
  -0.15, // Small error
  10.0   // Outlier!
]);

const l1_loss = new torch.nn.L1Loss();    // default reduction assumed to be 'mean'
const mse_loss = new torch.nn.MSELoss();  // default reduction assumed to be 'mean'
const huber_loss = new torch.nn.HuberLoss('mean', 1.0);  // delta = 1.0, same reduction for a fair comparison

const l1 = l1_loss.forward(errors, torch.zeros_like(errors));
// L1: (0.1 + 0.2 + 0.15 + 10.0) / 4 = 2.6125 (treats large error linearly)

const mse = mse_loss.forward(errors, torch.zeros_like(errors));
// MSE: (0.01 + 0.04 + 0.0225 + 100.0) / 4 ≈ 25.0181 (outlier dominates the loss)

const huber = huber_loss.forward(errors, torch.zeros_like(errors));
// Huber: (0.005 + 0.02 + 0.01125 + 9.5) / 4 ≈ 2.3841 (quadratic on small errors, linear on the outlier)

// Tuning delta parameter for your problem
const predictions = torch.randn([100, 1]);
const targets = torch.randn([100, 1]);

// Conservative (delta=0.5): More robust to outliers
const conservative = new torch.nn.HuberLoss('mean', 0.5);

// Balanced (delta=1.0): Default, good compromise
const balanced = new torch.nn.HuberLoss('mean', 1.0);

// Sensitive (delta=10.0): Closer to MSE behavior
const sensitive = new torch.nn.HuberLoss('mean', 10.0);

// Choose delta based on expected error distribution in your task
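
To make the three settings above concrete, the sketch below evaluates a single large error under each delta using plain numbers. `huberElement` is a hypothetical helper written for this illustration, not a library function.

```typescript
// Per-element Huber loss on a plain number.
function huberElement(e: number, delta: number): number {
  const a = Math.abs(e);
  return a <= delta ? 0.5 * e * e : delta * (a - 0.5 * delta);
}

// One outlier with error 10 under the three deltas discussed above.
const outlierError = 10.0;
for (const delta of [0.5, 1.0, 10.0]) {
  console.log(`delta=${delta}: loss=${huberElement(outlierError, delta)}`);
}
// delta=0.5: loss=4.875
// delta=1: loss=9.5
// delta=10: loss=50
```

At delta = 10 the error still falls in the quadratic zone, so the outlier costs 0.5 * 10² = 50; at delta = 0.5 its cost drops to 4.875.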

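// Box regression loss for a detection model (schematic: constructor and layers omitted;
// the model is assumed to output one [x, y, w, h] box per image)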
class ObjectDetectionModel extends torch.nn.Module {
  rpn: torch.nn.Module;  // Region proposal network
  // ... other layers

  forward(x: torch.Tensor): torch.Tensor {
    const proposals = this.rpn.forward(x);
    return proposals;
  }
}

const model = new ObjectDetectionModel();
const huber = new torch.nn.HuberLoss('mean', 1.0);

const batch_images = torch.randn([32, 3, 224, 224]);
const predicted_boxes = model.forward(batch_images);
const target_boxes = torch.randn([32, 4]);

const loss = huber.forward(predicted_boxes, target_boxes);
// Huber loss handles occasional bbox regression outliers gracefully