torch.nn.functional.soft_margin_loss
function soft_margin_loss(input: Tensor, target: Tensor, options?: SoftMarginLossFunctionalOptions): Tensor

Soft margin loss for binary classification with logistic regression.
Computes the logistic loss (cross-entropy for ±1 targets) for binary classification. Given raw scores x and labels y ∈ {+1, -1}, minimizes log(1 + exp(-y·x)). Smooth approximation to hinge loss that's differentiable everywhere. Essential for:
- Binary classification with margin-like objectives (smoother than hard margin)
- Logistic regression generalized to ±1 labels (vs standard 0/1)
- Smooth alternatives to two-class support vector machine (SVM) hinge objectives
- Learning with confidence margins (soft instead of hard margin)
- Smooth ranking objectives (alternative to hard hinge loss)
Core idea: Minimize log(1 + exp(-y·x)), a smooth version of max(0, 1 - y·x).
- When y·x is large and positive: loss ≈ 0 (correct prediction with high confidence)
- When y·x is near 0: loss ≈ log(2) (uncertain prediction)
- When y·x is large and negative: loss ≈ |y·x| (misclassified, penalty grows linearly)
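The three regimes above can be checked numerically. This is a standalone TypeScript sketch of the per-sample formula, not a call into the library:

```typescript
// Per-sample soft margin loss: log(1 + exp(-margin)), where margin = y * x
function softMarginLoss(x: number, y: number): number {
  return Math.log(1 + Math.exp(-y * x));
}

console.log(softMarginLoss(10, +1)); // margin = +10: ≈ 0 (confident, correct)
console.log(softMarginLoss(0, +1));  // margin =  0: log(2) ≈ 0.693 (uncertain)
console.log(softMarginLoss(10, -1)); // margin = -10: ≈ 10 (confidently wrong, ≈ linear penalty)
```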
Why "soft"? The loss is smooth and differentiable everywhere, unlike the hard-margin hinge loss max(0, 1 - y·x). For large negative margins it grows linearly, matching the hinge loss asymptotically, but it has no kink at the margin boundary.
Connection to logistic regression: With 0/1 labels, binary cross-entropy on σ(x) is L = -[y·log σ(x) + (1-y)·log(1-σ(x))]. Under the label mapping y± = 2y - 1 this collapses to L = log(1 + exp(-y±·x)) with y± ∈ {+1, -1}. Soft margin loss is therefore the same logistic loss, just written in the ±1 label convention.
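The equivalence can be verified numerically. This plain TypeScript sketch (standalone math, not the library API) compares the two formulas under the 0/1 ↔ ±1 label mapping:

```typescript
// Soft margin loss with y ∈ {+1, -1}
const softMargin = (x: number, y: number): number => Math.log(1 + Math.exp(-y * x));

// Binary cross-entropy on sigmoid(x) with y01 ∈ {0, 1}
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));
const bceWithLogits = (x: number, y01: number): number =>
  -(y01 * Math.log(sigmoid(x)) + (1 - y01) * Math.log(1 - sigmoid(x)));

// Mapping y01 = (y + 1) / 2 makes the two losses identical
const x = 1.7;
console.log(softMargin(x, +1), bceWithLogits(x, 1)); // equal
console.log(softMargin(x, -1), bceWithLogits(x, 0)); // equal
```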
- Smooth approximation: Approximates hard-margin hinge loss with smooth differentiable function
- Logistic loss: Equivalent to binary cross-entropy with ±1 labels (vs 0/1)
- Always positive: Loss is strictly > 0 even for correct predictions; it approaches 0 only asymptotically as y·x → ∞, and equals log(2) ≈ 0.693 at y·x = 0
- Asymptotic: As y·x → -∞, loss behaves like |y·x| (linear, similar to hinge loss); as y·x → +∞, loss decays to 0 like exp(-y·x)
- Symmetric: Flipping a label from +1 to -1 is equivalent to negating the corresponding score x
- Label convention: Targets must be +1/-1, not 0/1; with 0/1 labels the loss is silently wrong (a 0 target makes y·x = 0, contributing a constant log(2) regardless of the score)
- Unbounded loss: For very incorrect predictions, loss grows without bound
- Scale sensitivity: Loss depends on absolute scale of logits; very large/small values matter
- Numerical stability: Very large negative y·x can cause exp overflow; usually handled internally
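The stability note above can be made concrete. A standard rewrite, log(1 + exp(-m)) = max(-m, 0) + log1p(exp(-|m|)), avoids the overflow; this standalone TypeScript sketch shows the failure mode and the fix:

```typescript
// Naive form overflows for large negative margins: exp(1000) → Infinity
const naive = (m: number): number => Math.log(1 + Math.exp(-m));

// Stable rewrite: log(1 + exp(-m)) = max(-m, 0) + log1p(exp(-|m|)).
// The exponent is always <= 0, so exp never overflows.
const stable = (m: number): number =>
  Math.max(-m, 0) + Math.log1p(Math.exp(-Math.abs(m)));

console.log(naive(-1000));  // Infinity (overflow)
console.log(stable(-1000)); // 1000
console.log(stable(2), naive(2)); // agree where both are representable
```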
Parameters
input Tensor - Raw classification scores (logits), shape [batch_size] or [...]. Typically the output of a linear classifier without sigmoid/tanh applied. Larger y·x → smaller loss (more confident correct prediction).
target Tensor - Binary classification labels, shape [...] matching input. Values must be +1 (positive class) or -1 (negative class).
options SoftMarginLossFunctionalOptions - optional.
Returns
Tensor - Loss tensor: shape [] (scalar) if reduction='mean' or 'sum', else shape [...] matching input.

Examples
// Binary classification: logits and ±1 labels
const batch_size = 32;
const logits = torch.randn([batch_size]); // Raw classification scores
const labels = torch.ones([batch_size]); // Binary labels: +1 or -1
labels.masked_fill_(torch.rand([batch_size]).lt(0.5), -1); // Random labels
const loss = torch.nn.functional.soft_margin_loss(logits, labels);

// Logistic regression for binary classification
const X = torch.randn([100, 20]); // Features [batch_size, feature_dim]
const W = torch.randn([20, 1]); // Weights [feature_dim, 1]
const y = torch.ones([100]); // Labels: +1 or -1
y.masked_fill_(torch.rand([100]).lt(0.3), -1);
const logits = X.matmul(W).squeeze(-1); // [100]
const loss = torch.nn.functional.soft_margin_loss(logits, y);

// Per-sample loss for custom weighting by difficulty
const logits = torch.randn([32]);
const labels = torch.ones([32]);
labels.masked_fill_(torch.rand([32]).lt(0.5), -1);
const per_sample_loss = torch.nn.functional.soft_margin_loss(logits, labels, { reduction: 'none' }); // [32]
const loss_weights = per_sample_loss.gt(Math.log(2)).float().mul(2).add(1); // Hard examples weighted higher
const weighted_loss = per_sample_loss.mul(loss_weights).mean();

See Also
- PyTorch torch.nn.functional.soft_margin_loss
- torch.nn.functional.binary_cross_entropy - Standard BCE loss (different target convention)
- torch.nn.functional.hinge_embedding_loss - Hard-margin variant
- torch.nn.functional.margin_ranking_loss - Margin loss for pairwise ranking
- torch.nn.functional.cross_entropy - Multi-class alternative