torch.special.xlog1py
function xlog1py<S1 extends Shape, S2 extends Shape>(input: Tensor<S1, 'float32'>, other: Tensor<S2, 'float32'>, _options?: SpecialBinaryOptions): Tensor<DynamicShape, 'float32'>

Computes x * log1p(y) with safe handling for x=0 and numerical edge cases.
The function computes x * log(1 + y), but treats x=0 specially: if x=0, it returns 0 regardless of y's value (even if y=NaN, or y=-1, where log1p diverges to -∞). This is crucial for probability computations where log-likelihood sums involve x*log(p)-style terms and zero probabilities should contribute exactly zero. Essential for:
- Probabilistic modeling: negative log-likelihood (NLL) when mixing zero/nonzero probabilities
- Information theory: entropy and KL divergence with zero probability handling
- Machine learning: classification loss functions with soft targets
- Statistical inference: log-likelihood with mixture models
- Numerical stability: log1p handles small |y| better than log(1+y)
- Boundary handling: prevents NaN propagation when x=0
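The x=0 rule is easiest to see in a scalar sketch. The `xlog1py` helper below is a hypothetical plain-TypeScript reference implementation for illustration, not the library function:

```typescript
// Scalar reference for x * log1p(y) with the x = 0 short-circuit.
// Checking x === 0 first means y = NaN or y = -1 never produces
// NaN or -Infinity in that case -- the term simply drops out.
function xlog1py(x: number, y: number): number {
  if (x === 0) return 0;     // zero weight: contributes nothing
  return x * Math.log1p(y);  // log1p is accurate for small |y|
}

xlog1py(0, NaN); // 0, not NaN
xlog1py(0, -1);  // 0, not 0 * -Infinity = NaN
xlog1py(2, 0);   // 2 * log1p(0) = 0
```

Without the guard, `0 * Math.log1p(NaN)` and `0 * Math.log1p(-1)` would both evaluate to NaN under IEEE 754 rules.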
Key Properties:
- Zero handling: when x=0, the result is 0 regardless of y (even if y is NaN), so zero-probability terms drop out instead of propagating NaN
- When y=0: result = x * log1p(0) = 0
- When y=-1: log1p(y) = -∞, so the result is 0 if x=0, -∞ if x>0, and +∞ if x<0
- When y < -1: log1p(y) is undefined over the reals, so the result is NaN (unless x=0)
- Numerically stable: log1p(y) computes log(1+y) accurately for very small |y|, where forming 1+y directly would round away the term
- Equivalent: xlog1py(x, y) = x * ln(1+y) everywhere except the x=0 special case
- Probability domain: y typically lies in (-1, ∞) for log1p validity
- Entropy-like: appears implicitly in entropy, KL divergence, and mutual information calculations
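The precision point can be demonstrated without the library, using plain `Math.log` versus `Math.log1p` (illustrative only):

```typescript
// For |y| below about 1e-16, the sum 1 + y rounds to exactly 1 in
// double precision, so log(1 + y) collapses to 0 and the term is lost.
const tiny = 1e-20;
const naive = Math.log(1 + tiny); // 1 + 1e-20 === 1, so this is 0
const stable = Math.log1p(tiny);  // ~1e-20, correct to machine precision

console.log(naive, stable);
```

This is exactly why the tensor-level implementation is documented as using log1p rather than log(1+y).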
Parameters
- input: Tensor<S1, 'float32'> – Input tensor x (coefficient, typically a probability or likelihood weight)
- other: Tensor<S2, 'float32'> – Input tensor y (argument, typically in (-1, ∞) for the log1p domain)
- _options: SpecialBinaryOptions (optional)
Returns
Tensor<DynamicShape, 'float32'> – Tensor with x * log1p(y) values, with x=0 → 0

Examples
// Safe probability weighting: x * log(1 + y)
const x = torch.tensor([0, 0.5, 1, 2]); // Weights
const y = torch.tensor([0, 0.1, 1, 10]); // Arguments
const result = torch.special.xlog1py(x, y); // Safe computation
// x=0 always gives 0, even if y has issues

// KL divergence computation: entropy-like terms
const p = torch.tensor([0.1, 0.2, 0, 0.5]); // True probabilities (some zero!)
const q = torch.tensor([0.15, 0.15, 0.1, 0.6]); // Predicted probabilities
// KL divergence: ∑ p * log(p/q) = ∑ p * (log(p) - log(q))
// Terms with p=0 naturally contribute 0 (not NaN!)
// p * log(p/q) = p * log1p(p/q - 1); where p=0, y=-1 but the x=0 rule gives 0
const divergence = torch.special.xlog1py(p, p.div(q) - 1).sum();

// Cross-entropy loss: handling zero probabilities
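As a sanity check, the same sum can be traced with scalar arithmetic (hypothetical plain-TypeScript helper; note that this p vector sums to 0.8 rather than 1, so the result here is an illustrative sum, not a true KL divergence):

```typescript
// KL-style sum: p_i * log(p_i / q_i) = p_i * log1p(p_i / q_i - 1).
// The x = 0 guard makes the p_i = 0 term vanish even though
// p/q - 1 = -1 there, where log1p alone would give -Infinity.
function xlog1py(x: number, y: number): number {
  return x === 0 ? 0 : x * Math.log1p(y);
}

const p = [0.1, 0.2, 0, 0.5];
const q = [0.15, 0.15, 0.1, 0.6];
const kl = p.reduce((acc, pi, i) => acc + xlog1py(pi, pi / q[i] - 1), 0);
// The i = 2 term is exactly 0; the whole sum stays finite.
```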
const true_labels = torch.tensor([0, 1, 0, 1]); // Binary labels
const pred_prob = torch.tensor([0.1, 0.95, 0.05, 0.92]); // Model predictions
// Cross-entropy: -∑ [y*log(p) + (1-y)*log(1-p)]
// When y=0: (1-0)*log(1-p) = log(1-p), computed safely
const part1 = torch.special.xlog1py(
  1 - true_labels,
  pred_prob.neg() // log1p(-p) = log(1 - p)
);

// Numerical stability demonstration
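A scalar version of the full binary cross-entropy makes the roles of the two terms explicit (illustrative plain TypeScript; the `xlogy`/`xlog1py` helpers are hypothetical stand-ins for the library functions):

```typescript
// BCE(y, p) = -[ y * log(p) + (1 - y) * log(1 - p) ]
// The second term is xlog1py(1 - y, -p), since log1p(-p) = log(1 - p);
// with hard labels, exactly one of the two terms is active per sample.
function xlog1py(x: number, y: number): number {
  return x === 0 ? 0 : x * Math.log1p(y);
}
function xlogy(x: number, y: number): number {
  return x === 0 ? 0 : x * Math.log(y);
}

function bce(labels: number[], probs: number[]): number {
  let loss = 0;
  for (let i = 0; i < labels.length; i++) {
    loss -= xlogy(labels[i], probs[i]) + xlog1py(1 - labels[i], -probs[i]);
  }
  return loss / labels.length;
}

bce([0, 1, 0, 1], [0.1, 0.95, 0.05, 0.92]);
```

The x=0 guards mean a perfectly confident, correct prediction (label 1 with p=1, or label 0 with p=0) yields exactly zero loss instead of 0 * (-∞) = NaN.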
const small_y = torch.tensor([-1e-6, -1e-8, -1e-10]); // Near zero
const x = torch.tensor([1, 2, 3]);
// Direct x * log(1 + y) would lose precision
// xlog1py uses log1p for accurate computation
const result = torch.special.xlog1py(x, small_y);

See Also
- PyTorch torch.special.xlog1py()
- torch.special.xlogy - Similar: x*log(y) with safe x=0
- torch.special.entr - Shannon entropy: -x*log(x)
- torch.special.log1p - log(1+y) numerically stable form