torch.distributions.HalfCauchy
class HalfCauchy extends Distribution
new HalfCauchy(scale: number | Tensor, options?: DistributionOptions)
- readonly scale (Tensor) – Scale parameter of the underlying Cauchy distribution.
- readonly arg_constraints (unknown)
- readonly support (unknown)
- readonly has_rsample (unknown)
- readonly mean (Tensor)
- readonly mode (Tensor)
- readonly variance (Tensor)
Half-Cauchy distribution: folded Cauchy with extreme tail behavior for positive values.
Parameterized by scale γ. The half-Cauchy is obtained by taking the absolute value of a Cauchy(0, γ) distribution: if X = |Y| where Y ~ Cauchy(0, γ), then X ~ HalfCauchy(γ). Support is (0, ∞). Heavier-tailed than HalfNormal; standard prior for scale parameters in robust Bayesian analysis. Essential for:
- Bayesian hierarchical models (weakly informative priors for scale/variance)
- Heavy-tailed prior for scale parameters in robust inference
- Prior for variance components in mixed-effects models
- Robust Bayesian regression (accommodates outliers)
- Mixture models with extreme value components
- Prior for random effects variance in hierarchical structures
- Stan default prior recommendation for scale parameters
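The folded-Cauchy relationship X = |Y| can be demonstrated without the library. The sketch below is a standalone illustration, not the `torch.distributions` implementation: it draws Cauchy(0, γ) samples by the inverse transform γ·tan(π(u − 1/2)) and folds them with an absolute value; a small LCG replaces `Math.random` only so the demo is reproducible.

```typescript
// Deterministic LCG in place of Math.random, purely for reproducibility.
function makeLcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return (state + 0.5) / 4294967296; // uniform in (0, 1)
  };
}

// HalfCauchy(γ) draw as |Cauchy(0, γ)|, Cauchy sampled by inverse transform.
function sampleHalfCauchy(gamma: number, n: number, rand: () => number): number[] {
  const out: number[] = [];
  for (let i = 0; i < n; i++) {
    const y = gamma * Math.tan(Math.PI * (rand() - 0.5)); // Cauchy(0, γ)
    out.push(Math.abs(y)); // fold onto (0, ∞)
  }
  return out;
}

const rand = makeLcg(12345);
const draws = sampleHalfCauchy(1, 100_000, rand).sort((a, b) => a - b);
const median = draws[50_000];
// HalfCauchy(γ) has median γ·tan(π/4) = γ, so the sample median should sit near 1.
console.log(median.toFixed(3));
```

Note that the sample *mean* of such draws never settles down as n grows: that is the undefined-moments property in action.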
Why Heavy Tails: The Cauchy distribution has no defined mean or variance, and the half-Cauchy inherits this property, assigning non-negligible probability to very large values. In that sense the prior barely constrains the scale, letting the data dictate the scale estimate more freely.
HalfNormal vs HalfCauchy: HalfNormal has tails ~exp(-x²), HalfCauchy has tails ~1/x². For priors: HalfCauchy is more weakly informative, HalfNormal is more conservative/skeptical.
- No mean or variance: Like Cauchy, moments don't exist (heavy tails)
- Heavier than HalfNormal: Polynomial tail decay (1/x²) vs exponential
- Weakly informative: Often called "automatic" or "data-driven" prior (minimal input)
- Stan recommendation: Default prior for scale parameters in Stan modeling language
- Tail behavior: P(X > x) ≈ 2γ/(πx) for large x; the density decays as 1/x²
- Folded Cauchy: Half-Cauchy = |Cauchy(0, γ)|
- CDF has simple form: F(x) = (2/π) arctan(x/γ) is easy to compute
- No mean or variance: Methods assuming finite moments will fail
- Extreme tail probability: Samples can easily exceed 10x the scale parameter
- Prior specification: Large scale values give very weak prior (be careful of default scales)
- Numerical issues: Very large samples can cause log_prob overflow/underflow
Examples
// Standard half-Cauchy: scale=1
const hc = new torch.distributions.HalfCauchy(1);
const samples = hc.sample([1000]); // 1000 positive samples with heavy right tail
// Typical values: 0.5-5, but can easily exceed 20
// Bayesian prior for variance component in hierarchical model
// Prior for group-level variance in mixed-effects regression
const group_var_prior = new torch.distributions.HalfCauchy(1);
const group_scale = group_var_prior.sample([num_groups]); // prior for each group
// Then: y_i ~ Normal(mean_i, group_scale[group_i])
// Heavy-tailed prior for regression coefficients
// More robust to outliers than normal prior
const robust_prior = new torch.distributions.HalfCauchy(2.5); // Stan default recommendation
const prior_samples = robust_prior.sample([10000]);
// Allows some parameters to be very large without extreme penalty
// Batched distributions with different scales
const scales = torch.tensor([0.5, 1.0, 2.0, 5.0]);
const dist = new torch.distributions.HalfCauchy(scales); // [4] batch shape
const samples = dist.sample(); // [4] shaped samples
// Larger scale → wider, heavier tail
// CDF and quantile functions
const hc = new torch.distributions.HalfCauchy(1);
const cdf_values = hc.cdf(torch.tensor([0.5, 1.0, 2.0]));
const q95 = hc.icdf(torch.tensor([0.95])); // 95% quantile (quite large)
const q99 = hc.icdf(torch.tensor([0.99])); // 99% quantile (very large)
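The closed-form CDF and quantile function can be checked independently of the library. This is a minimal standalone sketch (plain `Math`, not the `HalfCauchy` API) of F(x) = (2/π)·arctan(x/γ), its inverse F⁻¹(p) = γ·tan(πp/2), and the 2γ/(πx) survival-tail approximation:

```typescript
// Closed-form CDF of HalfCauchy(γ): F(x) = (2/π)·arctan(x/γ)
function halfCauchyCdf(x: number, gamma: number): number {
  return (2 / Math.PI) * Math.atan(x / gamma);
}

// Quantile function (inverse CDF): F⁻¹(p) = γ·tan(πp/2)
function halfCauchyIcdf(p: number, gamma: number): number {
  return gamma * Math.tan((Math.PI * p) / 2);
}

const gamma = 1;

// Round trip: icdf(cdf(x)) recovers x.
const roundTrip = halfCauchyIcdf(halfCauchyCdf(2.5, gamma), gamma); // ≈ 2.5

// Quantiles grow explosively as p → 1 (the heavy tail):
const q95 = halfCauchyIcdf(0.95, gamma); // ≈ 12.7
const q99 = halfCauchyIcdf(0.99, gamma); // ≈ 63.7

// Tail approximation: P(X > x) ≈ 2γ/(πx) for large x.
const exactTail = 1 - halfCauchyCdf(100, gamma);
const approxTail = (2 * gamma) / (Math.PI * 100);
console.log(roundTrip, q95, q99, exactTail, approxTail);
```

The jump from the 95% to the 99% quantile (roughly 13 to 64 for γ = 1) is a concrete picture of why this prior is considered weakly informative.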
// Comparing to HalfNormal: tail behavior
const hn = new torch.distributions.HalfNormal(1);
const hc = new torch.distributions.HalfCauchy(1);
const x_extreme = torch.tensor([10]);
const hn_prob = hn.log_prob(x_extreme); // ≈ -50 (density ~exp(-50): virtually impossible)
const hc_prob = hc.log_prob(x_extreme); // ≈ -5 (density ≈ 2/(101π): quite possible)
// HalfCauchy assigns vastly more probability to extreme values
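The comparison above can be reproduced from the closed-form densities alone. A self-contained check (plain `Math`, not the library's `log_prob`), using the HalfNormal(σ) density √(2/(πσ²))·exp(−x²/(2σ²)) and the HalfCauchy(γ) density 2/(πγ(1 + (x/γ)²)):

```typescript
// log-density of HalfNormal(σ) on (0, ∞)
function halfNormalLogProb(x: number, sigma: number): number {
  return 0.5 * Math.log(2 / (Math.PI * sigma * sigma)) - (x * x) / (2 * sigma * sigma);
}

// log-density of HalfCauchy(γ) on (0, ∞)
function halfCauchyLogProb(x: number, gamma: number): number {
  return Math.log(2 / (Math.PI * gamma)) - Math.log(1 + (x / gamma) * (x / gamma));
}

const hnLp = halfNormalLogProb(10, 1); // ≈ -50.2
const hcLp = halfCauchyLogProb(10, 1); // ≈ -5.1
// The half-Cauchy log-density at x = 10 is ~45 nats higher: the heavy tail
// leaves extreme values plausible where the half-normal rules them out.
console.log(hnLp, hcLp, hcLp - hnLp);
```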