torch.distributions.InverseGamma
class InverseGamma extends Distribution

new InverseGamma(concentration: number | Tensor, rate: number | Tensor, options?: DistributionOptions)

- readonly concentration (Tensor) – Concentration parameter (alpha, shape parameter).
- readonly rate (Tensor) – Rate parameter (beta, inverse scale).
- readonly arg_constraints (unknown)
- readonly support (unknown)
- readonly has_rsample (unknown)
- readonly mean (Tensor)
- readonly mode (Tensor)
- readonly variance (Tensor)
Inverse Gamma distribution: conjugate prior for variance in normal models.
Parameterized by concentration α and rate β. If X ~ Gamma(α, β), then 1/X ~ InverseGamma(α, β). Support is (0, ∞). Important in Bayesian statistics because it is the conjugate prior for the variance of a normal distribution: the posterior is also InverseGamma with updated parameters. Essential for:
- Bayesian inference for variance and precision parameters (conjugate prior)
- Prior for normal distribution variance (standard choice in Bayesian regression)
- Prior for exponential distribution scale parameters
- Hierarchical Bayesian models with variance components
- Modeling reciprocals of positive quantities
- Empirical Bayes and hyperparameter estimation
- Mixed-effects models with random effect variances
Conjugate Prior Property: If data X₁,...,Xₙ ~ N(μ, σ²) with known μ, and prior σ² ~ InverseGamma(α, β), then posterior σ² | data ~ InverseGamma(α + n/2, β + SS/2) where SS is sum of squared deviations. The parameters update in closed form (Bayesian advantage).
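The closed-form update above can be sketched in plain JavaScript. This is a hypothetical helper for illustration, not part of the torch API:

```javascript
// Hypothetical helper (not part of the torch API): closed-form
// InverseGamma posterior update for normal data with known mean mu.
// Prior: sigma^2 ~ InverseGamma(alpha0, beta0)
// Posterior: InverseGamma(alpha0 + n/2, beta0 + SS/2)
function invGammaPosterior(alpha0, beta0, data, mu) {
  const n = data.length;
  const ss = data.reduce((acc, x) => acc + (x - mu) ** 2, 0);
  return { alpha: alpha0 + n / 2, beta: beta0 + ss / 2 };
}

// Three observations from N(0, sigma^2) with known mu = 0:
const post = invGammaPosterior(2, 1, [1, -2, 3], 0);
console.log(post.alpha); // 2 + 3/2 = 3.5
console.log(post.beta);  // 1 + (1 + 4 + 9)/2 = 8
```

No sampling or optimization is needed: the update is pure arithmetic on the parameters.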
Relationship to Gamma: if X ~ Gamma(α, β), then 1/X ~ InverseGamma(α, β). This reciprocal relationship means sampling is done by drawing from Gamma(α, β) and inverting.
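A minimal sketch of the reciprocal-of-Gamma sampler in plain JavaScript, assuming an integer concentration so the Gamma draw can be built as a sum of exponentials (the torch implementation uses a general Gamma sampler):

```javascript
// Sketch (plain JS, not the torch API): for integer alpha, a
// Gamma(alpha, beta) draw is a sum of alpha Exponential(beta) draws;
// its reciprocal is then an InverseGamma(alpha, beta) draw.
function sampleInverseGamma(alpha, beta) {
  let g = 0;
  for (let i = 0; i < alpha; i++) {
    g += -Math.log(Math.random()) / beta; // Exponential(rate beta) via inverse CDF
  }
  return 1 / g;
}

// Monte Carlo check: mean of InverseGamma(3, 2) is beta/(alpha-1) = 1.
const N = 200000;
let total = 0;
for (let i = 0; i < N; i++) total += sampleInverseGamma(3, 2);
console.log(total / N); // close to 1.0
```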
- Conjugate for normal variance: Posterior is also InverseGamma (closed-form update)
- Mean undefined if α ≤ 1: Mean only exists for α > 1
- Variance undefined if α ≤ 2: Variance only exists for α > 2
- Mode decreases with α: Mode = β/(α+1), so lower α gives a higher mode
- Reciprocal of Gamma: Sampling via Gamma reciprocal
- Right-skewed: Longer right tail (heavier for small α)
- Heavy tails: Assigns probability to very large values (useful as a prior for variance)
- α, β must be positive: α ≤ 0 or β ≤ 0 causes errors
- Mean undefined for α ≤ 1: Only exists when α > 1
- Variance undefined for α ≤ 2: Variance doesn't exist unless α > 2
- Weakly informative: Very small α (e.g., 0.001) imposes minimal constraints
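The moment formulas behind these existence conditions can be checked in plain JavaScript (a hypothetical helper for illustration, not the torch API):

```javascript
// Closed-form moments of InverseGamma(alpha, beta):
//   mean     = beta / (alpha - 1)                    for alpha > 1
//   mode     = beta / (alpha + 1)                    always defined
//   variance = beta^2 / ((alpha - 1)^2 (alpha - 2))  for alpha > 2
function invGammaMoments(alpha, beta) {
  return {
    mean: alpha > 1 ? beta / (alpha - 1) : NaN,  // undefined for alpha <= 1
    mode: beta / (alpha + 1),
    variance: alpha > 2
      ? (beta * beta) / ((alpha - 1) ** 2 * (alpha - 2))
      : NaN,                                     // undefined for alpha <= 2
  };
}

console.log(invGammaMoments(3, 2));          // { mean: 1, mode: 0.5, variance: 1 }
console.log(invGammaMoments(2, 1).variance); // NaN: variance requires alpha > 2
```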
Examples
// Simple inverse gamma: α=2, β=1
const ig = new torch.distributions.InverseGamma(2, 1);
const samples = ig.sample([1000]);
const mean = ig.mean; // 1.0 = β/(α-1)
const variance = ig.variance; // not finite: variance only exists for α > 2
// Bayesian inference: conjugate prior for normal variance
// Prior: σ² ~ InverseGamma(α₀, β₀)
const alpha0 = 2;
const beta0 = 1;
const prior = new torch.distributions.InverseGamma(alpha0, beta0);
// Observe data and update posterior
const n = 50; // sample size
const ss = 25; // sum of squared deviations
const alpha_post = alpha0 + n / 2; // α₀ + n/2
const beta_post = beta0 + ss / 2; // β₀ + SS/2
const posterior = new torch.distributions.InverseGamma(alpha_post, beta_post);
// Posterior parameters updated in closed form
// Prior for regression variance in linear models
// Standard weakly-informative prior: InverseGamma(0.001, 0.001)
const weak_prior = new torch.distributions.InverseGamma(0.001, 0.001);
const weak_samples = weak_prior.sample([100]);
// Very dispersed prior, lets the data dominate
// Batched distributions with different parameters
const alphas = torch.tensor([0.5, 1.0, 2.0, 5.0]);
const betas = torch.tensor([1.0, 1.0, 1.0, 1.0]);
const dist = new torch.distributions.InverseGamma(alphas, betas);
const samples = dist.sample(); // [4] shaped samples
// Smaller α → heavier right tail
// Comparing tail behavior with different concentration parameters
const light = new torch.distributions.InverseGamma(5, 1); // light tail
const heavy = new torch.distributions.InverseGamma(0.5, 1); // heavy tail
const x = torch.tensor([10]);
const light_prob = light.log_prob(x); // very negative: little mass this far out
const heavy_prob = heavy.log_prob(x); // much larger log-probability
// Posterior for hierarchical variance component
// Multi-level model: within-group variance hierarchical prior
const hier_prior = new torch.distributions.InverseGamma(3, 2);
const hier_var = hier_prior.sample([10]); // 10 group-level variances