torch.distributions.Bernoulli
class Bernoulli extends Distribution

new Bernoulli(options: { probs?: number | Tensor; logits?: number | Tensor } & DistributionOptions)

Properties
- arg_constraints (unknown) – readonly
- support (unknown) – readonly
- has_enumerate_support (unknown) – readonly
- probs (Tensor) – readonly – Get probs, computing from logits if needed.
- logits (Tensor) – readonly – Get logits, computing from probs if needed.
- param_shape (readonly number[]) – readonly – Shape of the parameter tensor.
- mean (Tensor) – readonly
- mode (Tensor) – readonly
- variance (Tensor) – readonly
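The probs and logits parameterizations are interchangeable: whichever one was not supplied is derived lazily on first access via the standard identities probs = sigmoid(logits) and logits = log(probs / (1 - probs)). A minimal sketch (the derived values in the comments follow from those identities):

// Construct from logits; probs is computed on first access
const fromLogits = new torch.distributions.Bernoulli({ logits: 0.0 });
// fromLogits.probs holds sigmoid(0) = 0.5
// fromLogits.mean is 0.5; fromLogits.variance is 0.5 * (1 - 0.5) = 0.25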
Bernoulli distribution: single binary trial with probability p.
The fundamental discrete probability distribution for modeling binary outcomes (yes/no, success/failure). Each sample is either 0 or 1. Essential for:
- Binary classification (logistic regression output)
- Stochastic binary masks (dropout, variational dropout)
- Coin flips and success/failure modeling
- Probability modeling for binary variables
- Variational inference with discrete variables
- Neural network probabilistic outputs
The probability mass function: P(X = 1) = p, P(X = 0) = 1 - p
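A quick numeric check of the PMF through log_prob (a sketch reusing the API from the examples below):

const b = new torch.distributions.Bernoulli({ probs: 0.3 });
const lp1 = b.log_prob(torch.tensor([1])); // log(0.3) ≈ -1.204
const lp0 = b.log_prob(torch.tensor([0])); // log(0.7) ≈ -0.357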
- Probs vs Logits: Use logits for numerical stability (no clipping needed)
- Binary output: Samples are exactly 0 or 1, never intermediate values
- Special case: Bernoulli(p) is Binomial(n, p) with n = 1
- Maximum entropy: Maximum uncertainty at p=0.5 (fair coin)
- Symmetric around 0.5: entropy(p) = entropy(1-p)
- Variance: p(1 - p), maximal (0.25) at p = 0.5 and shrinking toward 0 as p nears 0 or 1 (see the sketch after this list)
- Log-likelihood: log_prob(1) = log(p), log_prob(0) = log(1-p)
- Probability bounds: p must be in [0, 1]. Values outside cause errors
- Extreme probabilities: p near 0 or 1 have low entropy (less exploration)
- Gradient issues: gradients vanish as the distribution becomes nearly deterministic (p near 0 or 1)
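A short sketch verifying the symmetry and variance notes above (entropy computed from -(p·log p + (1-p)·log(1-p))):

const lo = new torch.distributions.Bernoulli({ probs: 0.2 });
const hi = new torch.distributions.Bernoulli({ probs: 0.8 });
// lo.entropy() equals hi.entropy(): -(0.2*log(0.2) + 0.8*log(0.8)) ≈ 0.500
// lo.variance and hi.variance are both 0.2 * 0.8 = 0.16, below the 0.25 maximum at p = 0.5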
Examples
// Fair coin flip: 50/50 chance of heads/tails
const coin = new torch.distributions.Bernoulli({ probs: 0.5 });
const flip = coin.sample(); // 0 (tails) or 1 (heads)
const flips = coin.sample([1000]); // 1000 coin flips
// Biased coin: 30% heads, 70% tails
const biased = new torch.distributions.Bernoulli({ probs: 0.3 });
const sample = biased.sample(); // 1 with probability 0.3; repeated draws yield mostly 0s
// Binary classification with learned probabilities
const logits = model(x); // model output
const pred_dist = new torch.distributions.Bernoulli({ logits });
const prediction = pred_dist.sample(); // 0 or 1 (class label)
const log_prob = pred_dist.log_prob(target); // log-likelihood of the targets (negate for a loss; see below)
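// Binary cross-entropy loss as the negated mean log-likelihood
// (sketch: assumes Tensor exposes .neg() and .mean(); only .mul() appears elsewhere here)
const loss = pred_dist.log_prob(target).neg().mean();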
// Stochastic binary mask for dropout
const feature_drop_rate = 0.5;
const mask_dist = new torch.distributions.Bernoulli({ probs: 1 - feature_drop_rate });
const mask = mask_dist.sample([batch_size, feature_dim]); // shape [batch, features]
const masked_features = features.mul(mask); // drop features randomly
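// Inverted dropout: rescale kept features by 1 / keep-prob so the
// expected activation is unchanged at train time
// (sketch: assumes a Tensor.div method alongside the .mul used above)
const keep_prob = 1 - feature_drop_rate;
const scaled_features = features.mul(mask).div(keep_prob);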
// Batched distributions with different probabilities
const probs = torch.tensor([0.1, 0.5, 0.9]); // 3 different probabilities
const dist = new torch.distributions.Bernoulli({ probs });
const samples = dist.sample(); // [3] shaped samples
// First is mostly 0 (p=0.1), middle is balanced (p=0.5), last is mostly 1 (p=0.9)
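// log_prob broadcasts elementwise: each value is scored under its own distribution
const values = torch.tensor([0, 1, 1]);
const lp = dist.log_prob(values); // [log(0.9), log(0.5), log(0.9)] ≈ [-0.105, -0.693, -0.105]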
// Entropy: measure of uncertainty
const certain = new torch.distributions.Bernoulli({ probs: 0.99 });
const entropy_low = certain.entropy(); // close to 0 (very certain)
const fair = new torch.distributions.Bernoulli({ probs: 0.5 });
const entropy_max = fair.entropy(); // log(2) ≈ 0.693 (maximum uncertainty)