torch.distributions.Multinomial
class Multinomial extends Distribution

new Multinomial(options: {
  total_count: number;
  probs?: number[] | Tensor;
  logits?: number[] | Tensor;
} & DistributionOptions)

- readonly total_count (number) – Number of trials.
- readonly arg_constraints (unknown)
- readonly support (unknown)
- readonly probs (Tensor)
- readonly logits (Tensor)
- readonly mean (Tensor)
- readonly variance (Tensor)
Multinomial distribution: generalization of binomial to K categories, models counts per category.
Parameterized by total_count (n trials) and probabilities p₁, ..., pₖ for K categories where ∑pᵢ = 1. Each sample is a K-dimensional vector of non-negative integer counts [x₁, ..., xₖ] where ∑xᵢ = n. Think of it as: if you perform n independent categorical trials, how many times does each category occur? Essential for:
- Multi-category count data and multinomial regression
- Language models and word/token count distributions
- Mixture model component allocation and clustering
- Contingency table analysis and goodness-of-fit tests
- Survival analysis with competing risks
- Sampling from mixture components
- Bayesian analysis with Dirichlet priors
- Network analysis and degree distributions
Generalization Chain: Binomial is Multinomial with K=2 categories; Categorical draws single category; Multinomial draws counts across multiple categories in n trials.
Constraint: Samples always satisfy ∑xᵢ = n (total_count). This creates dependencies between categories (negative covariance): allocating more samples to one category means fewer for others.
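The sum constraint is easy to see in a from-scratch sampler: a multinomial draw is just n categorical trials tallied per category. The sketch below is plain JavaScript (no torch; `sampleMultinomial` is an illustrative helper, not part of this API):

```javascript
// Draw one multinomial sample by performing n categorical trials
// and counting how many land in each category.
function sampleMultinomial(n, probs) {
  const counts = new Array(probs.length).fill(0);
  for (let t = 0; t < n; t++) {
    // Inverse-CDF walk: subtract probabilities until r falls in a bucket.
    let r = Math.random();
    let k = 0;
    while (k < probs.length - 1 && r >= probs[k]) {
      r -= probs[k];
      k++;
    }
    counts[k]++;
  }
  return counts;
}

const counts = sampleMultinomial(10, [0.25, 0.25, 0.25, 0.25]);
console.log(counts, counts.reduce((a, b) => a + b, 0)); // sum is always 10
```

Because every trial lands in exactly one category, the counts can never sum to anything but total_count, which is where the inter-category dependence comes from.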
- Multinomial is Binomial generalized: K=2 is equivalent to Binomial
- Marginal distributions: Each Xᵢ ~ Binomial(n, pᵢ) marginally, but the Xᵢ are not independent (they share the sum constraint)
- Negative covariance: Categories compete for samples; if one goes up, others go down
- Sum constraint: ∑ Xᵢ = n always; samples are not independent
- Probs are normalized: probs need not sum to 1 — they are rescaled internally; logits avoid manual normalization entirely
- Total count matters: E[X_i] scales linearly with total_count
- Entropy depends on probs: Maximum when all probs equal (1/K each)
- Small pᵢ behavior: If pᵢ is small, Var[Xᵢ] = npᵢ(1−pᵢ) ≈ npᵢ, so the standard deviation is large relative to the mean npᵢ (high relative spread)
- Mutually exclusive: Exactly one of probs or logits must be specified
- Dependency structure: Samples are highly dependent (sum constraint), not independent
- Large K: With many categories and large total_count, some counts may be 0
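Several of these notes follow from closed-form moments: E[Xᵢ] = npᵢ, Var[Xᵢ] = npᵢ(1−pᵢ), and Cov(Xᵢ, Xⱼ) = −npᵢpⱼ for i ≠ j (always negative). A plain-JavaScript sketch — `multinomialMoments` is a hypothetical helper, not a library function:

```javascript
// Closed-form moments of Multinomial(n, p):
//   mean:       n * p_i
//   variance:   n * p_i * (1 - p_i)   (binomial marginals)
//   covariance: -n * p_i * p_j  for i != j  (categories compete)
function multinomialMoments(n, probs) {
  const mean = probs.map(p => n * p);
  const variance = probs.map(p => n * p * (1 - p));
  const cov = probs.map((pi, i) =>
    probs.map((pj, j) => (i === j ? n * pi * (1 - pi) : -n * pi * pj))
  );
  return { mean, variance, cov };
}

const { mean, variance, cov } = multinomialMoments(10, [0.4, 0.4, 0.1, 0.1]);
console.log(mean);      // [4, 4, 1, 1]
console.log(cov[0][1]); // -1.6: more face-1 rolls means fewer face-2 rolls
```

Note that every off-diagonal covariance is negative, which is exactly the "categories compete for samples" behavior described above.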
Examples
// Fair 4-sided die: roll 10 times, count outcomes
const fair_die = new torch.distributions.Multinomial({
total_count: 10,
probs: torch.tensor([0.25, 0.25, 0.25, 0.25]) // equal probability for each face
});
const counts = fair_die.sample(); // [x₁, x₂, x₃, x₄] where x₁+x₂+x₃+x₄ = 10
// Typical sample: [3, 2, 2, 3] or [2, 1, 4, 3] etc.
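For small n you can evaluate the pmf P(X = x) = n!/(x₁!⋯x_K!) · ∏ pᵢ^xᵢ by hand, which is handy for sanity-checking log_prob. A plain-JavaScript sketch — `multinomialPmf` and `factorial` are illustrative helpers, not part of the API:

```javascript
// Multinomial pmf for small n (factorials overflow for large n;
// a real implementation would work in log space, as log_prob does).
function factorial(n) {
  return n <= 1 ? 1 : n * factorial(n - 1);
}

function multinomialPmf(counts, probs) {
  const n = counts.reduce((a, b) => a + b, 0);
  let p = factorial(n); // multinomial coefficient numerator
  counts.forEach((x, i) => {
    p = (p / factorial(x)) * Math.pow(probs[i], x);
  });
  return p;
}

// Probability of the "typical sample" [3, 2, 2, 3] from the fair die:
console.log(multinomialPmf([3, 2, 2, 3], [0.25, 0.25, 0.25, 0.25])); // ≈ 0.02403
```

Even the most likely individual outcomes have small probability here, because the 10 trials spread over many possible count vectors.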
// Biased die: faces 1,2 more likely than 3,4
const biased_die = new torch.distributions.Multinomial({
total_count: 10,
probs: torch.tensor([0.4, 0.4, 0.1, 0.1])
});
const biased_counts = biased_die.sample(); // faces 1,2 usually more frequent
// Language model: sample word counts in a 100-word sequence
// From vocabulary of 1000 words, each with learned probabilities
const vocab_size = 1000;
const word_probs = torch.rand([vocab_size]);
const word_probs_normalized = word_probs.div(word_probs.sum());
const text_dist = new torch.distributions.Multinomial({
total_count: 100, // 100-word document
probs: word_probs_normalized
});
const word_counts = text_dist.sample(); // [word_count₁, ..., word_count₁₀₀₀], sum = 100
// Opinion poll: 1000 respondents, 3 choices (agree/neutral/disagree)
const num_respondents = 1000;
const opinion_probs = torch.tensor([0.4, 0.3, 0.3]); // 40% agree, 30% each of neutral/disagree
const poll = new torch.distributions.Multinomial({
total_count: num_respondents,
probs: opinion_probs
});
const opinion_counts = poll.sample(); // [400, 300, 300] approximately
const expected = poll.mean; // exactly [400, 300, 300]
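The spread around those expected counts follows from the binomial marginals: each category's standard deviation is √(npᵢ(1−pᵢ)). A quick plain-JS check of the poll numbers (no torch needed):

```javascript
// Expected counts n*p_i and per-category standard deviations
// sqrt(n * p_i * (1 - p_i)) for the 1000-respondent poll.
const n = 1000;
const probs = [0.4, 0.3, 0.3];
const expectedCounts = probs.map(p => n * p);               // [400, 300, 300]
const stdDevs = probs.map(p => Math.sqrt(n * p * (1 - p)));
console.log(stdDevs.map(s => s.toFixed(1))); // ≈ ["15.5", "14.5", "14.5"]
```

So a sample like [400, 300, 300] is only "approximate": individual polls routinely land ±15 respondents away from each expected count.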
// Batched sampling: multiple independent experiments
const batch_probs = torch.tensor([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]); // [2, 3]
const batch_dist = new torch.distributions.Multinomial({
total_count: 100,
probs: batch_probs
});
const batch_samples = batch_dist.sample(); // [2, 3] shape (two 3-category samples)
// Generate from mixture model: allocate 100 samples among 5 mixture components
const num_samples = 100;
const mixture_weights = torch.tensor([0.3, 0.25, 0.2, 0.15, 0.1]);
const mixture = new torch.distributions.Multinomial({
total_count: num_samples,
probs: mixture_weights
});
const component_counts = mixture.sample(); // how many samples from each component
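The negative covariance between component counts can also be checked empirically. A plain-JavaScript Monte Carlo sketch (the sampler below is an illustrative stand-in for Multinomial.sample, not the library implementation); theory predicts Cov(X₀, X₁) = −n·p₀·p₁ = −100 · 0.3 · 0.25 = −7.5:

```javascript
// Draw one multinomial sample via n categorical trials.
function sampleCounts(n, probs) {
  const counts = new Array(probs.length).fill(0);
  for (let t = 0; t < n; t++) {
    let r = Math.random();
    let k = 0;
    while (k < probs.length - 1 && r >= probs[k]) {
      r -= probs[k];
      k++;
    }
    counts[k]++;
  }
  return counts;
}

// Estimate Cov(X0, X1) over many repeated allocations.
const probs = [0.3, 0.25, 0.2, 0.15, 0.1];
const trials = 20000;
let s0 = 0, s1 = 0, s01 = 0;
for (let i = 0; i < trials; i++) {
  const c = sampleCounts(100, probs);
  s0 += c[0];
  s1 += c[1];
  s01 += c[0] * c[1];
}
const cov = s01 / trials - (s0 / trials) * (s1 / trials);
console.log(cov.toFixed(1)); // close to -7.5
```

Whenever component 0 claims more of the 100 samples, component 1 tends to get fewer — the sum constraint in action.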