torch.corrcoef
function corrcoef<D extends DType = DType, Dev extends DeviceType = DeviceType>(input: Tensor<Shape, D, Dev>): Tensor<Shape, D, Dev>
Computes the Pearson correlation coefficient matrix between variables.
Estimates the linear correlation between each pair of variables in a dataset. The output is a symmetric matrix where entry [i, j] is the correlation between variable i and variable j. Diagonal elements are 1 (each variable correlates perfectly with itself), except for constant variables, whose correlations are NaN. Common uses:
- Feature analysis: Understanding relationships between variables
- Data exploration: Identifying which features are correlated
- Multicollinearity detection: Finding redundant features in datasets
- Causality analysis: Screening for candidate relationships to investigate further (correlation alone does not establish causation)
- Quality control: Detecting measurement errors or instrument drift
- Portfolio analysis: Measuring asset correlations for diversification
Input shape: either 1D [num_observations] or 2D [num_variables, num_observations]. A 1D input is treated as a single variable and yields a [1, 1] result. Correlation values range from -1 (perfect negative) to +1 (perfect positive), with 0 indicating no linear relationship.
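Datasets are often stored with one row per sample, which is the transpose of the [num_variables, num_observations] layout expected here. The following is a minimal sketch on plain nested arrays; the `transpose` helper is hypothetical and not part of this library. It only illustrates the reorientation that the `.t()` calls in the examples perform on tensors.

```typescript
// Hypothetical helper (not a library function): swap rows and columns
// of a rectangular nested array.
function transpose(m: number[][]): number[][] {
  if (m.length === 0) return [];
  return m[0].map((_, col) => m.map((row) => row[col]));
}

// Three samples, each with two features: shape [3, 2].
const samples = [
  [0, 2],
  [1, 1],
  [2, 0],
];

// Two variables, each with three observations: shape [2, 3].
const variables = transpose(samples);
// [[0, 1, 2], [2, 1, 0]]
```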
- Shape requirement: Input must be 1D or 2D (inputs with 3 or more dimensions raise an error)
- Observation axis: Each column is an observation; each row is a variable
- Symmetric output: Result is always symmetric: corr[i,j] = corr[j,i]
- Diagonal is 1: Diagonal elements are 1.0 (perfect self-correlation), except for constant variables (see NaN handling)
- NaN handling: Constant variables (zero standard deviation) produce NaN correlations
- Bessel correction: Uses denominator (n-1) for unbiased covariance estimate
- Pearson correlation: Only captures linear relationships, not nonlinear ones
- Outlier sensitivity: Correlation is sensitive to outliers; consider robust alternatives
- Sample size: Needs sufficient samples; with n < 2 observations the result is NaN
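The notes above can be illustrated with a plain TypeScript sketch that works on nested arrays rather than tensors. It is not this library's implementation: `pearsonMatrix` is a hypothetical name, and the sketch assumes rows are variables and columns are observations, as described above.

```typescript
// Hypothetical helper: Pearson correlation matrix for plain number[][] input.
// Assumes rows are variables and columns are observations.
function pearsonMatrix(rows: number[][]): number[][] {
  const n = rows[0]?.length ?? 0;

  // Center each variable by subtracting its mean.
  const centered = rows.map((row) => {
    const mean = row.reduce((a, b) => a + b, 0) / n;
    return row.map((v) => v - mean);
  });

  // Sample covariance with the (n - 1) denominator mentioned in the notes.
  const cov = (a: number[], b: number[]): number =>
    a.reduce((acc, v, k) => acc + v * b[k], 0) / (n - 1);

  // corr[i][j] = cov(i, j) / sqrt(var(i) * var(j))
  return centered.map((a) =>
    centered.map((b) => cov(a, b) / Math.sqrt(cov(a, a) * cov(b, b)))
  );
}

const demo = pearsonMatrix([
  [0, 1, 2], // variable 0
  [2, 1, 0], // variable 1: decreases as variable 0 increases
  [5, 5, 5], // variable 2: constant, so its standard deviation is zero
]);
// demo[0][1] and demo[1][0] are both -1 (symmetric, perfect negative correlation).
// Every entry involving variable 2 is NaN because of the 0 / 0 division.
```

In this sketch the (n - 1) factor appears in both the numerator and the denominator, so it cancels for n >= 2; with a single observation the division by zero yields NaN.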
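Parameters
input: Tensor<Shape, D, Dev> - Input tensor of shape [num_observations] or [num_variables, num_observations]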
Returns
Tensor<Shape, D, Dev> - Correlation matrix of shape [num_variables, num_variables] where entry [i, j] is the Pearson correlation between variables i and j
Examples
// Two variables with 3 observations
const x = torch.tensor([[0, 2], [1, 1], [2, 0]]).t(); // Shape: [2, 3]
const corr = torch.corrcoef(x);
// [[1.0, -1.0], // var0 with var0 and var1
// [-1.0, 1.0]] // var1 with var0 and var1 (perfect negative correlation)
// Single variable (1D input)
const y = torch.tensor([1, 2, 3, 4, 5]);
const corr1d = torch.corrcoef(y); // Shape: [1, 1], value: [[1.0]]
// Multi-variable dataset
const data = torch.randn(100, 5); // 100 samples, 5 features
const correlations = torch.corrcoef(data.t()); // [5, 5] correlation matrix
// Correlations between stock price series (rows are stocks, columns are observations)
const prices = torch.tensor([
[100, 102, 105, 103], // Stock A
[50, 51, 53, 52], // Stock B
[30, 35, 32, 38] // Stock C
]);
const corrMatrix = torch.corrcoef(prices);
// Diagonal [A-A, B-B, C-C] = all 1.0
// Off-diagonals show correlations between different stocks
See Also
- PyTorch torch.corrcoef()
- cov - Compute covariance matrix (numerator of correlation formula)
- std - Compute standard deviation (denominator of correlation formula)
- mean - Compute mean (needed for correlation computation)
- var - Compute variance