torch.corrcoef

function corrcoef<D extends DType = DType, Dev extends DeviceType = DeviceType>(input: Tensor<Shape, D, Dev>): Tensor<Shape, D, Dev>

Computes the Pearson correlation coefficient matrix between variables.

Estimates the linear correlation between each pair of variables in a dataset. The output is a symmetric matrix where entry [i, j] is the correlation between variable i and variable j. Diagonal elements are 1 (each variable is perfectly correlated with itself), except for constant variables, which produce NaN. Common uses:

Feature analysis: Understanding relationships between variables
Data exploration: Identifying which features are correlated
Multicollinearity detection: Finding redundant features in datasets
Causality analysis: Screening for candidate relationships to investigate further (correlation alone does not establish causation)
Quality control: Detecting measurement errors or instrument drift
Portfolio analysis: Measuring asset correlations for diversification

Input shape: either 1D [num_observations] or 2D [num_variables, num_observations]. A 1D input is treated as a single variable. Correlation values range from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship), with 0 indicating no linear relationship.

\begin{aligned} \rho_{X,Y} = \frac{\text{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\mathbb{E}[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y} \end{aligned}
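As a rough illustration of the formula above, the following plain-TypeScript sketch computes the sample Pearson coefficient for two equal-length arrays. The helper name `pearson` is hypothetical and is not part of the torch API; it only mirrors the definition and makes no claim about how torch.corrcoef is implemented internally.

```typescript
// Hypothetical helper for illustration only; not part of the torch API.
function pearson(x: number[], y: number[]): number {
  if (x.length !== y.length) {
    throw new Error("x and y must have the same number of observations");
  }
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;

  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of squared deviations of x
  let syy = 0; // sum of squared deviations of y
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  // Any common normalization (1/n or 1/(n - 1)) cancels between the
  // covariance in the numerator and the standard deviations below it.
  return sxy / Math.sqrt(sxx * syy);
}

// Same data as the two-variable example: y falls exactly as x rises.
const r = pearson([0, 1, 2], [2, 1, 0]); // -1
```

Because the normalization factor cancels, the sketch works directly with sums of deviations rather than computing the covariance and standard deviations separately.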

Shape requirement: Input must be 1D or 2D (inputs with 3 or more dimensions raise an error)
Observation axis: Each column is an observation; each row is a variable
Symmetric output: Result is always symmetric: corr[i,j] = corr[j,i]
Diagonal is 1: Diagonal elements are 1.0 (perfect self-correlation), except for constant variables, whose entries are NaN
NaN handling: Constant variables (zero standard deviation) produce NaN correlations
Bessel correction: Uses denominator (n-1) for unbiased covariance estimate
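The properties listed above can be checked with a small reference computation. This is a minimal sketch in plain TypeScript, assuming the same row-per-variable layout; `corrcoefRows` is a hypothetical name used for illustration, not a torch function, and the sketch does not claim to match the library's internals.

```typescript
// Hypothetical reference implementation for illustration; not the torch API.
// Each inner array is one variable (row); each index is one observation.
function corrcoefRows(rows: number[][]): number[][] {
  const n = rows[0].length;

  // Center every variable around its own mean.
  const centered = rows.map((row) => {
    const mean = row.reduce((s, v) => s + v, 0) / n;
    return row.map((v) => v - mean);
  });

  const dot = (a: number[], b: number[]): number =>
    a.reduce((s, v, k) => s + v * b[k], 0);

  // Covariance matrix with Bessel's correction (denominator n - 1).
  const cov = centered.map((a) => centered.map((b) => dot(a, b) / (n - 1)));

  // Normalize by the standard deviations: cov[i][j] / sqrt(var_i * var_j).
  // The (n - 1) factor cancels in this ratio.
  return cov.map((row, i) =>
    row.map((c, j) => c / Math.sqrt(cov[i][i] * cov[j][j]))
  );
}

const m = corrcoefRows([
  [0, 1, 2], // variable 0
  [2, 1, 0], // variable 1: exact mirror of variable 0
  [5, 5, 5], // variable 2: constant, so its variance is zero
]);
// m[0][1] and m[1][0] are both -1 (symmetric).
// m[0][0] and m[1][1] are 1.
// Every entry in row 2 and column 2 is NaN, because 0 / 0 is NaN.
```

In this sketch the constant variable yields NaN on its diagonal entry as well, since its zero variance appears in the denominator.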

Pearson correlation: Only captures linear relationships, not nonlinear ones
Outlier sensitivity: Correlation is sensitive to outliers; consider robust alternatives
Sample size: Needs sufficient observations for a meaningful estimate; with n < 2 observations the result is NaN
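Two of these caveats are easy to see on small inputs. The sketch below uses a hypothetical plain-TypeScript helper, `pearsonR` (not part of the torch API), to show a perfectly deterministic but nonlinear relationship that yields a correlation of 0, and a single-observation input that yields NaN.

```typescript
// Hypothetical helper for illustration only; not part of the torch API.
function pearsonR(x: number[], y: number[]): number {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// y = x^2 is fully determined by x, yet the linear correlation is 0
// because the positive and negative halves cancel.
const xs = [-2, -1, 0, 1, 2];
const ys = xs.map((v) => v * v);
const rNonlinear = pearsonR(xs, ys); // 0

// With a single observation there is no spread to measure: 0 / 0 is NaN.
const rSingle = pearsonR([3], [7]); // NaN
```

A coefficient near 0 therefore rules out a linear relationship only, not dependence in general.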

Parameters

input: Tensor<Shape, D, Dev> – 1D tensor [num_observations] or 2D tensor [num_variables, num_observations]. Each row is a variable, each column is an observation.

Returns

Tensor<Shape, D, Dev> – Correlation matrix of shape [num_variables, num_variables] where entry [i, j] is the Pearson correlation between variables i and j

Examples

// Two variables with 3 observations
const x = torch.tensor([[0, 2], [1, 1], [2, 0]]).t();  // Shape: [2, 3]
const corr = torch.corrcoef(x);
// [[1.0, -1.0],      // var0 with var0 and var1
//  [-1.0, 1.0]]      // var1 with var0 and var1 (perfect negative correlation)

// Single variable (1D input)
const y = torch.tensor([1, 2, 3, 4, 5]);
const corr1d = torch.corrcoef(y);  // Shape: [1, 1], value: [[1.0]]

// Multi-variable dataset
const data = torch.randn(100, 5);  // 100 samples, 5 features
const correlations = torch.corrcoef(data.t());  // [5, 5] correlation matrix

// Correlations between stock price series (each row is one stock)
const prices = torch.tensor([
  [100, 102, 105, 103],  // Stock A
  [50, 51, 53, 52],      // Stock B
  [30, 35, 32, 38]       // Stock C
]);
const corrMatrix = torch.corrcoef(prices);
// Diagonal [A-A, B-B, C-C] = all 1.0
// Off-diagonals show correlations between different stocks
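Once a correlation matrix is available, strongly correlated pairs can be picked out by scanning the upper triangle. The following is a minimal sketch that assumes the matrix has already been copied into a plain nested array; `highlyCorrelatedPairs` and `CorrelatedPair` are hypothetical names, and the matrix values are made up for illustration rather than computed from the price data above.

```typescript
// Hypothetical helper for illustration only; not part of the torch API.
interface CorrelatedPair {
  i: number; // index of the first variable
  j: number; // index of the second variable (j > i)
  r: number; // their correlation coefficient
}

function highlyCorrelatedPairs(
  corr: number[][],
  threshold: number
): CorrelatedPair[] {
  const pairs: CorrelatedPair[] = [];
  for (let i = 0; i < corr.length; i++) {
    // Start at i + 1: the matrix is symmetric and the diagonal is trivial.
    for (let j = i + 1; j < corr.length; j++) {
      const r = corr[i][j];
      // Comparisons with NaN are false, so NaN entries are skipped.
      if (Math.abs(r) >= threshold) {
        pairs.push({ i, j, r });
      }
    }
  }
  return pairs;
}

// Made-up correlation matrix for three variables.
const exampleCorr = [
  [1.0, 0.95, 0.1],
  [0.95, 1.0, -0.2],
  [0.1, -0.2, 1.0],
];
const pairs = highlyCorrelatedPairs(exampleCorr, 0.9);
// pairs contains one entry: { i: 0, j: 1, r: 0.95 }
```

Using the absolute value treats strong negative correlations as redundant too, which is usually what multicollinearity screening wants; drop `Math.abs` if only positive correlation matters.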
