CORRELATE
The CORRELATE function computes the linear Pearson correlation coefficient of two vectors or the correlation matrix of an m x n array. Alternatively, this function computes the unbiased sample covariance of two vectors or the covariance matrix of an m x n array.
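For reference, the two quantities named above are conventionally defined as follows for vectors x and y of length N. These are the standard textbook definitions of the unbiased sample covariance and the Pearson correlation coefficient; the symbols (s_xy, r, and the barred means) are notation chosen here for illustration and are not taken from the routine's source code.

```latex
s_{xy} = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
r = \frac{\sum_{i=0}^{N-1} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=0}^{N-1} (x_i - \bar{x})^2 \; \sum_{i=0}^{N-1} (y_i - \bar{y})^2}}
```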
This routine is written in the IDL language. Its source code can be found in the file correlate.pro in the lib subdirectory of the IDL distribution.
Tip: If you are computing covariance, you may want to use the RUNNING_COVARIANCE function instead, which avoids overflow for large values, is significantly faster, uses less memory, and also allows you to combine calculations for data sets that do not fit into memory.
Examples
Define the data vectors.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]
Compute the linear Pearson correlation coefficient of x and y. The result should be 0.702652:
PRINT, CORRELATE(X, Y)
IDL prints:
0.702652
Compute the covariance of x and y. The result should be 3.66667.
PRINT, CORRELATE(X, Y, /COVARIANCE)
IDL prints:
3.66667
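As a cross-check, both results can be reproduced by hand from the textbook definitions. The following is a minimal sketch, not the routine's actual implementation; the variable names n, dx, dy, cov, and r are illustrative only.

```idl
; Same data vectors as above.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

n  = N_ELEMENTS(X)
dx = X - MEAN(X)        ; deviations from the mean (floating point)
dy = Y - MEAN(Y)

; Unbiased sample covariance: divide by n-1.
cov = TOTAL(dx*dy) / (n - 1)

; Pearson correlation: covariance scaled by both standard deviations.
r = TOTAL(dx*dy) / SQRT(TOTAL(dx^2) * TOTAL(dy^2))

PRINT, cov, r
```

The two printed values should agree with the CORRELATE results above to within floating-point rounding.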
Define an array with x and y as its columns.
A = TRANSPOSE([[X],[Y]])
Compute the correlation matrix.
PRINT, CORRELATE(A)
IDL prints:
1.00000 0.702652
0.702652 1.00000
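The same array can be passed with the COVARIANCE keyword to obtain the covariance matrix instead. This is a short sketch that reuses the data from the examples above; no output is shown because it has not been run here.

```idl
; Same data vectors as above.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

; 2-column by 12-row array with X and Y as its columns.
A = TRANSPOSE([[X],[Y]])

; Covariance matrix of the columns of A.
PRINT, CORRELATE(A, /COVARIANCE)
```

By hand calculation, the diagonal elements should be the sample variances of X and Y (roughly 7.697 and 3.538), and the off-diagonal elements should match the covariance of 3.66667 computed earlier.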
Syntax
Result = CORRELATE( X [, Y] [, /COVARIANCE] [, /DOUBLE] )
Return Value
If two vectors are specified, the result is a single correlation coefficient. If the vectors have unequal lengths, the longer vector is truncated to the length of the shorter vector before the coefficient is computed. If an m x n array is specified, the result is an m x m array of linear Pearson correlation coefficients, in which element (i, j) is the correlation of the ith and jth columns of the input array.
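The truncation behavior described above can be illustrated with a short sketch. The name Yshort is hypothetical, and the statement that the two calls agree follows from the description above rather than from a recorded run.

```idl
; Same data vectors as in the Examples section.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

; Keep only the first 10 elements of Y.
Yshort = Y[0:9]

; X (12 elements) is expected to be truncated to 10 elements...
PRINT, CORRELATE(X, Yshort)

; ...so this explicit truncation should print the same value.
PRINT, CORRELATE(X[0:9], Yshort)
```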
Arguments
X
A vector or an m x n array. X can be integer, single-, or double-precision floating-point.
Y
An integer, single-, or double-precision floating-point vector. If X is an m x n array, Y should not be supplied.
Keywords
COVARIANCE
Set this keyword to compute the sample covariance rather than the correlation coefficient.
DOUBLE
Set this keyword to force the computation to be done in double-precision arithmetic.
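A brief sketch of the keyword in use, reusing the vectors from the Examples section. The names r_single and r_double are illustrative, and the assumption that integer input without DOUBLE yields a single-precision result is not stated in this section.

```idl
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

r_single = CORRELATE(X, Y)
r_double = CORRELATE(X, Y, /DOUBLE)

; Report the data type name of each result.
PRINT, SIZE(r_single, /TNAME), '  ', SIZE(r_double, /TNAME)

; Show the extra digits carried by the double-precision result.
PRINT, r_double, FORMAT='(F18.15)'
```

The second type name is expected to be a double-precision type, and the two values should agree to roughly single-precision accuracy.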
Version History
Pre 4.0 | Introduced
Resources and References
J. Neter, W. Wasserman, G.A. Whitmore, Applied Statistics (Third Edition), Allyn and Bacon (ISBN 0-205-10328-6).
See Also
A_CORRELATE, C_CORRELATE, M_CORRELATE, P_CORRELATE, R_CORRELATE, RUNNING_COVARIANCE