REGRESS
The REGRESS function performs a multiple linear regression fit and returns an Nterms-element column vector of coefficients.
REGRESS fits the function:
$$y_i = \mathrm{const} + a_0 x_{0,i} + a_1 x_{1,i} + \cdots + a_{\mathrm{Nterms}-1}\, x_{\mathrm{Nterms}-1,\,i}$$
This routine is written in the IDL language. Its source code can be found in the file regress.pro in the lib subdirectory of the IDL distribution.
Examples
; Create two vectors of independent variable data:
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
; Combine into a 2 x 6 (Nterms x Npoints) array:
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
; Create a vector of dependent variable data:
Y = 5 + 3*X1 - 4*X2
; Assume Gaussian measurement errors for each point:
measure_errors = REPLICATE(0.5, N_ELEMENTS(Y))
; Compute the fit, and print the results:
result = REGRESS(X, Y, SIGMA=sigma, CONST=const, $
   MEASURE_ERRORS=measure_errors)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
PRINT, 'Standard errors: ', sigma
IDL prints:
Constant: 4.99999
Coefficients: 3.00000 -3.99999
Standard errors: 0.0444831 0.282038
Syntax
Result = REGRESS( X, Y [, CHISQ=variable] [, CONST=variable] [, CORRELATION=variable] [, /DOUBLE] [, FTEST=variable] [, MCORRELATION=variable] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YFIT=variable] )
Return Value
REGRESS returns a 1 x Nterms array of coefficients. If the DOUBLE keyword is set, or if X or Y is double precision, the result is double precision; otherwise the result is single precision.
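As a rough illustration of the shape and type rules above, the following sketch fits a small data set twice and inspects the result with SIZE. The data values and variable names are made up for illustration and are not part of the REGRESS documentation.

```idl
; Illustrative data (made up): two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]        ; 2 x 6 (Nterms x Npoints)
; Small fixed offsets keep the fit from being exact
Y = 5 + 3*X1 - 4*X2 + [0.1, -0.2, 0.15, -0.1, 0.05, -0.05]

; Single-precision inputs, DOUBLE keyword not set
coeffs = REGRESS(X, Y)
PRINT, 'Dimensions: ', SIZE(coeffs, /DIMENSIONS)
PRINT, 'Type:       ', SIZE(coeffs, /TNAME)

; Same fit with computations forced to double precision
coeffsD = REGRESS(X, Y, /DOUBLE)
PRINT, 'Type with /DOUBLE: ', SIZE(coeffsD, /TNAME)
```

Going by the description above, the first call should report a 1 x 2 single-precision array and the second a double-precision result.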
Arguments
X
An Nterms by Npoints array of independent variable data, where Nterms is the number of coefficients (independent variables) and Npoints is the number of samples.
Y
An Npoints-element vector of dependent variable points.
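A common stumbling block is passing X with its dimensions swapped. The sketch below, which uses made-up data and hypothetical variable names, checks the layout with SIZE and transposes the array when the first dimension matches the length of Y. The check is only a heuristic: it cannot tell the two layouts apart when Nterms happens to equal Npoints.

```idl
; Illustrative data (made up), stored with one variable per row,
; so the array has dimensions [6, 2], i.e. Npoints x Nterms
samples = [[1.0, 2.0, 4.0, 8.0, 16.0, 32.0], $
           [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]]
Y = [8.1, 6.9, 9.2, 16.8, 37.1, 80.9]

; If the first dimension matches the length of Y, the array is
; Npoints x Nterms and needs transposing before the call
dims = SIZE(samples, /DIMENSIONS)
IF dims[0] EQ N_ELEMENTS(Y) THEN X = TRANSPOSE(samples) ELSE X = samples

; X is now Nterms x Npoints (here 2 x 6)
dims = SIZE(X, /DIMENSIONS)
PRINT, 'Nterms = ', dims[0], ', Npoints = ', dims[1]

result = REGRESS(X, Y, CONST=const)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
```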
Keywords
CHISQ
Set this keyword to a named variable that will contain the value of the unreduced chi-square goodness-of-fit statistic.
CONST
Set this keyword to a named variable that will contain the constant term of the fit.
CORRELATION
Set this keyword to a named variable that will contain the vector of linear correlation coefficients.
DOUBLE
Set this keyword to force computations to be done in double-precision arithmetic.
FTEST
Set this keyword to a named variable that will contain the F-value for the goodness-of-fit test.
MCORRELATION
Set this keyword to a named variable that will contain the multiple linear correlation coefficient.
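The goodness-of-fit keywords can all be requested in a single call. The following is a minimal sketch with made-up data; the output variable names (chisq, corr, ftest, mcorr) are arbitrary choices.

```idl
; Illustrative data (made up): two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = 5 + 3*X1 - 4*X2 + [0.1, -0.2, 0.15, -0.1, 0.05, -0.05]

; Request the fit statistics alongside the coefficients
result = REGRESS(X, Y, CHISQ=chisq, CORRELATION=corr, $
   FTEST=ftest, MCORRELATION=mcorr)

PRINT, 'Chi-square:           ', chisq
PRINT, 'Linear correlations:  ', corr
PRINT, 'Multiple correlation: ', mcorr
PRINT, 'F-value:              ', ftest
```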
MEASURE_ERRORS
Set this keyword to a vector containing standard measurement errors for each point Y[i]. This vector must have Npoints elements, the same length as Y.
Note: For Gaussian errors (e.g., instrumental uncertainties), MEASURE_ERRORS should be set to the standard deviations of each point in Y. For Poisson or statistical weighting, MEASURE_ERRORS should be set to SQRT(Y).
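For the Poisson case described in the note, a minimal sketch might look like the following. The count data are made up for illustration. Because the error is SQRT(Y), a point with zero counts would get a zero error and would presumably need separate handling before the call.

```idl
; Hypothetical count data (made up for illustration)
X1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X2 = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]        ; 2 x 6 (Nterms x Npoints)
counts = [14.0, 21.0, 38.0, 43.0, 57.0, 63.0]

; Poisson (statistical) weighting: one error value per point in Y
errs = SQRT(counts)

result = REGRESS(X, counts, MEASURE_ERRORS=errs, CONST=const, $
   SIGMA=sigma, CHISQ=chisq)

PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
PRINT, 'Standard errors: ', sigma
PRINT, 'Chi-square: ', chisq
```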
SIGMA
Set this keyword to a named variable that will contain the 1-sigma uncertainty estimates for the returned parameters.
Note: If MEASURE_ERRORS is omitted, then you are assuming that the regression model is the correct model for your data, and therefore, no independent goodness-of-fit test is possible. In this case, the values returned in SIGMA are multiplied by SQRT(CHISQ/(N–M)), where N is the number of points in X, and M is the number of coefficients. See section 15.2 of Numerical Recipes in C (Second Edition) for details.
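Written as a formula, the scaling described in the note is as follows. The symbol names are chosen here for illustration: $\sigma_k$ is the unscaled uncertainty of the k-th coefficient, $\chi^2$ is the value returned in CHISQ, and N and M are as defined in the note.

```latex
\sigma_k^{\mathrm{returned}} = \sigma_k \, \sqrt{\frac{\chi^2}{N - M}}
```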
STATUS
Set this keyword to a named variable that will contain the status of the operation. Possible status values are:
- 0 = successful completion
- 1 = singular array (which indicates that the inversion is invalid)
- 2 = warning that a small pivot element was used and that significant accuracy was probably lost
Note: If STATUS is not specified, any error messages will be output to the screen.
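One way to act on the status values listed above is a CASE statement. The sketch below wraps it in a hypothetical procedure (regress_status_demo is not part of the IDL library) and uses made-up data in which the second independent variable is an exact multiple of the first, so the fit is likely to report a non-zero status. Which value comes back may depend on rounding.

```idl
; Hypothetical demo procedure, not part of the IDL distribution
PRO regress_status_demo
  COMPILE_OPT IDL2

  ; Made-up data: X2 is exactly 2*X1, so the two variables are collinear
  X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
  X2 = 2.0 * X1
  X = [TRANSPOSE(X1), TRANSPOSE(X2)]
  Y = 5 + 3*X1 + [0.1, -0.2, 0.15, -0.1, 0.05, -0.05]

  ; Supplying STATUS lets the caller handle problems itself
  result = REGRESS(X, Y, STATUS=status)

  CASE status OF
    0: PRINT, 'Fit completed successfully.'
    1: PRINT, 'Singular array: the inversion is invalid.'
    2: PRINT, 'Warning: small pivot element, accuracy was probably lost.'
    ELSE: PRINT, 'Unexpected status value: ', status
  ENDCASE
END
```

After compiling the file, the procedure can be run by entering regress_status_demo at the IDL prompt.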
YFIT
Set this keyword to a named variable that will contain the vector of calculated Y values.
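To show how YFIT relates to the returned coefficients, the following sketch rebuilds the fitted values term by term from the model (the constant plus each coefficient times its independent variable) and compares them with YFIT. The data and the variable name yManual are made up for illustration.

```idl
; Illustrative data (made up): two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = 5 + 3*X1 - 4*X2 + [0.1, -0.2, 0.15, -0.1, 0.05, -0.05]

result = REGRESS(X, Y, CONST=const, YFIT=yfit)

; Start from the constant term, then add one term per coefficient
yManual = REPLICATE(const[0], N_ELEMENTS(Y))
FOR k = 0, N_ELEMENTS(result) - 1 DO $
   yManual = yManual + result[k] * REFORM(X[k, *])

; The two should agree to within rounding error
PRINT, 'Largest difference from YFIT: ', MAX(ABS(yManual - yfit))
PRINT, 'Residuals: ', Y - yfit
```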
Version History
Original | Introduced
5.4 | Deprecated the Weights, Yfit, Const, Sigma, Ftest, R, Rmul, Chisq, and Status arguments and the RELATIVE_WEIGHT keyword.