REGRESS

The REGRESS function performs a multiple linear regression fit and returns an Nterms-element column vector of coefficients.

REGRESS fits the function:

y_i = const + a_0 x_{0,i} + a_1 x_{1,i} + ... + a_{Nterms-1} x_{Nterms-1,i}

This routine is written in the IDL language. Its source code can be found in the file regress.pro in the lib subdirectory of the IDL distribution.
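The model above can be evaluated directly once the coefficients and the constant term are known. The following is a minimal sketch, not part of the IDL distribution: the function name regress_model_eval and its argument names are hypothetical, and X is assumed to be a two-dimensional Nterms by Npoints array.

```idl
; Hypothetical helper (not part of IDL): evaluate the REGRESS model
;   y[i] = const + SUM over k of coeffs[k] * X[k, i]
; for an Nterms x Npoints array X.
FUNCTION regress_model_eval, X, coeffs, const
  COMPILE_OPT IDL2
  dims = SIZE(X, /DIMENSIONS)       ; [Nterms, Npoints]; X is assumed 2-D
  y = REPLICATE(const[0], dims[1])  ; start every point at the constant term
  FOR k = 0, dims[0] - 1 DO $
    y = y + coeffs[k] * REFORM(X[k, *])  ; add the contribution of variable k
  RETURN, y
END
```

With the coefficients and constant returned by a REGRESS call, regress_model_eval(X, result, const) should reproduce the fitted values to within rounding.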

Examples

; Create two vectors of independent variable data:
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
; Combine into a 2x6 array:
X = [TRANSPOSE(X1), TRANSPOSE(X2)]

; Create a vector of dependent variable data:
Y = 5 + 3*X1 - 4*X2

; Assume Gaussian measurement errors for each point:
measure_errors = REPLICATE(0.5, N_ELEMENTS(Y))

; Compute the fit, and print the results:
result = REGRESS(X, Y, SIGMA=sigma, CONST=const, $
   MEASURE_ERRORS=measure_errors)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
PRINT, 'Standard errors: ', sigma

IDL prints:

Constant:    4.99999
Coefficients:    3.00000    -3.99999
Standard errors:    0.0444831    0.282038

Syntax

Result = REGRESS( X, Y [, CHISQ=variable] [, CONST=variable] [, CORRELATION=variable] [, /DOUBLE] [, FTEST=variable] [, MCORRELATION=variable] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YFIT=variable] )

Return Value

REGRESS returns a 1 x Nterms array of coefficients. If the DOUBLE keyword is set, or if X or Y is double precision, the result is double precision; otherwise the result is single precision.
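The shape and type of the result can be inspected with SIZE. The sketch below reuses the data from the Examples section; the variable names result_single and result_double are illustrative only.

```idl
; Data from the Examples section:
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = 5 + 3*X1 - 4*X2

; Fit once with the default precision and once with /DOUBLE:
result_single = REGRESS(X, Y)
result_double = REGRESS(X, Y, /DOUBLE)

; SIZE(..., /TNAME) returns the type name of its argument as a string:
PRINT, SIZE(result_single, /TNAME), ' ', SIZE(result_double, /TNAME)

; SIZE(..., /DIMENSIONS) returns the array dimensions:
PRINT, SIZE(result_double, /DIMENSIONS)
```

Because the input here is single precision, only the second call should report a double-precision result, per the rule stated above.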

Arguments

X

An Nterms by Npoints array of independent variable data, where Nterms is the number of coefficients (independent variables) and Npoints is the number of samples.
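Data is often stored with one row per variable, which in IDL gives an Npoints by Nterms array rather than the layout REGRESS expects. The following sketch, in which the array name data is hypothetical, transposes such an array and checks the dimensions before fitting.

```idl
; Hypothetical input: each bracketed row holds one independent variable,
; so the array has dimensions [Npoints, Nterms] = [6, 2].
data = [[1.0, 2.0, 4.0, 8.0, 16.0, 32.0], $
        [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]]

; Transpose to the Nterms x Npoints layout described above:
X = TRANSPOSE(data)

; Check the layout: the first dimension should be Nterms, the second Npoints.
PRINT, SIZE(X, /DIMENSIONS)
```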

Y

An Npoints-element vector of dependent variable points.

Keywords

CHISQ

Set this keyword to a named variable that will contain the value of the unreduced chi-square goodness-of-fit statistic.
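As a cross-check, the statistic can be recomputed from the fitted values. This sketch assumes that CHISQ is the usual weighted sum of squared residuals, with each residual divided by its measurement error; the small offsets added to Y are hypothetical noise so that the statistic is not trivially zero.

```idl
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]

; Dependent data with hypothetical noise added:
Y = 5 + 3*X1 - 4*X2 + [0.1, -0.2, 0.15, -0.05, 0.1, -0.1]
measure_errors = REPLICATE(0.5, N_ELEMENTS(Y))

result = REGRESS(X, Y, CHISQ=chisq, YFIT=yfit, $
   MEASURE_ERRORS=measure_errors)

; Assumed definition: sum of squared residuals scaled by the errors.
chisq_manual = TOTAL(((Y - yfit) / measure_errors)^2)

PRINT, 'CHISQ keyword: ', chisq
PRINT, 'Recomputed:    ', chisq_manual
```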

CONST

Set this keyword to a named variable that will contain the constant term of the fit.

CORRELATION

Set this keyword to a named variable that will contain the vector of linear correlation coefficients.

DOUBLE

Set this keyword to force computations to be done in double-precision arithmetic.

FTEST

Set this keyword to a named variable that will contain the F-value for the goodness-of-fit test.

MCORRELATION

Set this keyword to a named variable that will contain the multiple linear correlation coefficient.
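The CORRELATION, FTEST, and MCORRELATION keywords can be requested together in one call. The sketch below uses the data from the Examples section with hypothetical noise added; the output variable names are illustrative. Squaring the multiple correlation coefficient gives the coefficient of determination commonly reported as R squared.

```idl
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]

; Dependent data with hypothetical noise added:
Y = 5 + 3*X1 - 4*X2 + [0.1, -0.2, 0.15, -0.05, 0.1, -0.1]

result = REGRESS(X, Y, CORRELATION=corr, MCORRELATION=mcorr, $
   FTEST=ftest)

PRINT, 'Linear correlation coefficients: ', corr
PRINT, 'Multiple correlation coefficient: ', mcorr
PRINT, 'Coefficient of determination: ', mcorr^2
PRINT, 'F-value: ', ftest
```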

MEASURE_ERRORS

Set this keyword to a vector containing standard measurement errors for each point Y[i]. This vector must contain Npoints elements, matching the length of Y and the second dimension of X.

Note: For Gaussian errors (e.g., instrumental uncertainties), MEASURE_ERRORS should be set to the standard deviations of each point in Y. For Poisson or statistical weighting, MEASURE_ERRORS should be set to SQRT(Y).
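The two weighting choices in the note above might be set up as follows. This is a sketch with hypothetical count data; the floor of 1.0 applied before the square root is an assumption added here to avoid zero-valued errors when a count is zero, and is not something REGRESS requires.

```idl
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]

; Hypothetical counts (Poisson-distributed measurements):
counts = [9.0, 6.0, 10.0, 16.0, 38.0, 80.0]

; Gaussian case: one known standard deviation per point.
gauss_errors = REPLICATE(0.5, N_ELEMENTS(counts))

; Poisson case: SQRT(Y), with an assumed floor of 1.0 on the counts.
poisson_errors = SQRT(counts > 1.0)

result = REGRESS(X, counts, SIGMA=sigma, CONST=const, $
   MEASURE_ERRORS=poisson_errors)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
PRINT, 'Standard errors: ', sigma
```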

SIGMA

Set this keyword to a named variable that will contain the 1-sigma uncertainty estimates for the returned parameters.

Note: If MEASURE_ERRORS is omitted, then you are assuming that the regression model is the correct model for your data, and therefore, no independent goodness-of-fit test is possible. In this case, the values returned in SIGMA are multiplied by SQRT(CHISQ/(N-M)), where N is the number of points in X, and M is the number of coefficients. See section 15.2 of Numerical Recipes in C (Second Edition) for details.
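The scaling in the note above can be written out as follows. This is a sketch of the standard argument, assuming that the fit is first carried out with every measurement error set to 1; the symbol \sigma^{(1)}_{a_k} for the resulting unscaled uncertainty of coefficient a_k is notation introduced here.

```latex
\chi^2 = \sum_{i=0}^{N-1} \left( y_i - y_{\mathrm{fit},i} \right)^2,
\qquad
\hat{\sigma}^2 = \frac{\chi^2}{N - M},
\qquad
\sigma_{a_k} = \sigma^{(1)}_{a_k} \sqrt{\frac{\chi^2}{N - M}}
```

Here \hat{\sigma}^2 is the variance of the data estimated from the scatter about the fit, which is why the fit cannot also be tested against that same scatter.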

STATUS

Set this keyword to a named variable that will contain the status of the operation. Possible status values are:

Note: If STATUS is not specified, any error messages will be output to the screen.
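A call that captures the status rather than letting messages print to the screen might look like the sketch below. The list of status values is not reproduced above, so the check assumes that 0 indicates success and any other value indicates a problem; verify this against the status values for your IDL version.

```idl
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = 5 + 3*X1 - 4*X2

result = REGRESS(X, Y, CONST=const, STATUS=status)

; Assumption: a status of 0 means the fit completed successfully.
IF status NE 0 THEN PRINT, 'REGRESS returned nonzero status: ', status
IF status EQ 0 THEN PRINT, 'Coefficients: ', result[*]
```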

YFIT

Set this keyword to a named variable that will contain the vector of calculated Y values.
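The fitted values are convenient for inspecting residuals. The sketch below uses the data from the Examples section with hypothetical noise added, and also compares YFIT against a hand evaluation of the model for the two independent variables; the variable names residuals and y_manual are illustrative.

```idl
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]

; Dependent data with hypothetical noise added:
Y = 5 + 3*X1 - 4*X2 + [0.1, -0.2, 0.15, -0.05, 0.1, -0.1]

result = REGRESS(X, Y, CONST=const, YFIT=yfit)

; Residuals of the fit:
residuals = Y - yfit
PRINT, 'Residuals: ', residuals
PRINT, 'Largest absolute residual: ', MAX(ABS(residuals))

; Hand evaluation of the model for comparison with YFIT:
y_manual = const + result[0]*X1 + result[1]*X2
PRINT, 'Largest difference from YFIT: ', MAX(ABS(yfit - y_manual))
```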

Version History

Original    Introduced

5.4         Deprecated the Weights, Yfit, Const, Sigma, Ftest, R, Rmul, Chisq, and Status arguments and the RELATIVE_WEIGHT keyword.

See Also

CURVEFIT, GAUSSFIT, LINFIT, LMFIT, POLY_FIT, SFIT, SVDFIT