rsqdelta: Compute R-square change and F-statistics in regression

York University

PROC REG includes the R-square statistic in the OUTEST dataset when the ADJRSQ option is specified on the MODEL statement. In the OUTSTAT= data set, PROC GLM includes the sequential sums of squares (SS1) that will provide the F statistic and associated p-value. However, PROC GLM does not include the R-square statistic in an output dataset.
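As a rough illustration of where these two pieces come from, the sketch below runs both procedures on a single model. It assumes a data set named fitness with variables oxy, runtime and age (borrowed from the example further down this page). The output data set names est and stat are made up for illustration, and this is not the macro's own code.

```sas
/* PROC REG: the ADJRSQ option on the MODEL statement adds _RSQ_ */
/* to the OUTEST= data set (here called EST, a hypothetical name). */
proc reg data=fitness outest=est noprint;
   model oxy = runtime age / adjrsq;
run;
quit;

/* PROC GLM: SS1 requests sequential sums of squares; OUTSTAT=     */
/* stores them with F and PROB (STAT is a hypothetical name).      */
proc glm data=fitness outstat=stat noprint;
   model oxy = runtime age / ss1;
run;
quit;

/* Inspect what each procedure wrote out */
proc print data=est;
run;

proc print data=stat;
run;
```

Printing both data sets side by side shows that the R-square value appears only in the PROC REG output, while the sequential F tests appear only in the PROC GLM output.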

By combining the information from PROC REG's OUTEST= data set and PROC GLM's OUTSTAT= data set with some DATA step programming, the macro computes the change in R-square, along with the F statistics and p-values, as variables are added to a model. The F statistics and p-values in the final table are partial F tests under the general linear model testing approach, defined as follows:

\[
F(C) = \frac{\left[ \mathrm{SSE}(R) - \mathrm{SSE}(F) \right] / \left[ \mathrm{df}(R) - \mathrm{df}(F) \right]}{\mathrm{SSE}(F) / \mathrm{df}(F)}
\]

where SSE() is the Error Sum of Squares and df() is the error degrees of freedom for the full (F) or reduced (R) model. The rejection region is defined as

\[
F(C) > F(\alpha,\ \mathrm{df}(R) - \mathrm{df}(F),\ \mathrm{df}(F))
\]

Note that the F(C) statistic is different from an overall F test, which tests whether or not there is a regression relationship between the dependent variable and the set of independent variables. Most regression textbooks provide a discussion of tests about the regression coefficients.
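Since the macro reports R-square changes, it may help to restate the partial F in terms of R-square. The identity below is a standard textbook result rather than something taken from the macro source. It assumes both models contain an intercept and share the same response, so that each error sum of squares equals (1 - R-square) times a common total sum of squares (SST), which then cancels. The symbols R^2_R and R^2_F (R-square of the reduced and full models) are notation introduced here.

```latex
\[
F(C) = \frac{\left[ R^2_F - R^2_R \right] / \left[ \mathrm{df}(R) - \mathrm{df}(F) \right]}
            {\left( 1 - R^2_F \right) / \mathrm{df}(F)},
\qquad \text{since } \mathrm{SSE}(R) = \left( 1 - R^2_R \right) \mathrm{SST}
\text{ and } \mathrm{SSE}(F) = \left( 1 - R^2_F \right) \mathrm{SST}.
\]
```

When one variable is added at a time, df(R) - df(F) = 1, so the numerator is simply the change in R-square.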

The arguments may be listed within parentheses in any order, separated by commas. For example:

%rsqdelta(data=inputdataset, yvar=response, xvar=independentvars ..., )

- DATA=_LAST_
  The name of the input dataset. If not specified, the most recently created dataset is used.
- YVAR=
  The name of the response (dependent) variable.
- XVAR=
  A list of the independent variables. List the names of your independent variables in the order in which you want them added to the model. Variable list abbreviations (e.g., X1-X10 or FIRST--LAST) are NOT allowed.

%include macros(rsqdelta);        *-- or include in an autocall library;
%include data(fitness);
%rsqdelta(data=fitness, yvar=oxy, xvar=runtime age runpulse maxpulse weight);

Change in R-square & F statistics as variables are added

OBS    _MODEL_     _RSQ_      RSQDELTA        F        PROB
 1     INTERCEP    0.00000     .          2451.73    0.00000
 2     RUNTIME     0.74338    0.74338       84.01    0.00000
 3     AGE         0.76425    0.02087        2.48    0.12666
 4     RUNPULSE    0.81109    0.04685        6.70    0.01537
 5     MAXPULSE    0.83682    0.02572        4.10    0.05330
 6     WEIGHT      0.84800    0.01118        1.84    0.18714
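One way to sanity-check a listing like this is to recompute RSQDELTA, F and PROB from the _RSQ_ column alone. With one variable added per step, the partial F can be written as the R-square change divided by (1 - R-square of the larger model) / (its error degrees of freedom). The DATA step below is only a sketch: the sample size n = 31 is an assumption about the fitness data and is not stated on this page, and the names check, step and df_full are hypothetical.

```sas
/* Recompute RSQDELTA, F and PROB from the printed _RSQ_ values.   */
/* ASSUMPTION: n = 31 observations; adjust N for other data.       */
data check;
   retain n 31;
   input step _model_ :$8. _rsq_;
   prev = lag(_rsq_);                    /* R-square of the previous model  */
   if step > 0 then do;
      rsqdelta = _rsq_ - prev;           /* change in R-square              */
      df_full  = n - step - 1;           /* error df of the larger model    */
      f        = rsqdelta / ((1 - _rsq_) / df_full);
      prob     = 1 - probf(f, 1, df_full);   /* upper-tail p-value          */
   end;
   drop prev n;
   datalines;
0 INTERCEP 0.00000
1 RUNTIME  0.74338
2 AGE      0.76425
3 RUNPULSE 0.81109
4 MAXPULSE 0.83682
5 WEIGHT   0.84800
;
run;

proc print data=check;
run;
```

Under that assumption, the recomputed values for rows 2 through 6 should land close to the listing, up to rounding of the printed R-square values. The INTERCEP row is left missing because its F statistic cannot be recovered from R-square.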

dummy Construct dummy variables for regression models

resline Resistant line for bivariate data