sprdplot: Find power transformations to equalize variance

SAS Macro Programs: sprdplot

$Version: 1.2-1 (27 Feb 2013)
Michael Friendly
York University

The sprdplot macro (sprdplot.sas)

Find power transformations to equalize variance

The sprdplot macro produces a spread-level plot to determine if a simple power transformation can equalize within-group variance of a response variable in a dataset classified by one or more classification variables.

The spread-level plot has the property that *if* the relationship between log10(Interquartile range) and log10(Median) is reasonably linear, then the recommended power is p = 1 - slope, and the transformation is

           / y**p,         p > 0
     y --> | log(y),       p = 0
           \ -100 * y**p,  p < 0
The macro chooses the best power(s) from a list of simple integers and half-integers (PLIST=), and creates new variables using those transformations.
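
As a rough illustration of the rule above, the following DATA step applies one chosen power by hand. It is only a sketch. The macro variable P, the output dataset TRANS, and the use of the natural log for p = 0 are assumptions made here, not details taken from the macro source. ANIMALS and TIME refer to the example dataset later on this page.

```sas
%let p = -1;    *-- hypothetical power, e.g. as suggested by the plot;

data trans;     *-- TRANS is an illustrative dataset name;
   set animals;
   if (&p) > 0 then
      t_time = time ** (&p);
   else if (&p) = 0 then
      t_time = log(time);             *-- assumes natural log;
   else
      t_time = -100 * (time ** (&p)); *-- the sign change keeps the order of values;
run;
```

The macro itself does this for the power(s) it selects from PLIST=, so a step like this is only needed to apply a power outside that list.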

Method

The power is determined from the slope of a weighted linear regression of log10(IQR) on log10(Median), using sample sizes as weights.
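
The calculation can be sketched with standard procedures, which may help when checking the macro's choice by hand. This is a minimal sketch and not the macro's actual code. The dataset names _SUMRY_, _PARMS_ and _POWER_ and the variables LOGMED and LOGIQR are made up for the illustration, and ANIMALS, TIME, POISON and TREATMT come from the example later on this page.

```sas
*-- Median, interquartile range and sample size for each group;
proc summary data=animals nway;
   class poison treatmt;
   var time;
   output out=_sumry_ median=median qrange=iqr n=n;
run;

*-- Put both summaries on the log10 scale;
data _sumry_;
   set _sumry_;
   logmed = log10(median);
   logiqr = log10(iqr);
run;

*-- Weighted regression of log10(IQR) on log10(Median), weights = n;
proc reg data=_sumry_ outest=_parms_ noprint;
   weight n;
   model logiqr = logmed;
run; quit;

*-- In the OUTEST= dataset the slope is stored under the regressor name;
data _power_;
   set _parms_;
   slope = logmed;
   power = 1 - slope;
   keep slope power;
run;

proc print data=_power_;
run;
```

The fitted power is generally not a round number; the macro then picks the nearest value(s) from PLIST=.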

Usage

The SPRDPLOT macro is defined with 11 keyword parameters. The VAR= and CLASS= parameters are required. The arguments may be listed within parentheses in any order, separated by commas. For example:
  %sprdplot(data=animals, var=survive, class=treat poison);

Parameters

Default values are shown after the name of each parameter. Required parameters are marked [R].

DATA=    Name of the input dataset [Default: DATA=_LAST_]
CLASS=   [R] Names of one or more class variables. Only the first CLASS= variable is used as point labels in the graphics plot.
VAR=     [R] Name of the variable to be transformed. Must be numeric, and should contain all positive values.
OFFSET=  Constant added to the VAR= variable before transformation. If the variable contains negative values, OFFSET is set equal to the abs(minimum) value, to ensure that all values are positive.
PREFIX=  Prefix for name of transformed variable. If the PREFIX is T_ and BEST=1, the transformed variable is named T_&var. If BEST>1, the variables are named T_1&var, T_2&var, ... [Default: PREFIX=T_]
PLIST=   List of powers to consider. Should be a blank-separated list of numbers in increasing order. [Default: PLIST=-3 -2 -1 -.5 0 .5 1 2 3]
BEST=    Number of best powers to transform &var [Default: BEST=1]
PPLOT=   Produce a printer plot? [Default: PPLOT=N]
GPLOT=   Produce a graphics plot? [Default: GPLOT=Y]
HTEXT=   Height of text in graphics plot [Default: HTEXT=1.7]
OUT=     Name of the output dataset [Default: OUT=&DATA]
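
To show how the optional parameters combine, here is a hypothetical call that overrides several defaults. The output dataset name ANIMALS2 and the shortened PLIST= are illustrative choices only. ANIMALS, TIME, POISON and TREATMT are taken from the example on this page.

```sas
*-- Consider a narrower list of powers, keep the two best,
    and write the result to a new dataset rather than the input;
%sprdplot(data=animals, var=time, class=poison treatmt,
          plist=-2 -1 -.5 0 .5 1, best=2,
          pplot=Y, gplot=N, out=animals2);
```

Going by the PREFIX= description above, with BEST=2 and the default prefix the transformed variables would be expected to be named T_1TIME and T_2TIME.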

Example

The data give survival times (in 10 hour units) of animals exposed to one of 3 types of poison and given one of 4 treatments, in a (3 x 4) design, with 4 replications. Box and Cox (1964) showed that a reciprocal transformation is reasonable.
%include macros(sprdplot);        *-- or include in an autocall library;

title 'Survival times of animals';
* Hand et al. #403, from Box & Cox;
data animals;
   do poison=1 to 3;
      do rep = 1 to 4;
         do treatmt='A', 'B', 'C', 'D';
            input time @;
            time = time*10;      *-- convert 10-hour units to hours;
            output;              *-- one observation per animal;
            end;
         end;
      end;
   label treatmt='Treatment' time='Survival time (hrs)';
datalines;
0.31  0.82  0.43  0.45  0.45  1.10  0.45  0.71
0.46  0.88  0.63  0.66  0.43  0.72  0.76  0.62
0.36  0.92  0.44  0.56  0.29  0.61  0.35  1.02
0.40  0.49  0.31  0.71  0.23  1.24  0.40  0.38
0.22  0.30  0.23  0.30  0.21  0.37  0.25  0.36
0.18  0.38  0.24  0.31  0.23  0.29  0.22  0.33
;
*-- Check for variance dependent on mean;
%sprdplot(data=animals, class=poison treatmt, var=time);
This produces the spread-level plot and chooses p = -1 (i.e., -100/Time). The transformed values are saved in the variable T_TIME.
*-- Analyze the transformed response (T_TIME = -100/Time);
proc glm data=animals;
   class poison treatmt;
   model t_time = poison | treatmt;
run;

See also