lags 
Macro for lag sequential analysis 
Visualizing Categorical Data: lags
$Version: 1.2 (2 Feb 2001)
Michael Friendly
York University
Given a variable containing event codes (char or numeric), the LAGS macro
creates:

a dataset containing n+1 lagged variables, _lag0 to _lagN (_lag0 is just a
copy of the input event variable)

optionally, an (n+1)-way contingency table containing frequencies of all
combinations of events at lag0 to lagN
Either or both of these datasets may be used for subsequent analysis of
sequential dependencies.
One or more BY= variables may be specified, in which case separate lags and
frequencies are produced for each value of the BY variables.
A WEIGHT= variable may also be specified,
giving frequencies weighted by that variable. For example, using the
duration of an event as a weight gives 'frequencies' which represent
the total number of time units for state sequential data.
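A weighted call might look like the following minimal sketch. It assumes the input dataset has a numeric variable holding each event's duration; the names CODES, EVENT, DURATION, and DURFREQ are hypothetical and not part of the macro.

```sas
/* Hypothetical names: CODES holds an event variable EVENT and a numeric DURATION */
%lags(data=codes, var=event, weight=duration, outfreq=durfreq);
```

With such a call, the counts in DURFREQ would be weighted by DURATION rather than being simple tallies of event pairs.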
One event variable must be specified with the VAR= option. All other options
have default values. If one or more BY= variables are specified, lags and
frequencies are calculated separately for each combination of values of the
BY= variable(s).
The arguments may be listed within parentheses in any order, separated by
commas. For example:
%lags(data=codes, var=event, nlag=2)
 DATA=

The name of the SAS dataset to be lagged. If DATA= is not specified, the most recently created data set is used.
 VAR=

The name of the event variable to be lagged. The variable may be either
character or numeric.
 BY=

The name of one or more BY variables. Lags will be restarted for each level
of the BY variable(s). The BY variables may be character or
numeric.
 WEIGHT=

Specifies a numeric variable whose value represents the
frequency or weight of an observation. The weight values
must be nonnegative, but need not be integers.
 VARFMT=

An optional format for the event VAR= variable. If the codes are numeric, and a format specifying what each
number means is used (e.g., 1='Active' 2='Passive'), the output lag
variables will be given the character values.
 NLAG=

Number of lags to compute. Default = 1.
 OUTLAG=

Name of the output dataset containing the lagged variables. This dataset
contains the original variables plus the lagged variables, named according
to the PREFIX= option.
 PREFIX=

Prefix for the name of the created lag variables. The default is
PREFIX=_LAG, so the variables created are named _LAG1, _LAG2, ..., up to
_LAG&nlag. For convenience, a copy of the event variable is created as
_LAG0.
 FREQOPT=

Options for the TABLES statement used in PROC FREQ for the frequencies of
each of lag1 to lagN vs. lag0 (the event variable). The default is
FREQOPT=NOROW NOCOL NOPERCENT CHISQ.
Arguments pertaining to the n-way frequency table:
 OUTFREQ=

Name of the output dataset containing the n-way frequency table. The table
is not produced if this argument is not specified.
 COMPLETE=

NO or ALL. Specifies whether the n-way frequency table is to be made
'complete' by filling in 0 frequencies for lag combinations which do not
occur in the data.
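As an illustration of the VARFMT= option described above, the sketch below defines a format for numeric event codes and passes it to the macro. The format name ACTFMT and the dataset and variable names are hypothetical. It also assumes that VARFMT= expects the format name written with its trailing period, as formats usually are in SAS; check the macro source if in doubt.

```sas
/* Hypothetical format mapping numeric event codes to labels */
proc format;
   value actfmt 1='Active' 2='Passive';
run;

/* Assumption: VARFMT= takes the format name with its trailing period */
%lags(data=codes, var=event, varfmt=actfmt., nlag=1);
```

If the assumption holds, the lag variables in the output dataset would contain 'Active' and 'Passive' rather than 1 and 2.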
Assume that a series of 16 events has been coded with the 3 codes a, b, c
for each of 2 subjects as follows:
Sub1: c a a b a c a c b b a b a a b c
Sub2: c c b b a c a c c a c b c b c c
and these have been entered as the 2 variables SEQ (subject) and CODE in
the dataset CODES:
SEQ CODE
1 c
1 a
1 a
1 b
....
2 c
2 c
2 b
2 b
....
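One way to build such a dataset from the two sequences above is a DATA step along the following lines. This is only a sketch: the loop counter I is a helper variable introduced here and is dropped from the output.

```sas
/* Read one subject per line: the subject number, then its 16 event codes */
data codes;
   input seq @;            /* hold the line after reading the subject */
   do i = 1 to 16;
      input code $ @;      /* read the next event code from the same line */
      output;              /* write one observation per event */
   end;
   drop i;
datalines;
1 c a a b a c a c b b a b a a b c
2 c c b b a c a c c a c b c b c c
;
```

This gives 32 observations, 16 for each value of SEQ, in the order in which the events occurred.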
Then the macro call:
%lags(data=codes, var=code, by=seq, outfreq=freq);
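produces the lags dataset _lags_ for NLAG=1, which looks like this:
SEQ  CODE  _LAG0  _LAG1
 1    c     c
      a     a      c
      a     a      a
      b     b      a
      a     a      b
....
 2    c     c
      c     c      c
      b     b      c
      b     b      b
      a     a      b
....
_LAG1 is blank for the first event of each subject, since that event has no
preceding event within its BY group.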
The output 2-way frequency table (outfreq=freq) looks like this:
SEQ  _LAG0  _LAG1  COUNT
 1    a      a       2
      b      a       3
      c      a       2
      a      b       3
      b      b       1
      c      b       1
      a      c       2
      b      c       1
      c      c       0
 2    a      a       0
      b      a       0
      c      a       3
      a      b       1
      b      b       1
      c      b       2
      a      c       2
      b      c       3
      c      c       3
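As one possible follow-up analysis, the frequency dataset can be passed back to PROC FREQ with COUNT as a weight to test each subject's lag1 by lag0 table for sequential dependence. This is a sketch only, not part of the macro. It assumes FREQ is sorted by SEQ, as the listing suggests, and uses the variable names shown there.

```sas
/* Chi-square test of lag1 x lag0 association, separately for each subject */
proc freq data=freq;
   by seq;                 /* assumes FREQ is sorted by SEQ */
   weight count;           /* cell frequencies from the macro output */
   tables _lag1 * _lag0 / chisq norow nocol nopercent;
run;
```

With only 15 transitions per subject, the expected cell counts are small, so the chi-square results for this toy example should be read with caution.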
See also
meanplot
panels
scatmat
stat2dat