lags: Macro for lag sequential analysis

Visualizing Categorical Data: lags

Version: 1.2 (2 Feb 2001)
Michael Friendly
York University

Macro for lag sequential analysis

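Given a variable containing event codes (character or numeric), the LAGS macro creates (1) a dataset containing the lagged event variables and, optionally, (2) a dataset containing the n-way frequency table of the events at each lag.
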
Either or both of these datasets may be used for subsequent analysis of sequential dependencies. One or more BY= variables may be specified, in which case separate lags and frequencies are produced for each value of the BY variables. A WEIGHT= variable may also be specified, giving frequencies weighted by that variable. For example, using the duration of an event as a weight gives 'frequencies' which represent the total number of time units for state sequential data.
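
As an illustration of the WEIGHT= option, the call below sketches how durations might be used as weights. The dataset STATES, the variables STATE, SUBJECT and DURATION, and the output name WFREQ are hypothetical names chosen for this example; only the option names come from this page.

```sas
/* Hypothetical state-sequence data: one observation per state,   */
/* with DURATION holding the number of time units in that state.  */
/* Weighting by DURATION makes the tabulated 'frequencies' sum    */
/* time units rather than count events.                           */
%lags(data=states, var=state, by=subject, weight=duration, outfreq=wfreq);
```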


One event variable must be specified with the VAR= option. All other options have default values. If one or more BY= variables are specified, lags and frequencies are calculated separately for each combination of values of the BY= variable(s).

The arguments may be listed within parentheses in any order, separated by commas. For example:

  %lags(data=codes, var=event, nlag=2)


DATA=     The name of the SAS dataset to be lagged. If DATA= is not specified, the most recently created dataset is used.
VAR=      The name of the event variable to be lagged. The variable may be either character or numeric.
BY=       The name of one or more BY variables. Lags are restarted for each level of the BY variable(s). The BY variables may be character or numeric.
WEIGHT=   Specifies a numeric variable whose value represents the frequency or weight of an observation. The weight values must be non-negative, but need not be integers.
An optional format for the event VAR= variable. If the codes are numeric and a format specifying what each number means is used (e.g., 1='Active' 2='Passive'), the output lag variables are given the formatted character values.
NLAG=     Number of lags to compute. Default = 1.
Name of the output dataset containing the lagged variables. This dataset contains the original variables plus the lagged variables, named according to the PREFIX= option.
PREFIX=   Prefix for the names of the created lag variables. The default is PREFIX=_LAG, so the variables created are named _LAG1, _LAG2, ..., up to _LAG&nlag. For convenience, a copy of the event variable is created as _LAG0.
FREQOPT=  Options for the TABLES statement used in PROC FREQ for the frequencies of each of lag1-lagN vs lag0 (the event variable). The default is FREQOPT= NOROW NOCOL NOPERCENT CHISQ.

Arguments pertaining to the n-way frequency table:

OUTFREQ=  Name of the output dataset containing the n-way frequency table. The table is not produced if this argument is not specified.
NO or ALL. Specifies whether the n-way frequency table is to be made 'complete' by filling in 0 frequencies for lag combinations that do not occur in the data.


Assume that a series of 16 events has been coded with the 3 codes a, b, c for each of 2 subjects, as follows:
 Sub1:   c   a   a   b   a   c   a   c   b   b   a   b   a   a   b   c
 Sub2:   c   c   b   b   a   c   a   c   c   a   c   b   c   b   c   c

and these have been entered as the 2 variables SEQ (subject) and CODE in the dataset CODES (only the first four observations for each subject are listed):

        SEQ    CODE

        1      c
        1      a
        1      a
        1      b
        2      c
        2      c
        2      b
        2      b
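
One way the CODES dataset could be created from the two sequences above is sketched below. This DATA step is an assumption for illustration (the page does not show how the data were entered); the loop counter I is a helper name that is dropped from the output.

```sas
/* Read one line per subject: the subject number followed by 16 codes. */
/* Each code is written out as its own observation (SEQ, CODE).        */
data codes;
   input seq @;              /* read subject id, hold the line   */
   do i = 1 to 16;
      input code $ @;        /* read the next event code         */
      output;
   end;
   drop i;
datalines;
1 c a a b a c a c b b a b a a b c
2 c c b b a c a c c a c b c b c c
;
run;
```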

Then the macro call:

   %lags(data=codes, var=code, by=seq, outfreq=freq);

produces the lags dataset _lags_, which for NLAG=1 looks like this (first five observations per subject shown):

  SEQ    CODE    _LAG0    _LAG1

   1      c        c         
          a        a        c
          a        a        a
          b        b        a
          a        a        b

   2      c        c         
          c        c        c
          b        b        c
          b        b        b
          a        a        b
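
For readers who want to see how lagged variables of this kind arise, the following DATA step is a minimal sketch of the lag-1 case using the SAS LAG function. It is not the macro's own code; the output name LAGGED is hypothetical, and the step assumes CODES is already sorted by SEQ.

```sas
/* Hand-rolled lag-1 computation, restarted for each subject. */
data lagged;
   set codes;
   by seq;
   _lag0 = code;                   /* copy of the event variable          */
   _lag1 = lag(code);              /* LAG must execute on every row       */
   if first.seq then _lag1 = ' ';  /* no lag carried across subjects      */
run;
```

Calling LAG unconditionally and blanking the result afterwards keeps the function's internal queue in step with the data; calling it only inside a condition would return values from the wrong observation.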

The output 2-way frequency table (outfreq=freq) looks like this:

  SEQ    _LAG0    _LAG1    COUNT

   1       a        a        2
           b        a        3
           c        a        2
           a        b        3
           b        b        1
           c        b        1
           a        c        2
           b        c        1
           c        c        0

   2       a        a        0
           b        a        0
           c        a        3
           a        b        1
           b        b        1
           c        b        2
           a        c        2
           b        c        3
           c        c        3
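
As a cross-check, the lag-1 counts above can be tabulated directly from the _lags_ dataset with PROC FREQ, and the FREQ dataset can then be passed to a loglinear model. The steps below are only a sketch under stated assumptions, not the macro's own code: they assume _lags_ and FREQ exist as in this example and are sorted by SEQ, and the dataset name CHECK is hypothetical.

```sas
/* Cross-tabulate lag 1 by lag 0 within each subject.              */
/* Observations with a missing _LAG1 (the first event of each      */
/* subject) are excluded by PROC FREQ by default. SPARSE writes    */
/* zero-count cells to the output dataset.                         */
proc freq data=_lags_;
   by seq;
   tables _lag1 * _lag0 / norow nocol nopercent chisq
                          sparse out=check;   /* CHECK is a hypothetical name */
run;

/* Independence model for the 2-way table of counts in FREQ. */
proc genmod data=freq;
   by seq;
   class _lag0 _lag1;
   model count = _lag0 _lag1 / dist=poisson link=log;
run;
```

For a 3 x 3 table the independence model leaves 4 degrees of freedom, so its deviance gives a test of whether the current event depends on the preceding one.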

See also