/*--------------------------------------------------------------------* | Name: mosaics.sas | | Title: IML modules for general n-way mosaic display | | | | This program defines the modules. Use mosaicm.sas to install them | | in a SAS/IML storage catalog. | *--------------------------------------------------------------------* | Original documentation: ``Users Guide to MOSAICS: A SAS/IML | | program for Mosaic Displays'', Dept of Psychology Report 206 | | For current version, see: | | http://www.math.yorku.ca/SCS/mosaics/mosaics.html | *--------------------------------------------------------------------* * Author: Michael Friendly * * Created: 11 Aug 1990 10:27:11 (c) 1990-2003 * * Revised: 31 Mar 2009 08:58:44 * * Version: 3.6-1 * * Changes: * * Added fittype='PARTIAL' * * Stop fitting after last plot * * (2.7) * * Filltype changed to allow separate coding for + and - residuals * * and grayscale shading (filltype='GRAY') * * Colors made global * * (2.9) Added cellfill to print residual symbol * * (3.0) Added MARKOVk fittype; fit equiprobability model for f=1 * * (3.1) Added readtab routine to read freq, labels from a SAS * * dataset; devtype='FT' for Freeman-Tukey residuals; * * (3.2) Added handling of structural zeros * * Changed default values to filltype=HLS, colors={BLUE RED} * * (3.3) Added gskip module, for EPS output. Added &X2 for title * * Added makemap stub, fuzz value for 0 residuals * * (3.4) Added vlabels control, fuzz now sets line style solid. * * Global variables in separate module to make changing defaults * * easier. In transpos module, can specify the variable names * * in the new order, rather than indices. Same for config. * * Added JOINTk and CONDITk models (1<=k<=n) * * (3.5) Fixed conflict between global var DEVTYPE and macro var * * Changed circle blanking for CELLFILL to white/black text * * Added control of threshold for CELLFILL * * Added calculation of adjusted residuals * * Default font now depends on device driver * * Added NAME global for grseg graph names; fixed adj res bug * * Added CELLFILL='FREQ' to display cell frequency * * Added ABBREV=# to abbreviate variable names in model * * (3.6) Added OUTSTAT global variable to generate output data set. * * 'reorder' changed to 'transpos'; added str2vec * * Added GOUT global variable for graphics catalog entry * * Added WINDOW global variable to control window size * * Fixed problem with #MODEL not being substituted * *--------------------------------------------------------------------*/ /*= =Usage: proc iml worksize= symsize= ; reset storage=mosaic.mosaic; load module=_all_; shade = { }; space = { }; verbose = ; *--see global inputs; run mosaic(levels, table, vnames, lnames, plots, title); where: levels Vector of number of levels of each factor. table Table of cell frequencies, as in IPF: first factor varies most rapidly along cols, last factor varies most slowly down the rows. If elements of table are a single variable in a SAS data set (e.g., output from PROC FREQ), with factors={A B C}, the rows (obs.) should be sorted BY C B A to obtain IPF ordering. vnames Vector of factor names, order corresponding to levels lnames Matrix of level names: rows=factors, cols=max(levels) lnames[i,1:levels[i]] gives level names for factor i. plots List of margins to be plotted: vector of any of the integers 1 - ncol(levels). If plots contains i the var1 x var2 x ... var i margin will be plotted. title Character string(s) title for plots. If title is a vector, then title[i] is used for plots = i. If title contains '&MODEL', or #MODEL the model fitted is substituted. Similarly for &G2, #G2, &X2, #X2 =Global input variables: colors Colors used for positive and negative residuals. Default: {BLUE RED}. config IPF-style model configuration for fittype='USER'. Ignored for other fittypes. devtype Type of deviations to be represented by shading. 'GF' calculates components of Pearson goodness of fit chisquare; 'LR' calculates components of likelihood ratio chisquare. 'FT' calculates Freeman-Tukey residuals. devtype='GF' is the default. fittype Type of sequential models to fit: JOINT,MUTUAL,PARTIAL, CONDIT, or USER. If fittype='USER', specify the model in the matrix config. filltype Type of fill pattern to use for shading. A vector of one or two character strings. filltype[1] is used for positive residuals; filltype[2], if present, is used for negative residuals. 'LR' --> patterns Ld, Rd, where d=density value 'M0' --> patterns MdN0, MdN90 'M45' --> patterns MdN135, Md45 [default] 'GRAY'--> patterns GRAYnn 'HLS' --> solid HLS colors, varying in lightness cellfill What to write about residuals in cells with large ones. 'NONE' --> nothing 'FREQ' --> cell frequency 'SIGN' --> draws + or - symbols, # = shading level 'SIZE' --> draws + or - symbols, size ~ shading 'DEV' --> write residual value htext Height of text labels [default: 1.3] font Font for text labels [default: DUPLEX] legend Orientation of legend for shading of residual values in mosaic tiles. Possible values are 'H', 'V', and 'NONE'. Default: 'NONE'. order Specifes whether/how to do a correspondence analysis on each marginal subtable. See the Users Guide. outstat Name of output data set shade Vector of up to 5 values of abs(dev[i]) for boundaries between shading levels. If shade={2 4}; (the default), then shading density is: 0 <= |dev[i]| < 2 -> 0 (empty) 2 <= |dev[i]| < 4 -> 1 4 <= |dev[i]| -> 2 Use shade= a big number to suppress all shading. fuzz Values |dev[i]|< fuzz are outlined in black. space 2-vector of x,y: amount of plotting area reserved for spacing. Typically {20 20}. verbose Controls verbose or detailed output: 'FIT' and/or 'BOX' zeros A 0/1 matrix of the same size/shape as table where 0 indicates that the corresponding value in table is to be treated as missing or a structural zero. =*/ *title 'SAS/IML modules for mosaic displays'; *version(6.06); *-- Requires SAS Version 6.06 or later; start mosaic(levels, table, vnames, lnames, plots, title) global(config, devtype, fittype, filltype, shade, space, split, legend, colors, htext, font, verbose, cellfill, order, zeros, fuzz, vlabels, name, abbrev, outstat, gout, window); print / '+-------------------------------------------+', '| Generalized Mosaic Display, Version 3.6 |', '+-------------------------------------------+'; if rowcatc(type(levels)||type(table)||type(vnames)||type(lnames)) ^= 'NNCC' then do; print 'ERROR: One or more arguments are not defined or of the wrong type'; show levels table vnames lnames; goto done; end; if nrow(vnames)=1 then vnames=vnames`; if nrow(levels)=1 then levels=levels`; print title, vnames levels ' ' lnames; *-- Check conformability of arguments --; f = nrow(vnames)||nrow(lnames); if levels[#] ^= nrow(table)#ncol(table) then do; print 'ERROR: TABLE and LEVEL arguments not conformable'; show levels table; goto done; end; if ^all(f = nrow(levels)) then do; print 'ERROR: VNAMES or LNAMES not conformable with LEVELS'; show levels vnames lnames; goto done; end; run globals; print 'Global options', fittype devtype filltype split shade[f=3.0] colors ,htext font legend cellfill verbose fuzz; if fittype='USER' then do; if type(CONFIG) = 'C' then config = name2num(config, vnames); if type(CONFIG) = 'N' then do; print 'Fitting user specified model', CONFIG; end; else do; print "CONFIG must be specified for fittype='USER'"; goto done; end; end; f = nrow(levels); whichway = num(translate(split,'10','HV')); dir = shape(whichway,f,1); if type(space) ^= 'N' then space = 10# (sum(dir=0) || sum(dir=1)); savspace = space; call gstart(gout); *-- divide the plot into boxes --; run divide(levels, table, vnames, lnames, plots, dir, title); call gstop; space = savspace; done: finish; start globals; *-- Check global inputs, assign default values if not assigned --; * If you dont like these default values, change them; if type(filltype) ^= 'C' then filltype = {HLS HLS}; if type(fittype) ^= 'C' then fittype = 'JOINT'; if type(devtype) ^= 'C' then devtype = 'GF'; /* type of deviations: GF,LR */ if type(shade) ^= 'N' then shade = {2 4}; /* shading levels */ if type(split) ^= 'C' then split = {H V}; /* divide H V H V ... */ if type(htext) ^= 'N' then htext = 1.4; /* height of text labels */ if type(colors) ^= 'C' then colors= {BLUE RED}; if type(legend) ^= 'C' then legend= 'NONE'; if type(cellfill) ^= 'C' then cellfill = 'NONE'; if type(fuzz) ^= 'N' then fuzz = .20; /* fuzz value for 0 residuals */ if type(order) ^= 'C' then order='NONE'; if type(verbose) ^='C' then verbose = 'NONE'; /* verbose output? */ if type(vlabels) ^= 'N' then vlabels = 2; /* variable labels up to 2 vars */ if type(name) ^= 'C' then name='MOSAIC'; if type(abbrev) ^= 'N' then abbrev=0; if type(outstat) ^= 'C' then outstat=''; if type(gout) ^= 'C' then gout='gseg'; if type(window) ^= 'N' then window={-16 -16, 108 108}; *-- Set default font based on device driver name; if type(font ) ^= 'C' then do; call execute('device = upcase("&sysdevic");'); if index(device,'PS') > 0 then font= 'hwpsl009'; /* Helvetica for PS drivers */ else /* if device='WIN' then */ font = 'SWISS'; end; fittype = upcase(fittype); devtype = upcase(devtype); cellfill= upcase(cellfill); colors = upcase(colors); order = upcase(order); split = upcase(split); verbose = upcase(verbose); finish; start divide(levels, table, vnames, names, plots, dir, title) global(config, devtype, fittype , shade, space, verbose, order, zeros, abbrev); *-- start with origin in lower left corner --; length= {100 100}; /* x,y length of box area */ boxes = {0 0} /* lowerleft x,y */ ||( length - space ); /* length x,y */ factors = nrow( levels ); *-- structural zeros or missing values?; if type(zeros) = 'U' then zeros = j(1,levels[#]); else zeros = shape(zeros,1, levels[#]); miss = loc(table = .); if ncol(miss)>0 then table[miss] = 0; miss = loc( row((table = .)) | (zeros = 0) ); missing = (ncol(miss) > 0); tab = shape(table, 1, levels[#]); if missing then tab[miss] = 0; rows = 1; /* number of rows in margins */ do f = 1 to factors while (f <= max(plots) ); whichway =dir[f]; cols = levels[f]; /* number of cols in margins */ reset noname; print "Factor:" f (vnames[f]); reset name; mconfig = (f:1)`; * n sub f, (f-1), ..., 1 ; call marg(loc,margin,levels,tab, mconfig); * margin, with zeros; call marg(loc,mtab,levels,table,mconfig); * margin, original data; call marg(loc,mzero,levels,zeros,mconfig); * marginal zeros; mzero = mzero > 0; *-- Construct row and column labels for marginal table; cl = names[f,]; if f=1 then rl=''; else do; /* construct row labels for prior factors */ ol = repeat(rl, 1, levels[f-1]); ol = shape(ol, levels[f-1]#nrow(rl), 1); nl = repeat((names[f-1,1:(levels[f-1])])`, nrow(rl)); rl = concat(ol,nl); end; margin = (shape(margin, rows)) ; mzero = (shape(mzero, rows)) ; if f=1 then do; * dev = J(1,ncol(margin),0); *-- fit equiprobability model; fconfig = 1; model = compress("(="+vnames[1]+")"); fit = margin[+]/ncol(margin); fit = repeat(fit, 1, ncol(margin)); dev = (margin - fit) / sqrt(fit); end; else do; if fittype='JOINT' then do; fconfig = (1 : (f-1))` || shape(f, f-1, 1, 0); dim = ncol(margin) || rows; call ipf(fit,status,dim,margin,{1 2},mzero); run mfix(mzero,fit,margin,mtab); end; else if substr(fittype,1,5)='JOINT' then do; if f=2 then fconfig= 1:2; else do; if length(fittype)=5 then k=f; else k = num(substr(fittype,6)); fconfig = remove(1:f, k)` || shape(k, f-1, 1, 0); end; run mfit(margin, fconfig, levels, f, fit, status); run mfix(mzero,fit,margin,mtab); end; else if fittype='MUTUAL' then do; fconfig = 1:f; dim = levels[f:1]; call ipf(fit,status,dim,margin,fconfig,mzero); run mfix(mzero,fit,margin,mtab); end; else if substr(fittype,1,6)='CONDIT' then do; if f=2 then fconfig= 1:2; else do; if length(fittype)=6 then k=f; else k = num(substr(fittype,7)); fconfig = remove(1:f, k) // j(1, f-1, k); * fconfig = (1 : (f-1)) // j(1, f-1, k); end; run mfit(margin, fconfig, levels, f, fit, status); run mfix(mzero,fit,margin,mtab); end; else if fittype='PARTIAL' then do; if f=2 then fconfig= 1:2; else fconfig =(1:2) // ((3:f)`|| (3:f)`); run mfit(margin, fconfig, levels, f, fit, status); run mfix(mzero,fit,margin,mtab); end; else if substr(fittype,1,6)='MARKOV' then do; *-- determine order of Markov chain requested; if length(fittype)=6 then k=1; else k = num(substr(fittype,7)); if factors < (k+2) then do; print 'Warning: Not enough factors for order' k 'Markov chain'; if f=2 then fconfig= 1:2; else fconfig =(1:(f-1)) // (2:f); end; else do; if f <= (k+1) then fconfig= 1:f; else do; free fconfig; do i=1 to k+1; fconfig = fconfig // (i:(f-k+i-1)); end; end; end; run mfit(margin, fconfig, levels, f, fit, status); run mfix(mzero,fit,margin,mtab); end; else if fittype='USER' then do; if f=factors then fconfig=config; else fconfig = reduce(config, f); run mfit(margin, fconfig, levels, f, fit, status); run mfix(mzero,fit,margin,mtab); end; if status[1]^=0 then print "IPF ended abnormally", status[c={Error MaxDev Iterations}]; end; *-- Calculate residuals --; if index(devtype, 'GF') /* Pearson residuals */ then do; dtyp='GF'; dev = (margin - fit)/sqrt(fit + (0=fit)); end; else if index(devtype,'LR') /* likelihood ratio resids */ then do; dtyp='LR'; dev = sign(margin-fit) # sqrt(2#( margin # (log((margin+(margin=0)) / (fit+(fit=0)))) -margin+fit )); end; /* Freeman-Tukey resids */ else do; dev = sqrt(margin) + sqrt(margin+1) - sqrt(4#fit + 1); dtyp='FT'; end; if (index(devtype,'ADJ') | index(devtype,'STD')) & any(plots=f) & f>1 then do; * print dtyp 'Residuals', dev[r=rl c=cl format=8.2]; run adjres(levels[1:f], fconfig, shape(fit`,0,1), shape(dev`,0,1), adj); dev = t(shape(adj,0,nrow(fit))); end; *-- Chisquare for current model --; chisq = chisq(margin,fit); df = df(levels[1:f,], fconfig) - sum(mzero=0); if df>0 & all(chisq>.0001) then prob = 1 - probchi(chisq,df); else prob = 1; if f>1 then run modname(fconfig,vnames,model); if any(f=plots) | any(verbose = 'CHISQ') then do; print model df ' ' chisq[c={'ChiSq'} r={'G.F.' 'L.R.'} format=9.3] prob [c={'Prob'} format=8.4]; end; *-- print fit and deviations matrices; reset noname; if any(verbose = 'FIT') then do; print 'Marginal totals' , margin[r=rl c=cl format=8.0]; print 'Fitted frequencies' , fit[r=rl c=cl format=8.2]; end; if prob < .999 & any(f=plots) then do; ltx = {'Pearson residuals' , 'Likelihood Ratio residuals', 'Freeman-Tukey residuals'}[loc(dtyp={GF LR FT})]; if (index(devtype,'ADJ') | index(devtype,'STD')) & f>1 then ltx = 'Adjusted ' + ltx; print (ltx), dev[r=rl c=cl format=8.2]; end; *-- Correspondence analysis to reorder this factor?; if f>1 & order ^= 'NONE' & missing=0 then run corresp( margin, dev, rl, cl); *-- Calculate proportions for each row over row totals, allowing for 0 margins; mar = row( margin ); margin = margin / ( ( margin[,+] + (0=margin[,+]) ) * J(1,levels[f]) ); if any(verbose = 'FIT') then print 'Marginal proportions', margin[r=rl c=cl format=8.4]; *-- Divide boxes for this factor; run divide1(levels, margin, boxes, f, whichway); reset name; if any(verbose = 'BOX') then do; bl = shape( (rl || J(rows,cols-1,' ')),rows#cols,1); print boxes[r=bl c={'BotX' 'BotY' 'LenX' 'LenY'} format=8.2]; end; run space( levels, boxes, f, dir ); run labels( levels, vnames, names, f, dir, boxes, labelx, labels ); if any(verbose = 'BOX') then print labelx[r=labels c={x y 'Angle' 'Height' 'Width'}] ; *-- display the mosaic for current margins ; if any(f=plots) then do; if nrow(title) < f then titl = title[nrow(title),]; else titl = title[f,]; if index(titl, '&MODEL') | index(titl, '#MODEL') > 0 then do; modl = compress(model," ,"); modl = translate(modl,"()()","{}[]"); call change(titl, '&MODEL', modl); call change(titl, '#MODEL', modl); end; if index(titl, '&G2') | index(titl, '#G2')> 0 then do; modl = 'G2 (' + trim(left(char(df,4,0))) + ') = ' + trim(left(char(chisq[2],10,2))) ; call change(titl, '&G2', modl); call change(titl, '#G2', modl); end; else if index(titl, '&X2') | index(titl, '#X2') > 0 then do; modl = 'X2 (' + trim(left(char(df,4,0))) + ') = ' + trim(left(char(chisq[1],10,2))) ; call change(titl, '&X2', modl); call change(titl, '#X2', modl); end; *-- ravel dev by rows, so positions match new boxes; dev = row( dev ); run gboxes( boxes, labels, labelx, dev, f, titl, mar ); run makemap( boxes, labelx, names, levels, dev, mtab, fit, f); run gskip; end; mod = model; run outstat( vnames, names, levels, dev, mtab, fit, f, mod); *-- prepare for next factor; rows = rows * levels[f]; end; /* do f=1 to factors */ finish; start mfit(margin, config, levels, f, fit, status); *-- Fit the model given by config to marginal in margin; mconfig = (f:1)`; * n sub f, (f-1), ..., 1 ; dim = levels[f:1]; rows = nrow(margin); *--turn margin inside-out to match config; call marg(loc,cmargin,dim,margin,mconfig); call ipf(cfit,status,levels[1:f],cmargin,config); *--turn fit inside-out to match margin; call marg(loc,fit,levels[1:f],cfit,mconfig); fit = shape(fit,rows); finish; start mfix(mzero,fit,margin,mtab); *-- Fix up fitted values and margin if missing or structural zeros; * if missing=0 then return; miss = loc(mzero = 0); if ncol(miss) > 0 then do; fit[miss] = mtab[miss]; margin[miss] = mtab[miss]; end; finish; start divide1( levels, margin, boxes, f, whichway ); *-- Divide each old box in proportion to the values in row i of margins; oldbox = boxes; /* save previous box locations */ ngp = nrow(margin); /* number of previous marginal totals */ nit = ncol(margin); /* number of divisions of each such */ free boxes; /* set up to append new divisions */ do i= 1 to ngp; * box we are currently dividing; *-- get coordinates of box to divide; cx = oldbox[i,1]; cy = oldbox[i,2]; lx = oldbox[i,3]; ly = oldbox[i,4]; marg = margin[i,]; p = cusum(marg); /* cumulative proportions */ p = 0//shape(p,nit-1,1); if whichway=1 /* dividing horizontally */ then do; thisbox = repeat(cx,nit,1) || repeat(cy,nit,1) + ( ly # p) || repeat(lx,nit,1) || repeat(ly,nit,1) # marg` ; end; else do; /* dividing vertically */ thisbox = repeat(cx,nit,1) + ( lx # p) || repeat(cy,nit,1) || repeat(lx,nit,1) # marg` || repeat(ly,nit,1) ; end; boxes = boxes // thisbox ; end; finish; start space( levels, boxes, f, dir ) global( verbose, space ); factors= nrow(levels); which = dir[f]; *-- determine space available to each variable ; ndir = sum( dir=0 ) || sum ( dir=1 ); * # splits each way; vspace = space / ndir; loc = 1 + which; /* 1:xspace 2:yspace */ units = 1; do i = 1 to f; if which=dir[i] then units = units * levels[i]; end; units = units-1; rows = nrow(boxes) / levels[f]; scale= J(rows,1) * (0:(levels[f]-1)); unit = vspace[loc] / units; coord= shape(boxes[,loc], rows, levels[f]); coord= coord + scale # unit ; boxes[,loc] = shape(coord,nrow(boxes)); space[loc] = space[loc] - vspace[loc]; if any(verbose = 'BOX') then print space vspace units unit; finish; start gboxes( boxes, labels, labelx, dev, fac, title, freq ) global( shade, verbose, filltype, htext, font, colors, legend, cellfill, fuzz, name, window ); *-- Draw and label the mosaic display --; call gopen(trim(name)+char(fac,1,0),0,title); *-- locate the 4 corners of each box; ll = boxes[,{1 2}]; lr = boxes[,{1 3}][,+] || boxes[,2] ; ul = boxes[,1] || boxes[,{2 4}][,+] ; ur = boxes[,{1 3}][,+] || boxes[,{2 4}][,+]; xy = ll || ul || ur || lr; max = max(ur[,1]) || max(ur[,2]); sf = {100 100} / max; * scale factor: expand to 100; if any(verbose = 'BOX') then print max[c={X Y}] sf[c={X Y}]; * window = {-16 -16, 108 108}; if legend ^= 'NONE' then window[2,]={116 116}; call gwindow(window); *-- Set parameters for filling boxes in various fill types; lines = {1 3}; run fillpat(fcolors,fpattern); centers = ((ll + ur) / 2) * diag(sf); fillmin = max(shade); if cellfill ^= 'NONE' then do; cfill = cellfill; cellfill = scan(cellfill, 1); fillmin = scan(cfill, 2,' '); if fillmin =' ' then fillmin = shade[1]; else fillmin = num(fillmin); dec = scan(cfill, 3,' '); if dec = ' ' then do; if cellfill = 'DEV' then dec = 1; else dec = 0; end; else dec=num(dec); end; free cxy txy hxy; do i=1 to nrow(boxes); bwidth = boxes[i,3]; bheight= boxes[i,4]; box = shape(xy[i,], 4) * diag(sf); *-- Make color and direction of shading reflect deviation from model; index = 1 + (round(dev[i],.001)<0); color = colors[index]; line = lines [index]; if (abs(dev[i])= shade ); * shading density; den = min(max( 0, den ), 5); * 0 <= den <= 5 ; if filltype ='DRAW' then pat='EMPTY'; else do; if den= 0 then pat = 'EMPTY'; * fill pattern ; else do; pat = fpattern[index,den]; fcol = fcolors[index,den]; end; end; *-- Writing something in the cell? Store locations and text; * if cellfill ^= 'NONE' & (den>0 | nrow(boxes)<12) then do; if cellfill ^= 'NONE' & (abs(dev[i])>=fillmin | nrow(boxes)<12) then do; cxy = cxy // centers[i,]; bxy = bxy // (bwidth || bheight); hxy = hxy // den; if cellfill = 'SIGN' then do; txy = txy // substr(({'+++++','-----'}[index]),1,max(1,den)); end; else if cellfill = 'SIZE' then do; txy = txy // ({'+','-'}[index]); end; else if cellfill = 'DEV' then do; txy = txy // compress(char(dev[i], 6, dec)); end; else do; /* cellfill = 'FREQ' */ txy = txy // compress(char(freq[i], 6, dec)); end; end; *-- Draw and fill the box; call gpoly( box[,1], box[,2], line, color, pat, fcol); end; * <---- loop over boxes; if filltype='DRAW' then run fillbox(boxes,dev,sf); *-- Adding cellfill annotations? ; if cellfill ^= 'NONE' & nrow(txy)>0 then do; cht = j(nrow(txy), 1, htext); if cellfill = 'SIGN' then cht = 1.2 # cht; if cellfill = 'SIZE' then cht = cht # hxy; do i=1 to nrow(txy); call gstrlen(wid, txy[i], cht[i]); call gstrlen(dep, '+', cht[i]); w = w // (wid || dep); /* if (wid < bxy[i,1]) & (dep < bxy[i,2]) then do; call gpie(cxy[i,1], cxy[i,2], wid/2, 0, 360, 'WHITE',, 'SOLID'); end; */ cellclr = {'BLACK' 'WHITE'}[1+(hxy[i]>=2)]; sx = cxy[i,1]-wid/2; sy = cxy[i,2]-dep/2; call gscript(sx, sy, txy[i], 0, 0, cht[i], , cellclr); end; if any(verbose='BOX') then do; print "cellfill labels", cxy[f=7.2] txy w[f=7.2] cht bxy[f=6.1]; end; end; do f=1 to nrow(labels); angle = labelx[f,3]; height = labelx[f,4]; if angle = 0 then scale = diag(sf[,1] || {1}); else scale = diag({1} || sf[,2]); lxya = labelx[f,1:2] * scale; labl = labels[f ]; call gscript( lxya[,1], lxya[,2], labl, angle, 0, height, font); end; *-- Draw the title; if length(title)>1 then do; height = max(htext+.1, 1.6); call gstrlen(len, title, height, font); if len > 110 then do; height = height # .1 # int(10 # 110 / len); call gstrlen(len, title, height, font); end; tx = (100 - len)/2; call gscript(tx, 104, title, 0, 0, height, font); end; run glegend(fcolors, fpattern); call gshow; finish; start fillpat(fcolors, fpattern) global(filltype, shade, colors); *-- Set colors and patterns for shading mosaic tiles; ns = ncol(shade); if ns<5 then dens=char({1 3 4 5}[1:ns]`,1,0); else dens={'1' '2' '3' '4' '5'}; fpattern = J(2,ns,'EMPTY '); if ncol(colors)=1 then colors=shape(colors,1,2); colors = substr(colors[,1:2] + repeat(' ',1,2),1,8); fcolors = repeat(colors[1],1,ns) // repeat(colors[2],1,ns); if ncol(filltype)=1 then /* if only one filltype */ do; if filltype='M0' then filltype={M0 M90}; else if filltype='M45' then filltype={M135 M45}; else if filltype='LR' then filltype=cshape(filltype,2,1,1); else filltype=shape(filltype,1,2); end; do dir=1 to 2; * for + and - ; ftype = filltype[dir]; * fill type for this direction; if substr(ftype,1,1)='M' then do; angle = substr(ftype,2); if angle=' ' then angle='0'; if ns=2 then fpattern[dir,] = {M1N M1X}+angle; else fpattern[dir,] = {M1N M3N M3X M4X M5X}[1:ns]`+angle; end; else if ftype='L' | ftype='R' then fpattern[dir,] = trim(ftype)+dens; else if substr(ftype,1,4)='GRAY' then do; if length(ftype)=4 then step=16; else step = num(substr(ftype,5)); dark = 256 - step#(1:ns)`; *-- convert darkness values to hex; fcolors = substr(fcolors,1,max(6,length(fcolors))); fcolors[dir,] = compress('GRAY'+hex(dark)`); fpattern[dir,] = repeat('SOLID',1,ns); end; else if substr(ftype,1,3)='HLS' then do; /* lightness steps for varying numbers of scale values */ /* rows define ramp functions for interpolation with 2-5 levels */ step={ 0 100 0 0 0, 0 40 100 0 0, 0 25 60 100 0, 0 20 45 70 100}[max(1,ns-1),]; /* make step=100 map to '80'x, step=0 map to 75% of the way to 'FF' */ step = 25 + 0.75#step; step = step[1:ns]#255/100; dark = 255-ceil(step/2); lval= hex(dark)`; clr = upcase(colors[dir]); hue = {RED GREEN BLUE MAGENTA CYAN YELLOW, '070' '100' '001' '03C' '12C' '0B4' }; col = loc(hue[1,]=clr); if ncol(col)=0 then col=dir; hval= hue[2,col]; fcolors[dir,] = compress('H'+hval+lval+'FF'); fpattern[dir,] = repeat('SOLID',1,ns); end; end; finish; start hex(num); *-- Convert 0255 then num = mod(num,256); h = rowcat(chars[1+floor(num/16),] || chars[1+mod(num,16),]); return(h); finish; start glegend( fcolors, fpattern ) global( shade, htext, colors, legend, font ); *-- Draw legend indicating color/shading for standardized residuals; if legend='NONE' then return; ns = ncol(shade); values = ( -shade[ns:1]` ) ||{-.01} ||.01 || shade; cval = trim(char(values,3,0)); nv = ncol(values); w = 8; * width of legend box; s = 3; * spacing between boxes; if ns>3 then do; w=7; s=2; end; label = {'Standardized' 'residuals:'}; call gset('font',font); call gset('height',htext); call gstrlen(len,label); if legend = 'H' then do; y = {-11 -16}; x = (100+s-(s+w)#nv) + w #(0:nv); call gscript(min(x)-len-3, y[1]-{2 5},label,0); end; else do; x = {107 112}; y = (100+s-(s+w)#nv) + w #(0:nv); call gscript(x[1]+{2 5}, min(y)-len-3,label,90); end; do i = 1 to nv; sign = 1 + (values[i]<0); line={1 3}[sign]; color = colors[sign]; den = sum( abs(values[i]) >= shade ); if den= 0 then pat = 'EMPTY'; * fill pattern ; else do; pat = fpattern[sign,den]; fcol = fcolors[sign,den]; end; if legend='H' then do; xx = x[i]+ ({0 0}|| w || w) + s#(i-1); yy = y[{1 2 2 1}]; end; else do; yy = y[i]+ ({0 0}|| w || w) + s#(i-1); xx = x[{1 2 2 1}]; end; call gpoly( xx, yy, line, color, pat, fcol); if i=1 then label = '<'+cval[i]; else if i=nv then label = '>'+cval[i]; else if i<=ns+1 then label=cval[i-1]+':'+cval[i]; else label=cval[i]+':'+cval[i+1]; label = compress(label); call gstrlen(len,label); if legend='H' then do; xx = xx[1]+max(0,((xx[3]-xx[1])-len)/2); yy = yy[1]+1; angle = 0; end; else do; yy = yy[1]+max(0,((yy[3]-yy[1])-len)/2); xx = xx[3]+3; angle = 90; end; call gscript(xx,yy,label,angle); *print xx yy label len; end; finish; start labels( levels, vnames, names, f, dir, boxes, labelx, labels ) global( htext, font, verbose, vlabels ); *-- generate positions of labels for this factor; k = levels[f]; which = dir[f]; factors = nrow(levels); loc = which + 1; /* 2=x, 1=y */ box = boxes[(1:k),]; str = shape(names[f,],k,1); line= sum(which=dir[1:f]); call gset('font', font); ht = htext - .1 * (line-1); call gstrlen( len, str, ht ); *-- Position of labels along the baseline, centering if possible; wid = box[,loc+2]; pos = box[,loc] /* center rt justify */ + choose( (len<=wid), (wid-len)/2, (wid-len) ); * end = pos + len; if any(verbose = 'BOX') & any(len/(wid+.01) > 1.5) then print 'Overfull label', str len wid pos; if pos[1]+len[1] > pos[2] then pos[1] = min(pos[1],box[1,loc])-1; do i=2 to nrow(len); /* check for overlap of labels */ if (pos[i] < pos[i-1]+len[i-1]) /* |(pos[i] < box[i,loc]) */ then do; if i= repeat(shade,nrow(d),1) )[,+]; *-- fill each box proportional to abs(fill); w = boxes[,3] # sf[1]; h = boxes[,4] # sf[2]; s = choose( fill=0, 1000, 4 / abs(fill) ); nxl = int((w-.5)/s); nyl = int((h-.5)/s); if any(verbose = 'BOX') then print 'Width, Height, Square', w h s fill nxl nyl; lines = {1 2}; do i = 1 to nrow(boxes); index = 1 + (round(dev[i],.001)<0); color = colors[index]; line = lines [index]; if nxl[i]>0 then do; from =(((boxes[i,1] + (1:nxl[i]) # s[i]))`|| shape(boxes[i,2], nxl[i], 1)) * diag(sf); to = (((boxes[i,1] + (1:nxl[i]) # s[i]))`|| shape(sum(boxes[i,{2 4}]), nxl[i], 1))* diag(sf); call gdrawl( from, to, line, color ); end; if nyl[i]>0 then do; from =(shape(boxes[i,1], nyl[i], 1) || ((boxes[i,2] + (1:nyl[i]) # s[i]))`) * diag(sf); to =(shape(sum(boxes[i,{1 3}]), nyl[i], 1) || ((boxes[i,2] + (1:nyl[i]) # s[i]))`) * diag(sf); call gdrawl( from, to, line, color ); end; end; finish; start reduce(config, f); * find loglin config including only factors 1:f ; con = config; do i=1 to ncol(con); term = con[,i]; if any(term > f) then term = remove(term, loc(term > f)); if ncol(term) = 0 then con[,i] = j(nrow(config), 1, 0); *-- next line would fail if term is now empty; else con[,i] = shape(term, nrow(config), 1, 0); end; *-- delete any all-zero rows and cols; r = con[+,]; con = con[,loc(r>0)]; r = con[,+]; con = con[loc(r>0),]; *-- rearrrange in order of increasing complexity; if ncol(con) > 1 then do; orders = (con ^=0)[+,]; r = rank(orders); noc = con; con[,r] = noc; *-- remove terms which are marginal to those later; keep = j(1,ncol(con)); do i = 1 to ncol(con)-1; ci = con[,i]; if any(ci=0) then ci = remove(ci,loc(ci=0)); do j = i+1 to ncol(con); cj = con[,j]; if any(cj=0) then cj = remove(cj,loc(cj=0)); if ncol(unique(ci,cj)) = ncol(unique(cj)) then keep[i]=0; end; end; con = con[,loc(keep)]; end; /* ncol(con)>1 */ *-- delete any all-zero rows; r = con[,+]; con = con[loc(r>0),]; return(con); finish; start chisq(obs, fit); *-- Find Pearson and likelihood ratio chisquares; gf = sum ( (obs - fit)##2 / ( fit + (fit=0) ) ); lr = 2 # sum ( obs # log ( (obs+(obs=0)) / (fit + (fit=0)) ) ); return (gf // lr); finish; start terms(dim, config); *-- transform ipf config into list of terms in loglinear model; * returns a matrix with ncol(dim) columns. Each row is the indices of one term in the model; nv = nrow(dim); * number of variables; nm = ncol(config); * number of margins in model; max= 2##nv - 1; do i = 1 to max; t = vars_in(i,nv); do j = 1 to nm; c = config[,j]`; *-- are all elements of t contained in this margin? ; if ncol(unique(t,c)) = ncol(unique(c)) then do; terms = terms // shape(t,1,nv,0); goto next; end; end; next: end; return (terms); finish; start vars_in(num,nv); *-- determine variables represented by a number from 1...2##nv-1, considered as a binary number; n = num; do i=1 to nv; if mod(n,2)=1 then r = r || i; n = int(n/2); end; return ( r ); finish; start df(dim,config); if all(config=1) then return(dim[1]-1); terms = terms(dim,config); nc = dim[#]; /* number of cells */ nt = nrow(terms); /* number of marginal terms */ np = 0; *-- find number of parameters fitted in each term; do i = 1 to nt; t = terms[i,]; t = t[,loc(t>0)]; np= np + (dim[t]-1)[#]; end; df = nc - np - 1; return(df); finish; start modname(config,names,model) global(abbrev); *-- Expand IPF config into symbol for loglinear model; free model; brackets = {'{' '}'}; vars = 0; do i = 1 to ncol(config); vars = unique(vars,config[,i]); end; if ncol(vars) > nrow(names) then brackets = {'[' ']'}; do i = 1 to ncol(config); effect = config[,i]; effect = effect[loc(effect>0)]; term = ''; do j = 1 to nrow(effect); if term ^= '' then term=trim(term)+ ','; if abbrev=0 then term = term + names[ effect[j,] ]; else term = term + substr(names[ effect[j,] ],1,abbrev); end; term = brackets[1] + trim(term) + brackets[2]; model= model || term; end; model = rowcatc(model); model = substr(model, 1, length(model)); finish; *-- Module for correspondence analysis for reordering table categories. At present, this analysis merely suggests an ordering, but does not actually reorder the table or the mosaic display; start corresp( margin, dev, rl, cl) global(order, ro, co); r = margin[,+]; c = margin[+,]; n = sum(margin); if (any(order='DEV')) then do; *-- use residuals from current model; d = shape(dev, nrow(margin), ncol(margin)); dpd = t(d)*d / n; end; else do; *-- fit joint independence model for current col variable; e = r * c / n; d = (margin - e) / sqrt(e); dpd = t(d)*d / n; end; call eigen(values, vectors, dpd); k = min(nrow(margin), ncol(margin))-1; * number of non-zero eigenvalues; values = values[1:k]; cancorr = sqrt(values); * singular values = Can R; chisq = n * values ; * contribution to chi-square; percent = 100* values / trace(dpd); cum = cusum(percent); print 'Singular values, and Chi-Square Decomposition',, cancorr [colname={'Singular Values'} format=9.4] chisq [colname={'Chi-Squares'} format=9.3] percent [colname={'Percent'} format=8.2] cum [colname={' Cum % '} format=8.2]; *-- Find Dim1 scores for row/col categories; L = values[1]; U = vectors[,1]; Y = diag(1/sqrt(C/N)) * U * diag(sqrt(L)); X = diag(N/R) * (margin / N) * Y * diag(sqrt(1/L)); d = dev; row = rl; col = cl[,1:ncol(margin)]; *-- sort rows and cols of dev by corresp dimension1; if (any(order='ROW')) then do; ro = rank(X); t = d; d[ ro, ] = t; t = row; row[ ro, ] = t; t = X; X[ ro ] = t; end; if (any(order='COL')) then do; co = rank(Y); t = d; d[, co] = t; t = col; col[ ,co ] = t; t = Y; Y[ co ] = t; Y = t(Y); perm = co`; perm[ , co] = 1:ncol(Y); if ncol(Y)>2 & cum[1]>.5 then print 'Suggested permutation of levels of this variable is', perm[c=col]; end; print 'Residuals, bordered by Row and Column Scores on CA Dimension 1', 'Reordered by' order; print d [r=row c=col format=7.2] X[c={'RowDim1'} f=7.2], Y[r={'ColDim1'} f=7.2]; finish; *-- Modules for data input; /* -------------------------------------------------------------------- Routine to read frequency and index/label variables from a SAS dataset and construct the appropriate levels, and lnames variables Input: dataset - name of SAS dataset (e.g., 'mydata' or 'lib.nydata') variable - name of variable containing frequencies vnames - character vector of names of index variables Output: dim (numeric levels vector) lnames (K x max(dim)) --------------------------------------------------------------------*/ start readtab(dataset, variable, vnames, table, dim, lnames); if type(vnames)^='C' then do; print 'VNAMES argument must be a character vector'; show vnames; return; end; if nrow(vnames)=1 then vnames=vnames`; call execute('use ', dataset, ';'); read all var variable into table; run readlab(dim, lnames, vnames); call execute('close ', dataset, ';'); reset noname; print 'Variable' variable 'read from dataset' dataset, 'Factors: ' (vnames`), 'Levels ordered: ' vnames lnames; reset name; finish; /* Read variable index labels from an open dataset, construct a dim vector and lnames matrix so that variables are ordered correctly for mosaics and ipf (first varying most rapidly). The data set is assumed to be sorted by all index variables. If the observations were sorted by A B C, the output will place C first, then B, then A. Input: vnames (character K-vector) */ start readlab( dim, lnames, vnames); free span lnames dim; nv = nrow(vnames); spc = ' '; do i=1 to nv; vi = vnames[i,]; read all var vi into cli; if type(cli) = 'N' then do; tmp = trim(left(char(cli,8))); tmp = substr(tmp,1,max(length(tmp))); cli = tmp; end; cli = trim(cli); span = span || loc(0=(cli[1,] = cli))[1]; d=design( cli ); dim = dim || ncol(d); free row1; *-- find position of each first distinct value; do j=1 to ncol(d); row1 = row1 || loc(d[,j]=1)[1]; end; * print vi cli d; * print row1; *-- sort elements in row1 so that var labels are in data order; order = rank(row1); tmp = row1; row1[,order]=tmp; li = t(cli[row1]); if i=1 then lnames = li; else do; if ncol(lnames) < ncol(row1) then lnames=lnames || repeat(spc, i-1, ncol(row1)-ncol(lnames)); if ncol(lnames) > ncol(row1) then li = li || repeat(spc, 1, ncol(lnames)-ncol(li)); lnames = lnames // li; end; end; * print span; *-- sort index variables by span so that last varies most slowly; order = rank(span); tmp = span; span[,order] = tmp; tmp = dim; dim[,order] = tmp; tmp = lnames; lnames[order,] = tmp; tmp = vnames; vnames[order,] = tmp; * print dim lnames vnames; finish; *-- backward compatibility [reorder -> transpos] (until the next release); start reorder(dim, table, vnames, lnames, order); run transpos(dim, table, vnames, lnames, order); finish; start transpos(dim, table, vnames, lnames, order); *-- Reorder the dimensions of an n-way table. Order is a permutation of the integers 1:ncol(dim), such that order[k]=i means that dimension k of the array table becomes dimension i of the result. Alternatively, order can be a character vector of the names of variables (vnames) in the new order. * Note: to restore a reordered table to its original form, use the anti-rank of the original order; *-- Use to rearrange the table prior to calling mosaics; if nrow(order) =1 then order=order`; if type(order)='C' then do k=1 to nrow(order); ord = ord // loc(upcase(order[k,]) = upcase(vnames)); end; else ord = order; *-- Dont bother if order = 1 2 3 ... ; if all( row(ord)=1:ncol(row(ord)) ) then return; if nrow(dim ) =1 then dim =dim`; if nrow(vnames)=1 then vnames=vnames`; call marg(loc,newtab,dim,table,ord); table = row(newtab); dim = dim[ord,]; vnames = vnames[ord,]; lnames = lnames[ord,]; finish; start row (x); *-- function to convert a matrix into a row vector; if (nrow(x) = 1) then return (x); if (ncol(x) = 1) then return (x`); n = nrow(x) * ncol(x); return (shape(x,1,n)); finish; /*--------------------------------------------------------------* | IML module to handle multiple output EPS files (or other | | device-dependent multiple plot circumstances). | | - This implementation requires a macro variable, DEVTYP to | | be set. Initialize the FIG variable to 1. *--------------------------------------------------------------*/ *global fig gsasfile devtyp; start gskip; call execute('_dev_ = upcase("&DEVTYP");'); call execute('_disp_ = "&DISPLAY";'); if upcase(trim(_disp_)) ^= 'OFF' then do; if (_dev_ = 'EPS') | (_dev_ = 'GIF') | (_dev_ = 'JPEG') then do; _dev_ = lowcase(_dev_); call execute('%let fig = %eval(&fig + 1);'); call execute('%let gsas = %scan(&gsasfile,1,.)&fig..', _dev_, ';'); call execute('%put NOTE: gsasfile now: &gsas;'); call execute('filename gsas&fig "&gsas";'); call execute('goptions gsfmode=replace gsfname=gsas&fig;'); end; /* else if (_dev_ = 'GIFANIM') then do; call execute('_fig_ = "&FIG";'); call execute('%let fig = %eval(&fig + 1);'); call execute('%let gsas = %scan(&gsasfile,1,.)', 'gif', ';'); call execute('%put NOTE: gsasfile still: &gsas;'); call execute('filename gsas "&gsas";'); call execute('goptions gsfmode=append gsfname=gsas;'); end; */ free _dev_ _disp_; end; finish; /* Dummy makemap module (replaced by weblet) */ start makemap( boxes, labelx, lnames, levels, dev, margin, fit, f) global(legend, verbose); dummy=1; finish; /* Translate a character matrix MAT to a numeric matrix of the indices of each element in a vector of NAMES. Returns a numeric matrix of the same shape as MAT */ start name2num(mat, names); new = j(nrow(mat), ncol(mat), 0); do i=1 to nrow(mat); do j=1 to ncol(mat); l = loc(trim(upcase(mat[i,j])) = upcase(names)); if type(l)^='U' then new[i,j] = l; end; end; return(new); finish; start adjres( dim, config, m, res, adj); *-- Calculate adjusted residuals for a loglin model; run cdesign(dim, config, x ); /* get design matrix */ D = diag(m); S = inv( t(X) * D * X); /* cov(beta) */ V = (sqrt(D) * x)`; *-- compute leverage, faster than vecdiag(V` * S * V); C = t(V)*S ; /* catcher matrix */ lev = (C#t(V))[,+]; /* leverage */ se = sqrt(1 - lev); /* std errors of resids */ adj = res / se; /* adjusted residuals */ * print m[f=6.3] lev[f=6.3] res[f=6.3] adj[f=6.3] se[f=6.3]; finish; start cdesign(dim, config, x); *-- Find (full rank) design matrix, X, from IPF configuration; if nrow(dim)=1 then dim = dim`; terms = terms(dim,config); nc = dim[#]; /* number of cells */ nt = nrow(terms); /* number of marginal terms */ free x; *-- construct cols of X matrix for each term; do i = 1 to nt; t = terms[i,]; t = t[,loc(t>0)]; /* find vars in this term */ xi= 1; do j=1 to nrow(dim); if any(j=t) then do; /* is variable j in the term?*/ xj = designf( (1:dim[j])` ) ; end; else xj = j(dim[j],1) ; xi = xj @ xi; end; x = x || xi; end; x = j( nrow(x), 1) || x; finish; start outstat( vnames, lnames, levels, dev, margin, fit, f, mod) global(outstat); if outstat='' then return; do i=1 to f; cl = lnames[i,]`; if i=1 then do; rl=cl; ml=cl[1:(levels[i])]; end; else do; /* construct row labels for prior factors */ ol = repeat(rl, 1, levels[i]); ml = repeat(ml, (levels[i]), 1); ol = shape(ol, levels[i]#nrow(rl), 1); nl = repeat( (cl[1:(levels[i])]), nrow(rl)); rl = trim(rowcatc(ol || shape(':',nrow(ol),1) ||nl)); ml = ml || nl; end; end; labels = rl+ ' '; residual = shape(dev, nrow(dev)#ncol(dev), 1); fitted = shape(fit, nrow(fit)#ncol(fit), 1); freq = shape(margin, nrow(margin)#ncol(margin), 1); factors = shape(f, nrow(fit)#ncol(fit), 1); model = shape(mod, nrow(fit)#ncol(fit), 1)+' '; vn = rowcat(row(vnames)+' '); vn = vn + ' factors labels residual fitted freq model'; *-- create all the factor variables; do i=1 to ncol(row(vnames)); if i<= ncol(ml) then call execute( vnames[i], '= ml[,i];'); else call execute( vnames[i], '= " ";'); end; *print 'Outstat', factors labels freq fitted residual; if f=1 then do; call execute('%let outstat=', outstat, ';'); /* call execute("create &outstat var{factors labels freq fitted residual model};"); */ call execute("create &outstat var{", vn, "};"); end; append; finish; start str2vec(string,dlm); *-- String to character vector; free out; i=1; sub = scan(string,i,dlm); do while(sub ^=' '); out = out || sub; i = i+1; sub = scan(string,i,dlm); end; return(out); finish;