Michael Friendly

- generalizes readily to n-way tables
- can be used to display the deviations from various log-linear models.

In the *column proportion mosaic*, the width of
each box is proportional to the total frequency in each column of the
table. The height of each box is proportional to the cell frequency,
and the dotted line in each row shows the expected frequencies under
independence. Thus the deviations from independence,

The amount of empty space inside the mosaic plot may make it
harder to see patterns, especially when there are large deviations
from independence. In these cases, it is more useful to separate the
rectangles in each column by a small constant space, rather than
forcing them to align in each row. This is done in the
*condensed mosaic display*. Again, the area of each box is
proportional to the cell frequency, and complete independence is
shown when the tiles in each row all have the same height.

Figure 17: Condensed mosaic for Hair-color, Eye-color data. Each column is divided according to the conditional frequency of eye color given hair color. The area of each rectangle is proportional to observed frequency in that cell.
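The geometry described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to draw the figures: the function name `mosaic_tiles` is hypothetical, the small gaps between tiles are omitted, and the counts are the hair by eye color frequencies as commonly tabulated for this data set (they are not listed in this section, so treat them as an assumption).

```python
# Hair color (mosaic columns) by eye color; counts assumed, not taken from this section.
counts = {
    "Black": {"Brown": 68, "Blue": 20, "Hazel": 15, "Green": 5},
    "Brown": {"Brown": 119, "Blue": 84, "Hazel": 54, "Green": 29},
    "Red": {"Brown": 26, "Blue": 17, "Hazel": 14, "Green": 14},
    "Blond": {"Brown": 7, "Blue": 94, "Hazel": 10, "Green": 16},
}


def mosaic_tiles(counts):
    """Map (hair, eye) -> (x, y, width, height) inside the unit square."""
    total = sum(sum(col.values()) for col in counts.values())
    tiles = {}
    x = 0.0
    for hair, col in counts.items():
        col_total = sum(col.values())
        width = col_total / total      # column width ~ marginal frequency of hair color
        y = 0.0
        for eye, n in col.items():
            height = n / col_total     # tile height ~ conditional frequency of eye given hair
            tiles[(hair, eye)] = (x, y, width, height)
            y += height
        x += width
    return tiles


tiles = mosaic_tiles(counts)
for key, (x, y, w, h) in tiles.items():
    print(key, round(w * h, 4))        # tile area = cell count / grand total
```

Because width times height equals the cell count divided by the grand total, the area of every tile is proportional to the observed frequency, as the caption states.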

In Hartigan & Kleiner's (1981) original version, all the tiles are unshaded and drawn in one color, so only the relative sizes of the rectangles indicate deviations from independence. We can increase the visual impact of the mosaic by:

- using color and shading to reflect the size of the residual and
- reordering rows and columns to make the pattern more coherent.

Figure 18: Condensed mosaic, reordered and shaded. Deviations from independence are shown by color and shading. The two levels of shading density correspond to standardized deviations greater than 2 and 4 in absolute value. This form of the display generalizes readily to multi-way tables.
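The shading rule in the caption can be sketched as follows. This assumes that the standardized deviation is the Pearson residual, (observed - expected) / sqrt(expected), under independence. The function names and the hair by eye counts are assumptions introduced for illustration; the counts are not listed in this section.

```python
import math

hair = ["Black", "Brown", "Red", "Blond"]
eye = ["Brown", "Blue", "Hazel", "Green"]
# Rows = hair color, columns = eye color (assumed counts).
n = [
    [68, 20, 15, 5],
    [119, 84, 54, 29],
    [26, 17, 14, 14],
    [7, 94, 10, 16],
]


def pearson_residuals(table):
    """Residuals (n_ij - m_ij) / sqrt(m_ij), with m_ij fitted under independence."""
    total = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    resid = []
    for i, r in enumerate(table):
        out = []
        for j, obs in enumerate(r):
            m = row[i] * col[j] / total   # expected frequency under independence
            out.append((obs - m) / math.sqrt(m))
        resid.append(out)
    return resid


def shade_level(d):
    """0 = unshaded, 1 = |d| > 2, 2 = |d| > 4."""
    a = abs(d)
    return 2 if a > 4 else (1 if a > 2 else 0)


resid = pearson_residuals(n)
for h, r in zip(hair, resid):
    print(h, [(e, round(d, 2), shade_level(d)) for e, d in zip(eye, r)])
```

The sign of each residual would then select the color, and the level returned by `shade_level` the shading density.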

The condensed form of the mosaic plot generalizes readily to the display of multi-dimensional contingency tables. Imagine that each cell of the two-way table for hair and eye color is further classified by one or more additional variables--sex and level of education, for example. Then each rectangle can be subdivided horizontally to show the proportion of males and females in that cell, and each of those horizontal portions can be subdivided vertically to show the proportions of people at each educational level in the hair-eye-sex group.

**Complete independence**

The model of complete independence asserts that all joint probabilities are products of the one-way marginal probabilities:

$$\pi_{ijk} = \pi_{i++} \, \pi_{+j+} \, \pi_{++k} \tag{6}$$

for all *i, j, k* in a three-way table. This corresponds to the log-linear model *[A] [B] [C]*. Fitting this model puts all higher terms, and hence all association among the variables, into the residuals.

**Joint independence**

Another possibility is to fit the model in which variable *C* is jointly independent of variables *A* and *B*,

$$\pi_{ijk} = \pi_{ij+} \, \pi_{++k} \tag{7}$$

This corresponds to the log-linear model *[AB] [C]*. Residuals from this model show the extent to which variable *C* is related to the combinations of variables *A* and *B*, but they do not show any association between *A* and *B*.
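Both models have closed-form fitted values, so the residuals that drive the shading can be computed directly from the margins. The sketch below uses a hypothetical 2 x 2 x 2 table (the counts are invented for illustration and are not the hair, eye, sex data), and the variable names are assumptions.

```python
from itertools import product

# Hypothetical counts n[i][j][k] for variables A, B, C (illustration only).
n = [[[20, 10], [15, 25]],
     [[30, 5], [10, 35]]]
I, J, K = len(n), len(n[0]), len(n[0][0])
cells = list(product(range(I), range(J), range(K)))
N = sum(n[i][j][k] for i, j, k in cells)

# One-way margins and the two-way AB margin.
n_A = [sum(n[i][j][k] for j in range(J) for k in range(K)) for i in range(I)]
n_B = [sum(n[i][j][k] for i in range(I) for k in range(K)) for j in range(J)]
n_C = [sum(n[i][j][k] for i in range(I) for j in range(J)) for k in range(K)]
n_AB = [[sum(n[i][j]) for j in range(J)] for i in range(I)]

# Complete independence [A][B][C]: m_ijk = n_i++ * n_+j+ * n_++k / N^2
m_complete = {(i, j, k): n_A[i] * n_B[j] * n_C[k] / N**2 for i, j, k in cells}

# Joint independence [AB][C]: m_ijk = n_ij+ * n_++k / N
m_joint = {(i, j, k): n_AB[i][j] * n_C[k] / N for i, j, k in cells}


def residuals(m):
    """Pearson residuals (n - m) / sqrt(m) for a dict of fitted values."""
    return {(i, j, k): (n[i][j][k] - m[i, j, k]) / m[i, j, k] ** 0.5
            for i, j, k in cells}


print(residuals(m_complete))
print(residuals(m_joint))
```

The joint-independence fit reproduces the observed *A* by *B* margin exactly, which is why its residuals carry no information about the association between *A* and *B*.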

Figure 19: Mosaic display for hair color, eye color, and sex. The categories of sex are crossed with those of hair color, but only the first occurrence is labeled. Residuals from the model of joint independence, [HE] [S] are shown by shading.

Figure 20: Mosaic display for hair color,
eye color, and sex. This display shows residuals from the model of
complete independence, [H] [E] [S], *G*² = 179.79 on 24 df.

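For a three-way table, the hypothesis of complete independence, $H_{A \otimes B \otimes C}$, can be expressed as

$$H_{A \otimes B \otimes C} = H_{A \otimes B} \cap H_{AB \otimes C} \tag{8}$$

where $H_{A \otimes B}$ is the hypothesis that *A* and *B* are independent in the marginal table collapsed over *C*, and $H_{AB \otimes C}$ is the hypothesis that *C* is jointly independent of the *A*, *B* combinations. The corresponding likelihood-ratio statistics are additive:

$$G^2_{A \otimes B \otimes C} = G^2_{A \otimes B} + G^2_{AB \otimes C} \tag{9}$$

For example, for the hair-eye data, the mosaic displays for the [Hair] [Eye] marginal table and the [HairEye] [Sex] table can be viewed as representing the partition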

| Model | df | *G*² |
|---|---|---|
| [Hair] [Eye] | 9 | 146.44 |
| [Hair, Eye] [Sex] | 15 | 19.86 |
| [Hair] [Eye] [Sex] | 24 | 166.30 |

This partitioning scheme extends readily to higher-way tables.
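The additivity behind this partition can be checked algebraically. The derivation below is a sketch under stated assumptions: each model is fitted by maximum likelihood, so its fitted values are the closed-form products of observed margins, and the notation ($n_{ijk}$ for cell counts, $N$ for the grand total, a `+` subscript for summation over an index) is introduced here for illustration.

```latex
\begin{aligned}
G^2_{A \otimes B \otimes C}
  &= 2 \sum_{ijk} n_{ijk} \log \frac{n_{ijk}\, N^2}{n_{i++}\, n_{+j+}\, n_{++k}} \\
  &= 2 \sum_{ijk} n_{ijk} \log \frac{n_{ijk}\, N}{n_{ij+}\, n_{++k}}
   + 2 \sum_{ijk} n_{ijk} \log \frac{n_{ij+}\, N}{n_{i++}\, n_{+j+}} \\
  &= G^2_{AB \otimes C}
   + 2 \sum_{ij} n_{ij+} \log \frac{n_{ij+}\, N}{n_{i++}\, n_{+j+}} \\
  &= G^2_{AB \otimes C} + G^2_{A \otimes B}
\end{aligned}
```

The second line splits the logarithm of a product into two logarithms. The third line sums the second term over $k$, which leaves the likelihood-ratio statistic for independence in the *A* by *B* marginal table. The degrees of freedom add in the same way, as in 9 + 15 = 24 above.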
