Data screening is the condom of data analysis: an important but frequently overlooked step. While data analysis often focuses on summarization, model fitting, and numbers, data screening emphasizes exposure, preparation for modeling, checking the adequacy of assumptions, and graphical display. Come learn about Safe Stats!

This workshop covers a variety of practical aspects of data screening, including:

- Entering and checking raw data
- Assessing univariate problems (distribution shape, outliers)
- Assessing bivariate problems (linearity, regression diagnostics)
- Assessing multivariate problems (multivariate normality, detecting multivariate outliers)
- Dealing with missing data

Examples are presented using SAS software, along with a collection of general-purpose SAS macros for applying some of these techniques to any data set.

- Part 1: Getting started
  [2-up slides; 4-up slides (pdf)]
  - Entering and checking raw data
    - Data entry
    - Creating a documented database
    - Checking data at input
  - Assessing univariate problems
    - Boxplots and outliers
    - Transformations to symmetry
    - Normal probability plots
- Part 2: Assessing bivariate problems
  [2-up slides (pdf); 4-up slides (pdf)]
  - Enhanced scatterplots
  - Smoothing relations
  - Plotting discrete data
  - Transformations to linearity
  - Dealing with non-constant variance
- Part 3: Multivariate problems and missing data
  [2-up slides (pdf); 4-up slides (pdf)]
  - Assessing multivariate problems
    - Multivariate normality
    - Multivariate outliers
  - Dealing with missing data
    - Estimation with missing data (EM algorithms)
    - Simple imputation
    - Multiple imputation
- SAS macro programs: