A Smart Guide to Dummy Variables: Four Applications and A Macro

Dummy variables are variables that take only the values 0 or 1. They may be explanatory or outcome variables; however, this article focuses on the construction and use of dummy variables as explanatory (independent) variables. Typically, dummy variables are used in the following applications: time series analysis with seasonality or regime switching; analysis of qualitative data, such as survey responses; categorical representation; and representation of value levels. Target domains include economic forecasting, biomedical research, credit scoring, response modeling, and other fields. Dummy variables may serve as inputs to traditional regression methods or to newer modeling paradigms, such as genetic algorithms, neural networks, or Boolean network models.

Coding techniques include "1-of-N" and "thermometer" encoding. The statistical properties of dummy variables in each of the traditional usage and application contexts are discussed, and a more detailed introduction to a Boolean network model is presented. Because converting categorical data to dummy variables often requires time-consuming and tedious recoding, a SAS macro is offered to facilitate the creation of dummy variables and improve productivity.
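
As a rough illustration of the two coding techniques named above, the following Python sketch encodes a single value both ways. It is not the SAS macro the article offers. The function names `one_of_n` and `thermometer` and the sample categories are hypothetical, and the thermometer convention used here (N-1 dummies for N ordered levels, with the lowest level coded as all zeros) is one common choice that the article itself may define differently.

```python
def one_of_n(value, categories):
    """1-of-N coding: one dummy per category, exactly one of them set to 1."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def thermometer(value, ordered_levels):
    """Thermometer coding (assumed convention): N-1 dummies for N ordered levels.

    Dummy j (j = 1..N-1) is 1 when the value is at or above level j,
    so the lowest level is all zeros and the highest is all ones.
    """
    rank = ordered_levels.index(value)  # raises ValueError if value is unknown
    return [1 if rank >= j else 0 for j in range(1, len(ordered_levels))]


# Hypothetical sample data, for illustration only.
colors = ["red", "green", "blue"]     # unordered categories
sizes = ["low", "medium", "high"]     # ordered value levels

print(one_of_n("green", colors))      # [0, 1, 0]
print(thermometer("low", sizes))      # [0, 0]
print(thermometer("medium", sizes))   # [1, 0]
print(thermometer("high", sizes))     # [1, 1]
```

In this sketch, 1-of-N coding treats categories as unrelated, while thermometer coding preserves the ordering of levels: each additional 1 marks one more threshold reached.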
