Dummy Variable Use and Interpretation

That's right. Each month's dummy variable is like an "on" switch, capturing returns for that month. Since every return must fall into one of 12 months, she can't use an "on" switch for each one, or there would be nothing left for when all switches are "off."
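A minimal sketch of why twelve switches are one too many, using made-up month labels rather than the lesson's dataset (the names `dummy_row`, `sums_with_12` and `sums_with_11` are purely illustrative). With a dummy for every month, the dummies add up to 1 in every row, which duplicates the intercept column.

```python
# Hypothetical month labels for two full years: 1 = Jan ... 12 = Dec.
months = [m for _ in range(2) for m in range(1, 13)]

def dummy_row(month, included_months):
    """Return the 0/1 switches for one observation."""
    return [1 if month == m else 0 for m in included_months]

# Twelve dummies: exactly one switch is "on" in every row.
sums_with_12 = [sum(dummy_row(m, range(1, 13))) for m in months]

# Eleven dummies (December dropped): December rows have every switch "off".
sums_with_11 = [sum(dummy_row(m, range(1, 12))) for m in months]

print(set(sums_with_12))  # {1}: identical to the intercept column
print(set(sums_with_11))  # {0, 1}: December is left for the intercept
```

Because the twelve-dummy row sums always equal the intercept's constant 1, the regression could not separate the intercept from the month effects. Dropping one month removes that redundancy.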

Suppose a decent dataset of annualized monthly stock returns over the course of a few years is available for a set of firms, with a total of 1,930 observations. Some of the available characteristics of the firms include firm size in billions of US dollars, the P/E ratio, the industry classification, and whether a relatively new CEO is in charge.

Here is what she might have come up with:

| Variable | Coefficient | Standard Error | _t_-statistic | _p_-value |
|---|---|---|---|---|
| Intercept | 0.0353 | 0.0013 | 26.80 | < 0.01 |
| Jan | 0.0022 | 0.0018 | 1.23 | 0.22 |
| Feb | 0.0006 | 0.0018 | 0.33 | 0.74 |
| Mar | 0.0012 | 0.0018 | 0.65 | 0.51 |
| Apr | 0.0000 | 0.0019 | 0.02 | 0.98 |
| May | 0.0026 | 0.0019 | 1.42 | 0.16 |
| Jun | 0.0022 | 0.0018 | 1.18 | 0.24 |
| Jul | 0.0011 | 0.0019 | 0.57 | 0.57 |
| Aug | 0.0012 | 0.0019 | 0.62 | 0.53 |
| Sept | 0.0022 | 0.0018 | 1.19 | 0.23 |
| Oct | 0.0014 | 0.0018 | 0.77 | 0.44 |
| Nov | 0.0002 | 0.0018 | 0.13 | 0.89 |

Now think carefully about her setup of dummy variables here, including the assumptions of dummy variables being zero. What does this tell you about the annualized returns observed in December?

Not so. Actually, the coefficients on the dummy variables for all of the other months are positive (April's rounds to 0.0000); December saw the lowest average returns in this dataset.

Well, no. It's not a relative measure. Note that this is an intercept, not a slope coefficient.

You got it. Since all other months were assigned dummy variables, December is the default. With all dummy variables equal to zero, you have just the December returns with no regressor effects, so the intercept is the mean December return. The January coefficient of 0.0022 then tells you that January returns averaged 0.22 percentage points higher than December returns. But as is often the case with dummy variables, none of these coefficients is significant: every monthly _p_-value is well above 0.05. There's no statistically significant "month" effect based on these observations. But it was fun to try anyway.
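To see the baseline interpretation at work, here is a small sketch on simulated returns. The data are invented and are not the lesson's 1,930 observations, and NumPy is assumed to be available. The regression uses an intercept plus dummies for January through November, then compares the fitted coefficients with plain group means.

```python
import numpy as np

rng = np.random.default_rng(0)                 # made-up data for illustration only
n_years = 10
months = np.tile(np.arange(1, 13), n_years)    # 1 = Jan ... 12 = Dec
returns = 0.035 + rng.normal(0.0, 0.02, size=months.size)

# Design matrix: intercept plus dummies for Jan..Nov; December is the omitted baseline.
columns = [np.ones(months.size)]
columns += [(months == m).astype(float) for m in range(1, 12)]
X = np.column_stack(columns)

# Ordinary least squares fit.
coefs, *_ = np.linalg.lstsq(X, returns, rcond=None)

dec_mean = returns[months == 12].mean()
jan_mean = returns[months == 1].mean()

# The intercept matches the December mean, and the January coefficient
# matches the January mean minus the December mean.
print(np.isclose(coefs[0], dec_mean))
print(np.isclose(coefs[1], jan_mean - dec_mean))
```

Both comparisons hold for any dataset in which the only regressors are dummies for mutually exclusive groups, which is why the intercept can be read directly as the omitted month's average.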


No, this decision isn't about ignoring a month.

For fun (because regressions are always fun), you decide to test how much these characteristics affect returns. For industry classification, the most interesting categories given your firm's investment focus are financial firms and industrials. So you set out to test the following model:

$$\displaystyle Ret_t = b_0 + b_1\mbox{Size}_t + b_2\mbox{PE}_t + b_3\mbox{Financial}_t + b_4\mbox{Industrial}_t + b_5\mbox{New CEO}_t + \epsilon_t $$

How many dummy variables will you use?

No. There are five regressors here, but not all of them are dummy variables.

Correct! Each firm is either financial or not, industrial or not, and headed by a new CEO or not. Each firm's size and P/E ratio are quantitative variables, and that variation will be used in the regression. Also, you may be remembering the rule of "_n_ situations, _n_ - 1 dummies," but that's for things like the four quarters of the year, where you would have three dummy variables for three quarters, with no dummy variable for the fourth quarter. Here, these three variables are not three mutually exclusive situations. A firm can have a classification other than financial or industrial, and the firm classification may or may not coincide with a new CEO. So all three dummies can stay in the model without redundancy. Adding a dummy variable for "neither financial nor industrial" or "not a new CEO," however, would create exactly the redundancy that the _n_ - 1 rule is meant to prevent.
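As a rough illustration of the count, the sketch below flags a regressor as a dummy when every observed value is 0 or 1. The three firm records are invented, and the helper `is_dummy` is a hypothetical heuristic, not a formal test: a quantitative variable that only ever took the values 0 and 1 would fool it.

```python
# Hypothetical firm records; the values are invented for illustration.
firms = [
    {"Size": 6.0, "PE": 10.0, "Financial": 0, "Industrial": 0, "New CEO": 0},
    {"Size": 42.5, "PE": 18.2, "Financial": 1, "Industrial": 0, "New CEO": 1},
    {"Size": 1.3, "PE": 25.7, "Financial": 0, "Industrial": 1, "New CEO": 0},
]

def is_dummy(name):
    """Heuristic: a regressor is a dummy if it only ever takes the values 0 or 1."""
    return all(row[name] in (0, 1) for row in firms)

dummy_names = [name for name in firms[0] if is_dummy(name)]
print(dummy_names)  # ['Financial', 'Industrial', 'New CEO']
```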

Not quite. There are more dummy variables than this. Consider all variables that are binary in this case.

So the regression is run, and here's what you get:

| Variable | Coefficient | Standard Error | _t_-statistic | _p_-value |
|---|---|---|---|---|
| Intercept | 0.04316 | 0.001508 | 28.623 | 2.1E-150 |
| Size | -0.000027 | 0.000015 | -1.775 | 0.07599 |
| PE | -0.000293 | 0.000061 | -2.107 | 0.03525 |
| Financial | -0.005623 | 0.000892 | -3.301 | 0.00098 |
| Industrial | 0.004884 | 0.000835 | 2.849 | 0.00443 |
| New CEO | -0.001496 | 0.000797 | -1.876 | 0.06075 |

How do you interpret the intercept coefficient?

This isn't an average. The average return was in fact 3.7%.

No. This isn't an assumption of mean values for the regressors. Note that you have three dummy variables, and their mean values are largely meaningless decimals.

Exactly! For a non-financial and non-industrial firm with a size of zero, a zero P/E ratio, and a seasoned CEO, the model predicts an annualized monthly stock return of 4.3%. This prediction is outside the range of estimation, so it's not likely to be used. Instead, perhaps a USD 6 billion services firm with a P/E ratio of 10 under an experienced boss is predicted to return $$\displaystyle 0.04316 - 0.000027(6) - 0.000293(10) = 0.0401 = 4.01 \% $$.
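The prediction arithmetic can be reproduced with a short sketch. The coefficients are copied from the regression output; the function name `predict_return` and its argument names are assumptions made for this example.

```python
# Coefficients copied from the regression output in the lesson.
COEFS = {
    "Intercept": 0.04316,
    "Size": -0.000027,
    "PE": -0.000293,
    "Financial": -0.005623,
    "Industrial": 0.004884,
    "New CEO": -0.001496,
}

def predict_return(size_bn, pe, financial=0, industrial=0, new_ceo=0):
    """Predicted annualized return; dummy arguments are 0 or 1."""
    return (
        COEFS["Intercept"]
        + COEFS["Size"] * size_bn
        + COEFS["PE"] * pe
        + COEFS["Financial"] * financial
        + COEFS["Industrial"] * industrial
        + COEFS["New CEO"] * new_ceo
    )

# All regressors at zero: the prediction is just the intercept.
baseline = predict_return(size_bn=0, pe=0)

# USD 6 billion services firm, P/E of 10, experienced CEO.
services_firm = predict_return(size_bn=6, pe=10)

print(f"{baseline:.4f}")       # 0.0432
print(f"{services_firm:.4f}")  # 0.0401
```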

From there, how do you interpret the coefficient of -0.005623 on the "Financial" variable?

That's incorrect. This coefficient's meaning is not dependent upon the other two dummy variable values.

Yes! Starting from whatever other values exist for a set of firms, a value of 1 rather than 0 for the financial dummy variable is associated with a predicted return that is lower by 0.56 percentage points, or roughly half a percentage point.
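One way to see why the other regressors do not matter is to subtract the model's predictions for two firms that are identical except for the financial dummy. This is a sketch in the regression's own notation, with hats marking predicted values:

$$\displaystyle \widehat{Ret}_{\mbox{Financial}=1} - \widehat{Ret}_{\mbox{Financial}=0} = \left(b_0 + b_1\mbox{Size} + b_2\mbox{PE} + b_3(1) + b_4\mbox{Industrial} + b_5\mbox{New CEO}\right) - \left(b_0 + b_1\mbox{Size} + b_2\mbox{PE} + b_3(0) + b_4\mbox{Industrial} + b_5\mbox{New CEO}\right) = b_3 = -0.005623 $$

Every term other than $$b_3$$ cancels, so the predicted gap is the same whatever the firm's size, P/E ratio, industrial status, or CEO tenure.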

Not quite. Consider that this is really a slope value. It shows an expected difference in relative return.

With these results in mind, turn to the results of another regression that your colleague ran using this same dataset. She wanted to test for any month effect in the data, and so she used a model that looked like this:

$$\displaystyle Ret_t = b_0 + b_1\mbox{Jan}_t + b_2\mbox{Feb}_t + \dots + b_{11}\mbox{Nov}_t + \epsilon_t $$

Why do you think she excluded December from the list of variables?

Actually, there could be.

They averaged 3.53%

They were 3.53 percentage points lower than those in other months

They were 3.53 percentage points higher than those in other months

The average return in the dataset is 4.3%

The model predicts a 4.3% return given mean values of regressors

The model predicts a 4.3% return if all regressors are assumed to have a value of zero

Financial firms are expected to have an annualized return of about -0.5%.

Financial firms' annualized returns are expected to be about 0.5 percentage points less than similar non-financial firms.

Financial firms' annualized returns are expected to be about 0.5 percentage points less than others, assuming that the other firms are non-industrial and don't have a new CEO

There can't be a December effect

December is measured by other dummies being zero

One month must be ignored when using dummy variables
