Thu 8 Nov, 2018 09:29 am
Put yourself in the shoes of an econometrician and imagine that someone else gave the data to you. Suppose you do not know the true data generating process from Equation (1), but you want to learn as much as possible about it. In particular, you are interested in understanding the relationship between y and x2, e.g. what is the effect of x2 on y?
At first, you ignore the omitted variable and regress y on x2 and x1. In search of the truth, you obtain two separate samples from the population of interest, with 5000 and 50 observations respectively, and run a simple OLS regression on each sample. In practice, first use the sample of 5000 observations you just created; then either re-generate the same variables with 50 observations or take a 1% sample, which leaves 50 observations (use sample).¹
Questions (Hint: Chapter 2 of "Econometric Analysis" by Wooldridge):
4. * Describe the estimates of the population parameters β̂. What do you notice? Describe and
5. * As the sample size increases, does the estimate of the effect of x2 on y get closer to the true parameter? Explain mathematically why or why not.
¹ You can present multiple regressions in a table by following each regression above with e.g. eststo ols5000 and eststo ols50. The command est table ols5000 ols50, se will then show both models next to each other.
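The workflow described above can be sketched in Stata roughly as follows. This is a minimal sketch, not a model solution: it assumes that y, x1 and x2 are already in memory with 5000 observations, that the user-written estout package (which provides eststo) is installed, and the seed value is an arbitrary choice.

```stata
* Assumption: y, x1 and x2 are in memory with 5000 observations.
* eststo comes from the estout package (ssc install estout).
regress y x2 x1
eststo ols5000

* Draw a 1% random sample (50 of the 5000 observations).
* preserve/restore brings the full sample back afterwards.
preserve
set seed 12345          // arbitrary seed, only for reproducibility
sample 1
regress y x2 x1
eststo ols50
restore

* Show both models side by side with standard errors
est table ols5000 ols50, se
```

Using preserve and restore is one possible design; re-generating the variables with 50 observations, as the text suggests, works equally well.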
In search of the truth, you realize that you face the problem of not observing U. You suspect that the omitted variable is correlated with x2, which will bias the coefficient on x2. You know that in this case you can do better by finding a valid instrument for x2. With this in mind, you obtain a sample of 5000 observations, with the same variables as before, and run an IV regression using z1 as an instrument. In practice, you can either use the same sample as before or re-generate everything again. Nothing changes for the purpose of the exercise.
6. Run the IV regression using ivregress 2sls.
7. * Explain the Durbin-Wu-Hausman test. What is the null hypothesis and the test statistic? What is it good for?
8. Run this test and interpret the results.
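As a rough guide to the commands involved in questions 6 and 8, the steps might look like the sketch below. It assumes that y, x1, x2 and z1 are in memory and that x1 enters as an exogenous control, as in the OLS regressions. The exercise does not name a command for the test; estat endogenous is used here as one option that Stata provides after ivregress.

```stata
* Assumption: y, x1, x2 and z1 are in memory; x1 is treated as exogenous.
* x2 is the endogenous regressor, instrumented by z1.
ivregress 2sls y x1 (x2 = z1)

* Tests of the null hypothesis that x2 is exogenous.
* After 2SLS with the default VCE, this reports the Durbin (score)
* chi-squared statistic and the Wu-Hausman F statistic.
estat endogenous
```

An alternative route, if preferred, is to store the IV and OLS estimates and compare them with the hausman command.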
3.3 The Other Instruments
So far we have ignored the existence of three further instruments that can help you deal with the omitted variable bias.
9. Run two additional IV regressions: one with z1, z2, z3 and one with all instruments. Compare your estimates from all three IV regressions.
10. * Since you now have more instruments than endogenous regressors, you can run the test of overidentifying restrictions. What is the null hypothesis and the test statistic? What is it good for?
11. Run this test using the command estat overid after ivregress 2sls for the two sets of instruments (z1...z3 and z1...z4), and interpret the results.
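The commands for questions 9 and 11 could be arranged along these lines. This is a sketch under assumptions: y, x1, x2 and z1 to z4 are in memory, x1 is treated as exogenous, the estout package is installed for eststo, and the stored model names (iv_z1, iv_z123, iv_z1234) are illustrative only.

```stata
* Assumption: y, x1, x2 and z1-z4 are in memory; model names are illustrative.

* Just-identified model: one instrument for one endogenous regressor
* (no overidentification test is possible here)
ivregress 2sls y x1 (x2 = z1)
eststo iv_z1

* Overidentified model with three instruments
ivregress 2sls y x1 (x2 = z1 z2 z3)
estat overid            // Sargan and Basmann tests with the default VCE
eststo iv_z123

* Overidentified model with all four instruments
ivregress 2sls y x1 (x2 = z1 z2 z3 z4)
estat overid
eststo iv_z1234

* Compare the three IV estimates side by side
est table iv_z1 iv_z123 iv_z1234, se
```

Running estat overid directly after each ivregress call keeps the test tied to the model that was just estimated.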