(1) In this exercise, you will conduct a Monte Carlo experiment to study the so-called spurious regression. In a Monte C
Posted: Fri Apr 29, 2022 8:01 am
(1) In this exercise, you will conduct a Monte Carlo experiment to study the so-called spurious regression. In a Monte Carlo study, artificial data are generated using a computer algorithm, and then those artificial data are used to calculate the statistics being studied. This makes it possible to compute the distribution of statistics for known models when mathematical expressions for those distributions are complicated (as they are here) or even unknown. In this exercise, you will generate data so that two series, {Y_t} and {X_t}, are independently distributed random walks. The specific steps of the algorithm are as follows.
(i) Use your computer to generate a sequence of T = 100 i.i.d. standard normal random variables. Call these variables v_1, v_2, ..., v_100. Set Y_1 = 0.55 + v_1 and Y_t = 0.55 + Y_{t-1} + v_t for t = 2, 3, ..., 100.
(ii) Use your computer to generate a new sequence, e_1, e_2, ..., e_100, of T = 100 i.i.d. standard normal random variables. Set X_1 = 0.85 + e_1 and X_t = 0.85 + X_{t-1} + e_t for t = 2, 3, ..., 100.
(iii) Regress Y on a constant and X. Compute the OLS estimator, the regression R^2, and the (homoskedasticity-only) t-statistic testing the null hypothesis that β_1 (the coefficient on X) is 0.
Use this algorithm to answer the following questions.
(a) Run the algorithm (i) through (iii) once. Use the t-statistic from (iii) to test the null hypothesis that β_1 = 0, using the usual 5% critical value of 1.96. What is the R^2 of your regression?
(b) Repeat (a) 1000 times, saving each value of R^2 and the t-statistic. Construct a histogram of the R^2 and the t-statistic. What are the 5th, 50th, and 95th percentiles of the distributions of the R^2 and the t-statistic? In what fraction of your 1000 simulated data sets does the t-statistic exceed 1.96 in absolute value?
(c) Repeat (b) for different numbers of observations, such as T = 250 and T = 500.
As the sample size increases, does the fraction of times that you reject the null hypothesis approach 5%, as it should because you have generated Y and X to be independently distributed? Does this fraction seem to approach some other limit as T gets large? What is that limit?
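Steps (i) through (iii) and the repetitions in (b) and (c) could be set up along the following lines. This is only a rough sketch in Python with NumPy, not the textbook's own code: the function names simulate_once and monte_carlo, the seed, and the choice to compute the OLS formulas by hand are all assumptions made for illustration. It prints the percentiles and the rejection fraction but does not draw the histograms.

```python
import numpy as np

def simulate_once(T, rng, drift_y=0.55, drift_x=0.85):
    """One pass through steps (i)-(iii); returns (beta1_hat, R^2, t-statistic)."""
    v = rng.standard_normal(T)        # step (i): i.i.d. N(0, 1) shocks for Y
    e = rng.standard_normal(T)        # step (ii): independent shocks for X
    y = np.cumsum(drift_y + v)        # Y_1 = 0.55 + v_1, Y_t = 0.55 + Y_{t-1} + v_t
    x = np.cumsum(drift_x + e)        # X_1 = 0.85 + e_1, X_t = 0.85 + X_{t-1} + e_t

    # Step (iii): OLS of Y on a constant and X, written out by hand.
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    sxx = x_dev @ x_dev
    beta1 = (x_dev @ y_dev) / sxx
    beta0 = y.mean() - beta1 * x.mean()
    resid = y - beta0 - beta1 * x
    ssr = resid @ resid
    r2 = 1.0 - ssr / (y_dev @ y_dev)
    # Homoskedasticity-only standard error: s^2 / sum((X_t - Xbar)^2), s^2 = SSR / (T - 2).
    se_beta1 = np.sqrt(ssr / (T - 2) / sxx)
    return beta1, r2, beta1 / se_beta1

def monte_carlo(T, n_reps=1000, seed=0):
    """Parts (b)/(c): repeat the experiment and summarize R^2 and the t-statistic."""
    rng = np.random.default_rng(seed)  # the seed value is an arbitrary assumption
    draws = np.array([simulate_once(T, rng) for _ in range(n_reps)])
    r2, t = draws[:, 1], draws[:, 2]
    return {
        "r2_pct": np.percentile(r2, [5, 50, 95]),
        "t_pct": np.percentile(t, [5, 50, 95]),
        "reject_frac": np.mean(np.abs(t) > 1.96),
    }

if __name__ == "__main__":
    for T in (100, 250, 500):
        print(T, monte_carlo(T))
```

For the histograms in (b), the saved R^2 and t-statistic values could be passed to any plotting routine. Writing the regression formulas out explicitly keeps the homoskedasticity-only standard error visible; a regression package would do the same job.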