## EC395 Applied Econometrics: Final Exam

Instructions:

Complete all questions in this final exam and submit your responses to Gradescope by 11 pm on December 15th.

There are two components to this final exam:

Multiple-choice questions.

Nine short-answer questions, listed below.

When submitting to Gradescope you must:

Properly indicate where the response to each question can be found in your submission. Failure to do so will lead to the loss of 3 marks.

Total: 63 marks

Multiple choice: 20 marks, each question is worth 2 marks

End of multiple choice

Beginning of short answer questions

Short answer: 43 marks

1. Consider the following regression:

$$\ln GDPpc_i = \beta_0 + \beta_1 Institutions_i + u_i$$

where the dependent variable is ln of GDP per capita, the explanatory variable is a measure of institutional quality (a higher value implies better quality institutions), and the subscript i represents countries. [7 marks]

a) Draw a scatterplot that demonstrates how this regression would be biased and explain how your scatterplot demonstrates the bias. For simplicity, assume that there are no other sources of bias when creating your scatterplot. Your scatterplot should be clearly labelled and easy to understand. [2 marks]

b) Suppose you estimate the above equation using OLS and the estimated value of β1 is 0.23. Interpret this coefficient. [1 mark]

c) Now, suppose you have a valid instrumental variable. Do you expect the TSLS estimate to be greater than, less than, or the same as the OLS estimate of β1? Explain your answer. [2 marks]

d) Explain whether or not panel data would be useful for addressing simultaneous causality bias in this context. [2 marks]

2. Consider the panel regression equation $Y_{it} = \beta X_{it} + \alpha_i + \lambda_t + u_{it}$. [7 marks]

a) Describe in words what $\alpha_i$ represents. [0.5 marks]

b) Describe in words what $\lambda_t$ represents. [0.5 marks]

c) Suppose there are only two time periods, i.e., T=2. Describe three ways to estimate this regression in Stata that will provide the same estimate of β. Please provide your code for each way. Assume the following variable names: y for the dependent variable, x for the explanatory variable, id for the entity, and time for the time period. [3 marks]

d) Continue to assume that T=2. Would you expect the $R^2$ to be the same across the three methods you described in 2(c)? Does this imply anything about which method is preferred? Explain your answer. [3 marks]

3. For this question you will use the dataset “Firms.dta”, which is posted under Content -> Final exam. The dataset contains firms that operated in Vietnam in 2001 and 2010. [8 marks]

a) Use a regression to test whether the probability of exit between 2001 and 2010 differs by employment (use ln employment, not the level of employment). A firm exits if it was in the dataset in 2001 but not in 2010. Please provide your Stata code and a copy of the regression output. [3 marks]

b) Use a regression to test whether the relationship between exit and ln employment varies by ownership. Please provide your Stata code and a copy of the regression output. [2 marks]

c) Interpret the coefficients estimated in (b) with regard to whether or not the relationship between exit and employment varies by ownership. [3 marks]

4. Compare the three sets of regression results below. Can you tell using the displayed regression output whether any of the three are internally valid? Explain your answer. [2 marks]

| | (1) | (2) | (3) |
|---|---|---|---|
| Coefficient | 5.23 | 7.89 | 10.01 |
| Standard error | (1.02) | (6.37) | (2.81) |
| $R^2$ | 0.23 | 0.12 | 0.47 |
| Number of observations | 1,262 | 153,457 | 312 |

5. In the following table, enter the entity-demeaned values of the variables. [2 marks]

| $i$ | $t$ | $Y_{it}$ | $X_{it}$ |
|---|---|---|---|
| 1 | 1 | 10 | 2 |
| 1 | 2 | 15 | 4 |
| 2 | 1 | 22 | 7 |
| 2 | 2 | 24 | 9 |

6. Consider a perfectly run randomized controlled trial. Can you estimate the effect of treatment on a specific individual? Explain your answer. [3 marks]

7. Suppose you are interested in estimating the causal effects of community service on earnings. You want to estimate the following regression:

$$\ln(earnings)_i = \beta_0 + \beta_1 CommSer_i + u_i$$

where $CommSer_i$ is an indicator variable that takes the value 1 if the individual does community service and 0 otherwise. [5 marks]

a) Consider each of the five threats to internal validity listed in section 9.2 of the textbook. Critically discuss each of those five threats in this context. [2.5 marks]

b) Suppose there is a lottery where the numbers 1 through 100 are randomly drawn for each individual in the population. Individuals with low numbers are told they will have to do community service unless they either make a charitable donation or pursue a graduate degree, while individuals with high numbers are told they will not have to do community service. Is this randomly assigned lottery number a valid instrument for $CommSer_i$? [2.5 marks]

8. Use the following table to answer the subsequent questions: [5 marks]

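| Individual | Group | Year | Y |
|---|---|---|---|
| 1 | 1 | 2015 | 10 |
| 2 | 1 | 2015 | 12 |
| 3 | 2 | 2015 | 8 |
| 4 | 2 | 2015 | 9 |
| 5 | 1 | 2020 | 20 |
| 6 | 1 | 2020 | 18 |
| 7 | 2 | 2020 | 12 |
| 8 | 2 | 2020 | 10 |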

a) Suppose group 1 is treated by a quasi-experiment by 2020, while group 2 is not treated. Calculate the difference-in-differences estimator. [2 marks]

b) Suppose you have the following additional data:

| Individual | Group | Year | Y |
|---|---|---|---|
| 1 | 1 | 2010 | 1 |
| 2 | 1 | 2010 | 5 |
| 3 | 2 | 2010 | 4 |
| 4 | 2 | 2010 | 8 |

Does this change your opinion on whether or not the treatment had an effect? Explain your answer. [3 marks]

9. This question is about the paper “The long-run impact of bombing Vietnam” by Edward Miguel and Gerard Roland. You can find a copy of the paper on MyLS under Content -> Final exam. [4 marks]

a) What evidence and/or discussion do the authors provide about the relevance of their instrumental variable? [1 mark]

b) What evidence and/or discussion do the authors provide about the exogeneity of their instrumental variable? [1 mark]

c) Compare columns (2) and (6) in Table 4. How does the coefficient on the variable of interest change across the two specifications? Is this what you would expect if the instrument were valid? [2 marks]

d) Replicate the regression in column (6) of Table 4. Note that I could not perfectly replicate their reported standard error. I don’t know why, but what I generated was extremely close with the same coefficients. Use the dataset “war_data_district.dta” available on MyLS under Content -> Final Exam. [1 mark]

End of short answer section.
