H6 Assignment

1. (15 points)

Consider the regression model for question 1 in assignment 5.

A regression model for energy consumption in the U.S. is considered. The data for the 50 States and the District of Columbia are available from the Statistical Abstract of the United States. Go to http://www.census.gov/compendia/statab/ and click on “State Rankings” on the right side of the page to get the following data.
Energy: Energy consumption per capita

Income: Personal income per capita in current dollars

Fedaid: Federal aid to state and local governments per capita

House: One-unit detached housing units, percent of total (%)

Unemp: Unemployment rate (%)

White: Percentage of White Population (%) (Need to download it for this assignment)
a. Get the regression printout with Energy as the dependent variable and the other five variables as the independent variables.

b. Use the F statistic in the ANOVA table from the Excel printout in part a to conduct the F test with the 5% significance level.

c. Get the regression printout with Energy as the dependent variable, and Fedaid, House and Unemp as the independent variables.

d. Consider the model in part a as the unrestricted model and the model in part c as the restricted model. Conduct an F test and see if you can remove Income and White with the 5% significance level.

e. Use PHStat to get the prediction printout for the regression in part c. Use a 95% confidence interval for Energy with Income=40,000, House=60, and Fedaid = 2000.

f. Multiply the value of Energy by 100 and name the new variable Energy2. Get the regression printout with Energy2 as the dependent variable, and Fedaid, House and Unemp as the independent variables.

g. Multiply the value of House by 100 and name the new variable House2. Get the regression printout with Energy as the dependent variable, and Fedaid, House2 and Unemp as the independent variables.

h. Compare the printouts in parts c, f, and g. Explain the changes in the estimated coefficients, the standard errors, the t statistics for the estimated coefficients, R-squared and adjusted R-squared.

i. Use “Correlation” in the Data Analysis Tools to get the correlation matrix for all six variables, including Energy.

j. Use the above correlation matrix to check the multicollinearity problem for the unrestricted model in part a.

k. For the restricted model in part c, test the heteroscedasticity problem with the 5% significance level.

l. Explain the regression problems if you find heteroscedasticity.
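
Note on part d: the restricted-versus-unrestricted comparison is commonly done with the F statistic F = [(SSR_r - SSR_u) / q] / [SSR_u / (n - k - 1)], where SSR is the residual sum of squares from each ANOVA table. The Python sketch below only illustrates that arithmetic. The function name and the two SSR values are hypothetical placeholders, not results from the assignment data; n = 51, k = 5 and q = 2 follow from the problem statement (50 States plus the District of Columbia, five regressors in part a, two removed variables).

```python
def restricted_f_statistic(ssr_restricted, ssr_unrestricted,
                           num_restrictions, n_obs, k_unrestricted):
    """F statistic for testing whether a group of regressors can be dropped.

    ssr_restricted   : residual sum of squares of the restricted model (part c)
    ssr_unrestricted : residual sum of squares of the unrestricted model (part a)
    num_restrictions : number of variables removed (q)
    n_obs            : number of observations (n)
    k_unrestricted   : number of slope coefficients in the unrestricted model (k)
    """
    df_denominator = n_obs - k_unrestricted - 1
    numerator = (ssr_restricted - ssr_unrestricted) / num_restrictions
    denominator = ssr_unrestricted / df_denominator
    return numerator / denominator


# Hypothetical SSR values, chosen only to show the calculation.
example_f = restricted_f_statistic(
    ssr_restricted=120.0,    # placeholder, read the real value from the part c printout
    ssr_unrestricted=100.0,  # placeholder, read the real value from the part a printout
    num_restrictions=2,      # Income and White
    n_obs=51,                # 50 States plus the District of Columbia
    k_unrestricted=5,        # five independent variables in part a
)
print(round(example_f, 4))
```

The computed statistic would then be compared with the 5% critical value of the F distribution with (2, 45) degrees of freedom, for example from an F table or from Excel's F.INV.RT function.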

2. (15 points)

Income: Personal Income, https://research.stlouisfed.org/fred2/series/PI
Emp: All Employees: Total nonfarm, http://research.stlouisfed.org/fred2/series/PAYEMS?cid=32305
Use the data from January 1959 to May 2015 for this problem. Generate the growth for each series, . Name the growth of personal income as GIncome and name the growth of employment as GEmp.
a. Get the regression results with GIncome as the dependent variable and GEmp as the independent variable.

b. Write the regression model and the estimated equation for part a.

c. Explain the meaning of the estimated coefficients of the intercept and the slope for part a.

d. What sign do you expect for the coefficient of the slope for part a? Conduct a one-tailed test on the slope with the 5% significance level.

e. Use PHStat to get the DW test for the regression in part a and test the serial correlation problem with the 5% significance level.

f. Explain the regression problems if you find serial correlation.

g. Use GIncome as the dependent variable and GEmp as the independent variable. Get the regression printouts for the following three models: (1) the distributed lags (DL) model, (2) autoregressive distributed lags (ADL) model, and (3) vector autoregressive (VAR) model. Use 4 lags for each model.

h. For each of the three regression printouts in part g, identify significant coefficients with the two-tailed test and the 5% significance level. Use only one sentence to explain why you pick these variables. Don’t write the four-step test procedure.

i. Draw a line graph of the personal income and a line graph of the growth of personal income.

j. Get the regression printout for the linear trend model with the personal income, not the growth of personal income, as the dependent variable.

k. Predict the values of personal income for June and July 2015 based on the above linear trend model.

l. Get the regression printout for the exponential trend model with the personal income, not the growth of personal income, as the dependent variable.

m. Predict the values of personal income for June and July 2015 based on the above exponential trend model. Note that you need to use the “=exp()” function in Excel to convert the predicted values from the estimated equation into predictions of the level of personal income.
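
Note on parts l and m: an exponential trend is usually estimated by regressing the natural log of the series on a time index and then applying the exponential function to the fitted value. The Python sketch below illustrates that conversion under stated assumptions: the time index starts at 1, the function names are hypothetical, and the toy series is made up for illustration (it is not FRED data).

```python
import math


def fit_linear_trend(values):
    """OLS fit of values[t-1] = b0 + b1 * t for t = 1..n; returns (b0, b1)."""
    n = len(values)
    periods = list(range(1, n + 1))
    t_mean = sum(periods) / n
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(periods, values))
    sxx = sum((t - t_mean) ** 2 for t in periods)
    b1 = sxy / sxx
    b0 = v_mean - b1 * t_mean
    return b0, b1


def exponential_trend_forecast(series, period):
    """Fit ln(series) on the time index, then convert the prediction to a level."""
    log_series = [math.log(v) for v in series]
    b0, b1 = fit_linear_trend(log_series)
    predicted_log = b0 + b1 * period
    return math.exp(predicted_log)  # same role as =exp() in Excel


# Made-up series growing 2% per period, used only to check the arithmetic.
toy_series = [100 * 1.02 ** t for t in range(1, 6)]
forecast = exponential_trend_forecast(toy_series, 6)
print(forecast)
```

If the monthly time index is set to 1 for January 1959, then May 2015 is period 677, so June and July 2015 would correspond to periods 678 and 679.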
