*** Linear panel data models: static and dynamic framework

* The original data are from Ziliak (1997), "Efficient Estimation With Panel
* Data when Instruments are Predetermined: An Empirical Comparison of Moment-Condition Estimators",
* Journal of Business and Economic Statistics, 15, 419-431
* See also Cameron and Trivedi (2005)

* The regression measures the effect of the wage on hours worked:
* ln(hours[i,t]) = beta0 + beta1 ln(wage[i,t]) + u[i] + e[i,t]

** Load dataset
clear
use "/home/federico/work/etrix_allievi/2015-16/panel/panelhours.dta"

* basic description of the data
describe
summarize

* re-order variables for easier viewing
order id year lnhr lnwg
sort id year

* description of a panel dataset
* xtdes shows the structure of the panel (balanced/unbalanced)
* At this point it returns an error, because xt-commands require the unit and time variables to be declared first
* ("capture noisily" displays the error without stopping the do-file)
capture noisily xtdes

* Declaring the data with "tsset" is necessary to use the built-in panel commands
* (NB: it also activates the lag, lead and difference operators)
tsset id year
xtdes

***** OLS regression with the robust + cluster options to get cluster-robust standard errors
reg lnhr lnwg, robust cluster(id)

*** However, unobserved heterogeneity can lead to biased estimates
*** Let us exploit the panel dimension of the data

******** Random effects estimator (FGLS)
* Assumptions:
* (1) strict exogeneity: x[i,t] and e[i,s] uncorrelated for any t and s
* (2) orthogonality condition: x[i,t] uncorrelated with u[i]
xtreg lnhr lnwg, re
* store estimates for the Hausman test below
est store estRE

* Breusch-Pagan LM test for the "existence" of individual effects (compares FGLS and OLS)
xttest0

* Random effects under normality of the error terms: MLE
* Assumptions:
* (1) & (2)
* plus (3) e[i,t] and u[i] normally distributed
xtreg lnhr lnwg, mle

******* Fixed effects WITHIN estimator
* Assumptions:
* only (1)
xtreg lnhr lnwg, fe
est store estFE

* Equivalently, build the within-group transformation by hand, then run OLS on the transformed variables
* (same slope as xtreg, fe; the OLS standard errors are not adjusted for the estimated group means):
by id: egen lnhr_mean = mean(lnhr)
by id: egen lnwg_mean = mean(lnwg)
gen ytransf = lnhr - lnhr_mean
gen xtransf = lnwg - lnwg_mean
reg ytransf xtransf

* Hausman test (RE vs. FE)
hausman estFE estRE

******* Equivalent to Within-FE: add individual-specific dummies --> LSDV
* generate id-dummies and then OLS
quietly tab id, gen(dummyid)
reg lnhr lnwg dummyid*, robust cluster(id)
* alternatively, use areg with id in the absorb option
areg lnhr lnwg, abs(id) robust cluster(id)
est store estLSDV

******** Fixed effects: first differences
* Assumption: milder than strict exogeneity: x[i,t] and e[i,s] uncorrelated if s = t-1, t, t+1
* Exploit the difference operator (D.) in Stata
reg D.lnhr D.lnwg
* By construction, the differenced error term is autocorrelated, so use cluster-robust standard errors
reg D.lnhr D.lnwg, robust cluster(id)

********* Correlated Random Effects
* recall that above we generated lnwg_mean, the individual-specific mean of lnwg
* use that variable as an additional regressor
xtreg lnhr lnwg lnwg_mean, re
* The Hausman test here is a test of the significance of the coefficient on the added mean regressor!
test lnwg_mean
* The Hausman test can be made robust to heteroskedasticity or autocorrelation
xtreg lnhr lnwg lnwg_mean, re robust cluster(id)
test lnwg_mean

* We didn't account for economy-wide variations (inflation, business cycle, etc.)
* This is easily done by including time dummies in the regression
* generate time dummies
tab year, gen(dyear)
xtreg lnhr lnwg dyear*, fe

******** DYNAMIC MODELING
* Focus on the pure dynamic model
* lnhr[i,t] = beta0 + beta1 lnhr[i,t-1] + tau[t] + u[i] + e[i,t]
gen lnhr_1 = L.lnhr
* OLS (it is upward biased)
reg lnhr lnhr_1 dyear*, robust cluster(id)
* FE-WG (it is downward biased)
xtreg lnhr lnhr_1 dyear*, fe

****** GENERAL GMM-PANEL: with lagged dependent and endogenous Xs ******
* An unofficial (user-written) command for GMM estimation: xtabond2
* If it is not installed yet: ssc install xtabond2
* See Roodman (2006): How to do xtabond2 (http://ideas.repec.org/p/cgd/wpaper/103.html)
help xtabond2

* Variables in gmmstyle() enter the instrument set with lag 1 and all deeper lags
* Variables in ivstyle() enter as instruments at the same t only (once, no lags)

* Arellano-Bond: specify the noleveleq option (it switches off system GMM, which is the default)
xtabond2 lnhr l.lnhr dyear*, gmmstyle(l.lnhr) ivstyle(dyear*) robust noleveleq

* Blundell-Bond (the default):
* one-step
xtabond2 lnhr l.lnhr dyear*, gmmstyle(l.lnhr) ivstyle(dyear*) robust
* two-step
xtabond2 lnhr l.lnhr dyear*, gmmstyle(l.lnhr) ivstyle(dyear*) robust twostep

* restrict the number of lags used as instruments
xtabond2 lnhr l.lnhr dyear*, gmmstyle(l.lnhr, lag(1 3)) ivstyle(dyear*) robust twostep
* NB: same as:
xtabond2 lnhr l.lnhr dyear*, gmmstyle(lnhr, lag(2 4)) ivstyle(dyear*) robust twostep
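
****** A possible sanity check on the GMM estimate (illustrative sketch only)
* Since pooled OLS tends to be biased upward and FE-within downward in the pure
* dynamic model, a credible GMM estimate of the coefficient on L.lnhr would usually
* be expected to lie between the two. The lines below put the three estimates side by side.
* Assumptions: the data are already tsset, dyear* exist, and xtabond2 is installed;
* the stored-estimate names dynOLS, dynFE and dynGMM are hypothetical, chosen for this sketch.
quietly reg lnhr L.lnhr dyear*, vce(cluster id)
estimates store dynOLS
quietly xtreg lnhr L.lnhr dyear*, fe
estimates store dynFE
quietly xtabond2 lnhr L.lnhr dyear*, gmmstyle(L.lnhr) ivstyle(dyear*) robust twostep
estimates store dynGMM
* show only the coefficient on the lagged dependent variable, with standard errors
estimates table dynOLS dynFE dynGMM, keep(L.lnhr) b(%9.4f) se(%9.4f)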