1 Answers
Given a probit model y = 1[y* > 0], where y* = x1 β + zδ + u and u ~ N(0, 1), z can be represented without loss of generality as z = x1 θ1 + x2 θ2 + v. When u is correlated with v, z is endogenous. This can be caused by omitted variables or measurement error. Endogeneity also arises in the many cases where z is partially determined by y. For instance, in a model evaluating the effect of patient characteristics on the choice of whether to go to the hospital, y is that choice and z is the amount of medicine the respondent took. Intuitively, the more often the respondent goes to the hospital, the more likely she is to have taken more medicine, so z depends on y. When there are endogenous explanatory variables, the estimator produced by the usual estimation procedure is inconsistent, and so is the corresponding estimated Average Partial Effect.
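The source of the problem can be illustrated with a small simulation. The following is a minimal sketch, not an estimation routine: the coefficient values, the sample size, the value Cov(u, v) = 0.5, and the helper names `simulate` and `sample_cov` are all assumptions chosen for illustration. It draws data from the two equations above and compares the sample covariance of u with z and with the exogenous regressor x1.

```python
import random


def simulate(n=20000, cov_uv=0.5, seed=0):
    """Draw (x1, x2, z, u, y) from the model; every parameter value is illustrative."""
    rng = random.Random(seed)
    beta, delta = 1.0, 0.5        # structural coefficients (assumed values)
    theta1, theta2 = 0.5, 1.0     # reduced-form coefficients (assumed values)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = rng.gauss(0, 1)
        v = rng.gauss(0, 1)       # Var(v) = 1 in this sketch
        # Build u with Var(u) = 1 and Cov(u, v) = cov_uv.
        u = cov_uv * v + (1 - cov_uv ** 2) ** 0.5 * rng.gauss(0, 1)
        z = x1 * theta1 + x2 * theta2 + v
        y_star = x1 * beta + z * delta + u
        y = 1 if y_star > 0 else 0
        rows.append((x1, x2, z, u, y))
    return rows


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


rows = simulate()
x1, x2, z, u, y = (list(col) for col in zip(*rows))
cov_zu = sample_cov(z, u)     # close to Cov(u, v): z is endogenous
cov_x1u = sample_cov(x1, u)   # close to zero: x1 is exogenous
print(f"Cov(z, u)  = {cov_zu:.3f}")
print(f"Cov(x1, u) = {cov_x1u:.3f}")
```

Because z inherits v, its covariance with u equals Cov(u, v), while x1 stays uncorrelated with u. Setting `cov_uv=0.0` removes the endogeneity in this sketch.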
To address this issue, there are usually two estimation procedures that yield consistent estimators. Under joint normality of (u, v) with v ~ N(0, σ²), u = ρv + ε must hold, where ρ = cov(u, v)/σ² and ε ~ N(0, 1 − ρ²σ²) is independent of v. The equation for y* can then be rewritten as y* = x1 β + zδ + ρv + ε.
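The decomposition of u can be verified in a few lines. This is a sketch under stated assumptions: Var(u) = 1 (the usual probit normalization), Var(v) = σ², and ε independent of v and of the exogenous variables; Φ denotes the standard normal distribution function.

```latex
\begin{align}
\operatorname{Cov}(u, v) &= \operatorname{Cov}(\rho v + \varepsilon,\, v) = \rho\,\sigma^{2}
  \quad\Longrightarrow\quad \rho = \frac{\operatorname{Cov}(u, v)}{\sigma^{2}}, \\
\operatorname{Var}(\varepsilon) &= \operatorname{Var}(u) - \rho^{2}\sigma^{2} = 1 - \rho^{2}\sigma^{2}, \\
P(y = 1 \mid x_1, z, v) &= \Phi\!\left( \frac{x_1 \beta + z\delta + \rho v}{\sqrt{1 - \rho^{2}\sigma^{2}}} \right).
\end{align}
```

Once v is included as a regressor, the remaining error ε is independent of z, so the model is again a probit. Its coefficients are β, δ and ρ scaled by the common factor 1/√(1 − ρ²σ²).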
This model can be consistently estimated by a two-step procedure analogous to two-stage least squares (2SLS):
1] Regress z on (x1, x2) and obtain the consistent estimator $\hat{\theta}$ and the residual $\hat{v}$;