# Two-stage least squares (2SLS)

The two-stage least squares method (2SLS; also written MC2E, from the Spanish acronym, or LS2E) addresses the endogeneity of one or more explanatory variables in a multiple regression model.

Its main objective is to obtain consistent estimates of the coefficients of the initial model when one or more explanatory variables are endogenous, that is, correlated with the error term. In that situation ordinary least squares (OLS) applied directly to the model is biased and inconsistent. The tools involved are instrumental variables (IV), structural models and reduced-form equations.

In other words, 2SLS gives us reliable estimates when one or more endogenous explanatory variables are correlated with the error term and there are exogenous variables excluded from the model that can serve as instruments. The name refers to the procedure followed to treat this endogeneity problem, which has two stages:

- In the first stage, a "filter" is applied: the endogenous variable is regressed on all the exogenous variables, and its fitted values, which are not correlated with the error term, are obtained.
- In the second stage, the fitted values replace the endogenous variable in the original model, which can then be estimated by OLS.

## The structural model

A structural model is an equation intended to measure the causal relationship between variables; the focus is on its coefficients (βj). Model 1 is a multiple linear regression with two explanatory variables, Y2 and Z1:

Model 1 ⇒ Y1 = β0 + β1 · Y2 + β2 · Z1 + u1

Explanatory variables can be divided into two types: endogenous and exogenous. In Model 1, the endogenous explanatory variable is Y2 and the exogenous explanatory variable is Z1. An endogenous variable is determined within the model (it is a result of the model) and is correlated with u1. An exogenous variable is taken as given (the model needs it in order to produce a result) and is not correlated with u1.
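A small simulation can make the distinction concrete. The sketch below assumes a hypothetical data-generating process: the coefficient values, the sample size, the extra exogenous term `w` and the way Y2 is made to depend on u1 are all illustrative choices, not part of Model 1 itself. It only aims to show that OLS on Model 1 drifts away from the assumed β1 when Y2 is correlated with u1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (all numbers are illustrative).
z1 = rng.normal(size=n)   # exogenous regressor of Model 1
w = rng.normal(size=n)    # other exogenous variation in Y2 (assumed)
u1 = rng.normal(size=n)   # structural error of Model 1

# Y2 contains a multiple of u1, so it is endogenous by construction.
y2 = 1.0 + 0.5 * z1 + w + 0.8 * u1

# Model 1 with assumed true coefficients (beta0, beta1, beta2).
beta_true = np.array([2.0, 1.5, -1.0])
y1 = beta_true[0] + beta_true[1] * y2 + beta_true[2] * z1 + u1

# OLS on Model 1: regress Y1 on a constant, Y2 and Z1.
X = np.column_stack([np.ones(n), y2, z1])
beta_ols, *_ = np.linalg.lstsq(X, y1, rcond=None)

corr_y2_u1 = np.corrcoef(y2, u1)[0, 1]
corr_z1_u1 = np.corrcoef(z1, u1)[0, 1]

print("corr(Y2, u1):", round(corr_y2_u1, 3))   # clearly different from zero
print("corr(Z1, u1):", round(corr_z1_u1, 3))   # close to zero
print("OLS estimate of beta1:", round(beta_ols[1], 3), "(assumed true value: 1.5)")
```

In real data u1 is not observed, so these correlations cannot be computed directly; the simulation can do it only because it generates u1 itself.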

## 2SLS procedure

The following steps explain in detail how to carry out a two-stage least squares estimation.

### First stage

1. We assume that two exogenous explanatory variables, Z2 and Z3, are excluded from Model 1. Since Model 1 already contains one exogenous explanatory variable, Z1, we now have three exogenous variables in total: Z1, Z2 and Z3.

The exclusion restrictions are:

- Z2 and Z3 do not appear in Model 1, so they are excluded.

- Z2 and Z3 are not correlated with the error u1.

2. We obtain the reduced-form equation for Y2, which expresses Y2 as a function of all the exogenous variables. Relative to Model 1, we replace:

- The dependent variable Y1 with Y2.

- The coefficients βj with πj.

- The error u1 with v2.

The reduced form for Y2 from Model 1 is:

Y2 = π0 + π1 Z1 + π2 Z2 + π3 Z3 + v2

If Z2 and Z3 are both correlated with Y2, each of them could be used on its own as an instrumental variable (IV) for Y2. We would then end up with two different IV estimators, and neither of them would be efficient. We say that an estimator is more efficient, or more precise, the smaller its variance; the most efficient estimator is the one with the smallest possible variance.

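3. We take the linear combination of the exogenous variables in the reduced form as the best instrumental variable (IV) for Y2. We call it Y2* and obtain it by removing the error (v2) from the equation:

Y2* = π0 + π1 Z1 + π2 Z2 + π3 Z3, with π2 ≠ 0 or π3 ≠ 0

At least one of π2 and π3 must be different from zero; otherwise Y2* would be an exact linear function of Z1, which already appears in Model 1.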
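The efficiency argument of this first stage can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed values: the coefficients, the sample size, the number of replications and the helper `iv_fit` are hypothetical choices made for this example. It compares the sampling variance of the β1 estimate when Z2 alone, Z3 alone, or the combination of Z2 and Z3 is used as the instrument for Y2.

```python
import numpy as np

def iv_fit(y, X, Z):
    """IV estimate: project the columns of X on the instruments Z, then solve for beta."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # fitted values of X given Z
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

rng = np.random.default_rng(1)
n, reps = 500, 300                      # illustrative sample size and replications
est = {"Z2 only": [], "Z3 only": [], "Z2 and Z3": []}

for _ in range(reps):
    z1, z2, z3, u1, e = rng.normal(size=(5, n))
    v2 = 0.8 * u1 + e                    # shares a component with u1 (assumed)
    y2 = 1.0 + 0.5 * z1 + z2 + z3 + v2   # assumed reduced form, pi2 = pi3 = 1
    y1 = 2.0 + 1.5 * y2 - 1.0 * z1 + u1  # assumed Model 1, beta1 = 1.5

    const = np.ones(n)
    X = np.column_stack([const, y2, z1])
    instruments = {
        "Z2 only": np.column_stack([const, z1, z2]),
        "Z3 only": np.column_stack([const, z1, z3]),
        "Z2 and Z3": np.column_stack([const, z1, z2, z3]),
    }
    for name, Z in instruments.items():
        est[name].append(iv_fit(y1, X, Z)[1])   # keep the estimate of beta1

variances = {name: np.var(values) for name, values in est.items()}
for name, var in variances.items():
    print(f"{name:10s} variance of the beta1 estimate: {var:.5f}")
```

Under these assumptions the estimator that uses both excluded variables should show the smallest variance of the three, which is the sense in which the linear combination is the "best" instrument.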

4. We estimate the reduced form for Y2 by OLS and obtain the fitted values Ŷ2 (the caret "^" denotes a fitted value). Ŷ2 is the estimated version of Y2*, which is not correlated with u1.

### Second stage

5. With this estimate in hand, we use Ŷ2 as the IV for Y2: Ŷ2 replaces Y2 in Model 1, which is then estimated by OLS.
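The two stages can be sketched by hand with NumPy. This is a minimal illustration on simulated data, not a production routine: the data-generating process and every numeric value are assumptions made for the example, chosen so that Y2 is endogenous and Z2 and Z3 are valid excluded exogenous variables.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Hypothetical data (illustrative values only).
z1, z2, z3, u1, e = rng.normal(size=(5, n))
v2 = 0.8 * u1 + e                      # reduced-form error, correlated with u1
y2 = 1.0 + 0.5 * z1 + z2 + z3 + v2     # reduced form for Y2
y1 = 2.0 + 1.5 * y2 - 1.0 * z1 + u1    # Model 1 with assumed beta = (2, 1.5, -1)

const = np.ones(n)

# First stage: OLS of Y2 on all exogenous variables, then fitted values.
Z = np.column_stack([const, z1, z2, z3])
pi_hat, *_ = np.linalg.lstsq(Z, y2, rcond=None)
y2_hat = Z @ pi_hat

# Second stage: OLS of Y1 on the fitted values and Z1.
X_hat = np.column_stack([const, y2_hat, z1])
beta_2sls, *_ = np.linalg.lstsq(X_hat, y1, rcond=None)

# Plain OLS on Model 1, for comparison only.
X = np.column_stack([const, y2, z1])
beta_ols, *_ = np.linalg.lstsq(X, y1, rcond=None)

print("first-stage coefficients (pi0..pi3):", np.round(pi_hat, 3))
print("2SLS estimate of (beta0, beta1, beta2):", np.round(beta_2sls, 3))
print("OLS  estimate of (beta0, beta1, beta2):", np.round(beta_ols, 3))
```

With this setup the 2SLS estimate of β1 should land near the assumed value of 1.5, while the OLS estimate should sit visibly above it.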

## Process summary

Two-stage least squares (2SLS):

- First stage: Regress Y2 on Z1, Z2 and Z3 by OLS (the reduced form of point 2) and obtain the fitted values Ŷ2 (point 4). Ŷ2 is the estimated version of Y2* and is therefore not correlated with the error u1. The idea is to filter out of Y2 the part that is correlated with u1.

- Second stage: Regress Y1 on Ŷ2 and Z1 by OLS (point 5). Because the fitted value Ŷ2 is used instead of the original Y2, the 2SLS estimates will generally not match the OLS estimates of Model 1; this is expected.
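The summary can also be wrapped into a single function. The sketch below is one possible arrangement, assuming homoskedastic errors; the function name `two_stage_least_squares`, its argument names and the simulated data are hypothetical. It also makes one practical point visible: when the second stage is run by hand, the error variance should be computed from residuals that use the original Y2, not the fitted Ŷ2, otherwise the standard errors are not valid.

```python
import numpy as np

def two_stage_least_squares(y, endog, exog_included, exog_excluded):
    """2SLS sketch: returns coefficients, standard errors and two variance estimates."""
    n = len(y)
    const = np.ones(n)
    Z = np.column_stack([const, exog_included, exog_excluded])  # all exogenous variables
    X = np.column_stack([const, endog, exog_included])          # regressors of Model 1

    # First stage: fitted values of every column of X given Z
    # (the constant and the included exogenous columns are reproduced as they are).
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]

    # Second stage: OLS of y on the fitted regressors.
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]

    k = X.shape[1]
    resid = y - X @ beta                 # residuals with the ORIGINAL regressors
    sigma2 = resid @ resid / (n - k)
    naive_resid = y - X_hat @ beta       # residuals of the hand-made second stage
    sigma2_naive = naive_resid @ naive_resid / (n - k)

    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return beta, np.sqrt(np.diag(cov)), sigma2, sigma2_naive

# Usage on hypothetical simulated data (illustrative values only).
rng = np.random.default_rng(3)
n = 10_000
z1, z2, z3, u1, e = rng.normal(size=(5, n))
y2 = 1.0 + 0.5 * z1 + z2 + z3 + 0.8 * u1 + e
y1 = 2.0 + 1.5 * y2 - 1.0 * z1 + u1      # Var(u1) = 1 by construction

beta, se, sigma2, sigma2_naive = two_stage_least_squares(
    y1, y2, z1, np.column_stack([z2, z3])
)
print("2SLS coefficients:", np.round(beta, 3))
print("standard errors:  ", np.round(se, 4))
print("error variance from original regressors:", round(sigma2, 3))
print("error variance from second-stage residuals:", round(sigma2_naive, 3))
```

In this simulation the variance computed from the original regressors should be close to the assumed Var(u1) = 1, while the one taken from the second-stage residuals should be much larger, because those residuals also absorb the first-stage error.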
