The Two-Stage Least Squares Fit can be used in place of Ordinary Least Squares
when the error term of the regression is correlated with one or more of the
explanatory variables. Two-Stage Least Squares uses instrumental variables to
generate proxy variables for the regression of interest. The regression is
performed in two stages.
Stage 1 determines the proxy variables. Each variable that needs a proxy is
regressed by ordinary least squares on the instrumental variables, which
serve as the explanatory variables in that regression.
Stage 2 performs an ordinary least squares regression using the
original regression specification, but replacing all non-instrumental
variables with their respective proxies.
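The two stages described above can be sketched in a few lines of Python with NumPy. This is only an illustration of the general technique, not the program's own code. The function names (ols, two_stage_least_squares) and the argument layout are assumptions made for the example, and a constant term is assumed to be included.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for the regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y, exog, endog, extra_instruments):
    """Hypothetical helper: 2SLS coefficients ordered as [constant, exog, endog].

    y                 : (n,) dependent variable
    exog              : (n, p) regressors that are also instruments
    endog             : (n, m) regressors that need proxies
    extra_instruments : (n, q) instruments not in the original specification
    """
    n = len(y)
    const = np.ones((n, 1))
    # Full instrument set used as explanatory variables in stage 1
    Z = np.hstack([const, exog, extra_instruments])
    # Stage 1: regress each endogenous column on the instruments and
    # keep the fitted values as the proxy ("_HAT") variables
    endog_hat = Z @ ols(Z, endog)
    # Stage 2: original specification with the proxies substituted in
    X_hat = np.hstack([const, exog, endog_hat])
    return ols(X_hat, y)
```

At least as many extra instruments as endogenous variables are needed (q >= m); otherwise the stage 2 regressors are collinear.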
Two-Stage Least Squares removes the coefficient bias that arises when the
Least Squares Fit assumption of an uncorrelated error term is violated,
provided the instruments are themselves uncorrelated with the error term.
This correction holds for large samples; for small sample sizes, a bias may
still be present in the coefficients.
Usage
To perform a two-stage least squares regression, the appropriate data
must first be chosen. To select a variable as either dependent or
independent, click the corresponding check box in the least
squares window. Only one variable may be chosen as the
dependent variable.
After selecting the desired options from the "Options"
menu, perform the regression by selecting "Compute
Regression..." from the "Perform Fit" menu. Once the regression has
been performed, a text window opens showing the results.
After performing the regression, the independent and dependent check
boxes in the least squares window will no longer be changeable.
Also, the "Options" menu associated with this regression will
be deactivated. The results window can be closed at any time;
to reopen the results, simply select the "Compute Regression..." menu
item again.
Options
Variances
The least squares fit allows two types of variances to be computed
for output. The Ordinary Least Squares variances are determined from the
product of the variance of the residuals and the solution matrix, (X'X)^-1.
White's Robust Variance Estimates compute the variances from a more
conservative equation that remains valid when the variance of the residuals
is not constant across observations.
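The difference between the two variance options can be illustrated as follows. This is a sketch under stated assumptions: the residual variance is taken as the sum of squared residuals divided by (n - k), and the robust estimate uses the basic White form without any finite-sample correction. The exact formulas used by the program are not given here, and the function names are hypothetical.

```python
import numpy as np

def ols_covariance(X, resid):
    """Ordinary covariance: residual variance times (X'X)^-1."""
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)      # assumed divisor: n - k
    return sigma2 * np.linalg.inv(X.T @ X)

def white_covariance(X, resid):
    """White's robust covariance: (X'X)^-1 X' diag(e^2) X (X'X)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)
    # X' diag(e^2) X, built without forming the n-by-n diagonal matrix
    meat = X.T @ (X * resid[:, None] ** 2)
    return XtX_inv @ meat @ XtX_inv
```

In both cases the variances of the coefficients are the diagonal entries of the returned matrix, and the standard errors are their square roots.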
Include Constant
Unchecking this box omits the constant term from the
regression procedure.
Ignore Matrix Condition
Selecting this option disables the
safety check for matrix conditioning prior to attempting to solve the
least-squares problem. More information can be found on the Condition page.
The solution matrix tends to be poorly conditioned when collinearity
exists among the series. The consequences of proceeding when conditioning is
poor can range from highly biased results to complete software failure.
This option is nevertheless often needed for Two-Stage Least Squares Fits,
because correlated explanatory variables frequently appear in the
Stage 1 regression(s).
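A conditioning check of the kind this option disables might look like the sketch below. The threshold value and the function name are hypothetical, and the test actually applied by the program is described on the Condition page, not here.

```python
import numpy as np

def is_well_conditioned(X, max_condition=1e12):
    """Return True if X'X has a condition number below max_condition.

    max_condition is an illustrative threshold, not the program's value.
    """
    return bool(np.linalg.cond(X.T @ X) < max_condition)
```

A design matrix containing one column that is an exact multiple of another produces a singular X'X, so a check like this one would fail and the regression would not be attempted.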
Output
The two-stage least squares fit generates a table of coefficient values
together with the standard error associated with each. Summary results of the
fit, including the R-squared value, are also output.
Independent variables that are not marked as instruments will be labeled with a "_HAT"
suffix, denoting that a proxy variable was used in stage 2 of the
regression process. An example from a least squares fit is shown
below:
Results of the Two-Stage Least Squares Regression
Regression Variable: CO
---------------------------------------------------------------
Sum Squared of the Residuals: 3.3633E04
Standard Error of the Fit : 34.6582
R-Squared Value : 0.99723
Adjusted R-Squared Value : 0.99703
---------------------------------------------------------------
Constant:            -145.29034   Std Err: 29.87929   t-Score: -4.86258
Coef 0 (CO_lagged):     0.0308    Std Err:  0.02789   t-Score:  1.10442
Coef 1 (YD_HAT):        0.91049   Std Err:  0.03041   t-Score: 29.93612
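Each t-score in a listing like the one above is the coefficient divided by its standard error; for the constant, -145.29034 / 29.87929 is approximately -4.8626. The short sketch below shows that calculation. It also shows a common definition of the standard error of the fit, which is an assumption because the formula the program uses is not stated here. Both function names are hypothetical.

```python
import math

def t_score(coef, std_err):
    """t-score of a coefficient: the estimate divided by its standard error."""
    return coef / std_err

def standard_error_of_fit(ssr, n_obs, n_params):
    """Assumed definition: sqrt(sum of squared residuals / (n_obs - n_params))."""
    return math.sqrt(ssr / (n_obs - n_params))
```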
Further information can be generated after the regression. Selecting
"Generate Column Data" from the "Perform Fit" menu outputs the
estimated dependent variable data as a new variable in
the Data window. The resulting covariance matrix can be
viewed by selecting "View Covariance Matrix..." from the "Supporting
Data" menu. The residuals from the regression can also be
output to the Data window as a new variable by selecting "Output
Residuals to Column" from the "Supporting Data" menu.
Method
The Two-Stage Least Squares Fit makes use of the Least Squares Fit procedure
for each stage. In stage 1, any endogenous,
non-instrument variables are regressed against all instruments.
The resulting regressions from stage 1 are used to generate
complete proxy variables, which are used in the stage 2 regression.
All residuals are recomputed with respect to the true variables
(not the proxy variables); these residuals are used in the calculation
of the covariance matrix and all derived values.
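The recomputation of the residuals with the true variables can be sketched as follows. This uses the textbook form of the 2SLS covariance matrix, which is an assumption about the program's internals. The function name and the (n - k) divisor are likewise illustrative.

```python
import numpy as np

def tsls_covariance(y, X_true, X_hat, beta):
    """Residuals and covariance matrix for stage 2 coefficients beta.

    X_true : regressors with the original (true) variables
    X_hat  : the same regressors with proxies substituted in
    """
    n, k = X_hat.shape
    # Residuals use the true variables, not the proxies
    resid = y - X_true @ beta
    sigma2 = resid @ resid / (n - k)      # assumed divisor: n - k
    # The solution matrix comes from the stage 2 (proxy) regressors
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return resid, cov
```

Using the stage 2 residuals (y - X_hat @ beta) instead would misstate the residual variance, because the proxies differ from the true variables.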