Logit regression fits a logit (logistic) model to the specified
data. It is most often used for binomial data, where the outcome of
interest can take one of two states. The logit
function is defined as:
logit(p) = log(p) - log(1-p)
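As a small illustration of the definition above, the logit function and its inverse (the logistic function) can be sketched in Python. The function names are chosen for this example only and are not part of Draco.

```python
import math

def logit(p):
    """Log-odds of a probability p, valid for 0 < p < 1."""
    return math.log(p) - math.log(1.0 - p)

def inverse_logit(x):
    """Logistic function: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

print(logit(0.5))                 # 0.0: even odds correspond to a log-odds of zero
print(inverse_logit(logit(0.8)))  # recovers 0.8 up to rounding
```

The inverse is what turns a fitted linear combination of the independent variables back into a probability between 0 and 1.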
The regression is performed using the Maximum Likelihood Estimator
technique, as described in the Method section. The coefficients
resulting from the regression show the influence of each variable on
the binomial probabilities.
Usage
The Logit regression can be started by clicking Logit... in the Data
Window's Regression menu. The Logit regression dialog appears
below:
To perform a logit regression, the appropriate data must be
chosen. To select a variable as either dependent or independent, click
the corresponding check box. Only one variable may be chosen as
the dependent variable. Note that the dependent variable in a Logit
regression should be binomial, and its values must lie between 0 and 1.
The Draco Logit regression is performed using an iterative technique.
Before executing the regression, the convergence criteria can be
modified by selecting Iterations and Convergence... from the Options
menu. The resulting dialog sets the maximum number of iterations and
the convergence criteria; the defaults should be sufficient for most
well-formed problems.
To run the fit, choose Compute Regression... from the Perform Fit
menu of the Logit regression window. This menu item launches a
progress dialog that reports updates on the iterative fitting
procedure. When the dialog signals that computations are complete,
press the Close button to view the results of the Logit regression.
After the regression has been performed, the independent and dependent
check boxes in the Logit regression window can no longer be changed,
and the Options menu associated with this regression is
deactivated. The results window can be closed at any time;
to reopen the results, select the Compute Regression... menu
item again.
Output
The Logit regression generates a table of coefficient values together
with the errors associated with each. The results of the
fit, including the R2 value, are also output.
An example from a Logit regression is shown below:
Results of the Logit MLE Regression Model

Regression Variable:            Married_Or_Not
Sum Squared of the Residuals:   1.0482E03
Standard Error of the Fit:      0.39149
R-Squared Value:                0.80507
Adjusted R-Squared Value:       0.80504

Coefficient      Value        Std. Err.    t-Score
Constant         0.59012      0.05367      10.99615
Family_Income    .9299E-05    .3187E-06    14.63489
Further information can be generated after the regression. By
selecting Generate Column Data from the Perform Fit menu, the
estimated dependent variable data will be output as a new variable in
the Data window. The resulting covariance matrix can be
viewed by selecting View Covariance Matrix... from the Supporting
Data menu. The residuals from the regression can also be
output to the Data window as a new variable by selecting Output
Residuals to Column from the Supporting Data menu.
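The generated column data and residuals can be illustrated with a short Python sketch. It assumes that the estimated dependent variable is the inverse logit of the fitted linear combination and that a residual is the observed value minus the estimated value; neither detail is confirmed here for Draco. The coefficients are taken from the example output above, while the income and marital status values are hypothetical.

```python
import math

def predict_probability(coefficients, row):
    """Estimated probability for one observation; coefficients[0] is the constant."""
    z = coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

# Constant and Family_Income coefficients from the example output.
coefficients = [0.59012, 0.9299e-05]

# Hypothetical observations (not from the original data set).
incomes = [[20000.0], [60000.0], [120000.0]]
married = [0, 1, 1]

# Assumed meaning of "Generate Column Data": one estimate per observation.
fitted = [predict_probability(coefficients, row) for row in incomes]

# Assumed meaning of "Output Residuals to Column": observed minus estimated.
residuals = [y - p for y, p in zip(married, fitted)]

for row, p, r in zip(incomes, fitted, residuals):
    print(row[0], round(p, 4), round(r, 4))
```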
Method
The Logit Regression is performed using a Maximum Likelihood Estimator
technique. Initial estimates of the parameters are computed from
a simple Least Squares Fit of the data. All subsequent iterations
then perform updates to these parameters in an attempt to drive the
partial derivatives of the likelihood function with respect to the
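parameters to zero. The parameter estimates are therefore considered
converged when, for every parameter $\beta_j$:
$$\frac{\partial L}{\partial \beta_j} = 0$$
The likelihood function for the logit regression is defined as:
$$L(\beta) = \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i}, \qquad p_i = \frac{1}{1 + e^{-x_i \beta}}$$
where $y_i$ is the observed dependent value and $x_i$ is the row of
independent variables (including the constant) for observation $i$.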
The likelihood function in Draco is maximized using a Nelder-Mead optimization algorithm provided by the Apache Commons Math library. The derivatives of the likelihood function are numerically computed within the algorithm.
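The maximization step can be sketched in plain Python under stated assumptions. This is not Draco's code and does not call Apache Commons Math: it is a simplified Nelder-Mead variant (reflection, expansion, inside contraction, shrink) that minimizes the negative log of the standard logit likelihood using function values only, starting from zero rather than from a least squares estimate. All names, the tolerance, and the tiny data set are hypothetical. The data were chosen so that the maximum likelihood estimates are known in closed form: the group proportions are 0.5 and 0.75, giving a constant of 0 and a slope of log(3).

```python
import math

def negative_log_likelihood(beta, rows, outcomes):
    """Negative log of the standard logit likelihood; beta[0] is the constant."""
    total = 0.0
    for row, y in zip(rows, outcomes):
        z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        p = 1.0 / (1.0 + math.exp(-z))
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def nelder_mead(f, start, step=0.5, tol=1e-12, max_iter=2000):
    """Minimize f with a simplified Nelder-Mead simplex search."""
    n = len(start)
    simplex = [list(start)]
    for j in range(n):
        vertex = list(start)
        vertex[j] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        if abs(f(worst) - f(best)) < tol:  # convergence criterion on function values
            break
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(second_worst):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex halfway toward the best one
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)] for v in simplex[1:]
                ]
    return min(simplex, key=f)

def numerical_gradient(f, beta, step=1e-6):
    """Central-difference estimate of the partial derivatives of f at beta."""
    gradient = []
    for j in range(len(beta)):
        up, down = list(beta), list(beta)
        up[j] += step
        down[j] -= step
        gradient.append((f(up) - f(down)) / (2.0 * step))
    return gradient

# Hypothetical data: one binary independent variable, eight observations.
rows = [[0.0], [0.0], [0.0], [0.0], [1.0], [1.0], [1.0], [1.0]]
outcomes = [0, 0, 1, 1, 0, 1, 1, 1]

def objective(beta):
    return negative_log_likelihood(beta, rows, outcomes)

beta_hat = nelder_mead(objective, [0.0, 0.0])
print(beta_hat)                                 # close to [0, log(3)]
print(numerical_gradient(objective, beta_hat))  # both entries close to zero
```

The final gradient check mirrors the convergence condition described above: at the maximum, the partial derivatives with respect to each parameter are approximately zero.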
The covariance matrix as computed in the logit regression is
represented by the inverse of the Hessian matrix associated with the
logit function. The Hessian matrix is numerically computed within Draco.
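A minimal Python sketch of this covariance computation follows. It assumes the Hessian is taken of the negative log-likelihood at the converged estimates and is approximated by central differences; the step size, the function names, and the data are hypothetical and do not reflect Draco's internals. The data set has one binary independent variable with group proportions 0.5 and 0.75, so the maximum likelihood estimates are 0 and log(3) in closed form.

```python
import math

def negative_log_likelihood(beta, rows, outcomes):
    """Negative log of the standard logit likelihood; beta[0] is the constant."""
    total = 0.0
    for row, y in zip(rows, outcomes):
        z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        p = 1.0 / (1.0 + math.exp(-z))
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def numerical_hessian(f, beta, step=1e-4):
    """Central-difference estimate of the matrix of second partial derivatives."""
    n = len(beta)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                point = list(beta)
                point[i] += di * step
                point[j] += dj * step
                return f(point)
            hessian[i][j] = (shifted(1, 1) - shifted(1, -1)
                             - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * step * step)
    return hessian

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical data and its closed-form maximum likelihood estimates.
rows = [[0.0], [0.0], [0.0], [0.0], [1.0], [1.0], [1.0], [1.0]]
outcomes = [0, 0, 1, 1, 0, 1, 1, 1]
beta_hat = [0.0, math.log(3.0)]

hessian = numerical_hessian(
    lambda b: negative_log_likelihood(b, rows, outcomes), beta_hat)
covariance = invert(hessian)
standard_errors = [math.sqrt(covariance[k][k]) for k in range(len(beta_hat))]

print(covariance)       # approximately [[1, -1], [-1, 7/3]]
print(standard_errors)  # approximately [1.0, 1.5275]
```

The square roots of the diagonal entries are the usual source of the standard errors reported next to each coefficient, though whether Draco derives its reported errors exactly this way is an assumption.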