Can I create interaction terms in my logistic model, as with OLS regression?
Yes. As in OLS regression, interaction terms are constructed as cross-products of the two interacting variables.
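As a minimal sketch of this in Python with statsmodels (the data frame df and the variables y, x1, and x2 are hypothetical, not from the original), the cross-product can be computed by hand or built automatically by the formula interface:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: binary outcome y, two continuous predictors x1 and x2.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
logit_p = df["x1"] + df["x2"] + 0.5 * df["x1"] * df["x2"]
df["y"] = (rng.uniform(size=200) < 1 / (1 + np.exp(-logit_p))).astype(int)

# Manual cross-product, entered like any other covariate.
df["x1x2"] = df["x1"] * df["x2"]
m_manual = smf.logit("y ~ x1 + x2 + x1x2", data=df).fit(disp=0)

# Equivalent: the formula interface constructs the cross-product itself.
m_formula = smf.logit("y ~ x1 * x2", data=df).fit(disp=0)

print(m_manual.params)
print(m_formula.params)
```

Both fits estimate the same model; the formula route is the analogue of letting the software construct the interaction term automatically, as discussed below.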
How are interaction effects handled in logistic regression?
The same as in OLS regression: one adds interaction terms to the model as cross-products of the standardized independents and/or dummy independents. Some computer programs let the researcher specify the pairs of interacting variables and do the computation automatically. In SPSS, use the categorical covariates option: highlight two variables, then click the >a*b> button to move them into the Covariates box. The significance of an interaction effect is tested the same way as for any other variable, except in the case of a set of dummy variables representing a single ordinal variable.
When an ordinal variable has been entered as a set of dummy variables, the interaction of another variable with the ordinal variable will involve multiple interaction terms. In this case the significance of the interaction of the two variables is the significance of the change in R-square between the equation with the interaction terms and the equation without the set of interaction terms associated with the ordinal variable's dummies. (See the StatNotes section on "Regression" for computing the significance of the difference of two R-squares.)
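In a logistic context, this difference-of-R-squares logic is ordinarily carried out as a likelihood-ratio chi-square test on the change in -2 log likelihood between the nested models. A minimal sketch, again with hypothetical variables (binary y, continuous x, and an ordinal o entered as dummies via C()):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Hypothetical data: binary y, continuous x, ordinal o (3 levels) as dummies.
rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=300),
                   "o": rng.integers(1, 4, size=300),
                   "y": rng.integers(0, 2, size=300)})

# Restricted model: main effects only.
restricted = smf.logit("y ~ x + C(o)", data=df).fit(disp=0)
# Full model: adds the set of x-by-dummy interaction terms.
full = smf.logit("y ~ x * C(o)", data=df).fit(disp=0)

# Likelihood-ratio test on the change in -2 log likelihood; the degrees
# of freedom equal the number of interaction terms added (one per dummy).
lr = 2 * (full.llf - restricted.llf)
dof = full.df_model - restricted.df_model
p = stats.chi2.sf(lr, dof)
print(f"LR chi-square = {lr:.2f}, df = {dof:.0f}, p = {p:.4f}")
```

The whole set of interaction terms is tested jointly, which is why a single interaction coefficient's significance is not the test of interest here.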
ASSUMPTIONS:
Centered variables. As in OLS regression, centering may be necessary either to reduce multicollinearity or to make interpretation of coefficients meaningful. Centering is almost always recommended for independent variables which are components of interaction terms in a logistic model. See the full discussion in the StatNotes section on OLS regression.
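A minimal sketch of mean-centering, with hypothetical predictors x1 and x2; each component is centered before the cross-product is formed:

```python
import numpy as np
import pandas as pd

# Hypothetical predictors x1 and x2 with nonzero means.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(loc=5, size=200),
                   "x2": rng.normal(loc=3, size=200)})

# Center each component before forming the cross-product, so that each
# main-effect coefficient is interpreted at the mean of the other
# variable and the product term is less correlated with its components.
df["x1_c"] = df["x1"] - df["x1"].mean()
df["x2_c"] = df["x2"] - df["x2"].mean()
df["x1x2_c"] = df["x1_c"] * df["x2_c"]
```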
COLLINEARITY
Is multicollinearity a problem for logistic regression the way it is for multiple linear regression?
Absolutely. The discussion in StatNotes under the "Regression" topic is relevant to logistic regression.
What is the logistic equivalent to the VIF test for multicollinearity in OLS regression? Can odds ratios be used?
Multicollinearity is a problem when high in either logistic or OLS regression because in either case the standard errors of the b coefficients will be high and interpretations of the relative importance of the independent variables will be unreliable. In an OLS regression context, recall that VIF is the reciprocal of tolerance, which is 1 - R-squared. When there is high multicollinearity, R-squared will be high, so tolerance will be low and VIF will be high. When VIF is high, the b and beta weights are unreliable and subject to misinterpretation. For typical social science research, multicollinearity is considered not a problem if VIF <= 4, a level which corresponds to doubling the standard error of the b coefficient.

As there is no direct counterpart to R-squared in logistic regression, VIF cannot be computed in the same way -- though obviously one could apply the same logic to various pseudo-R-squared measures. Unfortunately, I am not aware of a VIF-type test specific to logistic regression, and I would think the same obstacles would exist as for creating a true equivalent to OLS R-squared. A high odds ratio would not be evidence of multicollinearity in itself.

To the extent that one independent is linearly or nonlinearly related to another independent, multicollinearity could be a problem in logistic regression since, unlike OLS regression, logistic regression does not assume linearity of relationship among the independents. Some authors use the VIF test from OLS regression to screen for multicollinearity in logistic regression if nonlinearity is ruled out. In an OLS regression context, nonlinearity exists when eta-square is significantly higher than R-square. In a logistic regression context, the Box-Tidwell transformation and orthogonal polynomial contrasts are ways of testing linearity among the independents.
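Because VIF is computed from regressions among the independents only, the screening described above can be run before fitting the logistic model itself. A minimal sketch using statsmodels' variance_inflation_factor, with hypothetical predictors (x3 is deliberately constructed to be nearly collinear with x1):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors; x3 is built to be nearly collinear with x1.
rng = np.random.default_rng(2)
X = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
X["x3"] = X["x1"] + rng.normal(scale=0.1, size=200)

# VIF is computed from regressions among the independents only, so this
# screen does not depend on the (logistic) outcome model.
Xc = sm.add_constant(X)  # include an intercept, as the regressions would
for i, name in enumerate(Xc.columns):
    if name != "const":
        print(f"VIF({name}) = {variance_inflation_factor(Xc.values, i):.2f}")
```

Here the VIF values for x1 and x3 will far exceed the rule-of-thumb ceiling of 4 cited above, flagging the near-collinearity.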