What is the difference between Logistic Regression and Discriminant Analysis?
What is Logistic Regression?
“Logistic regression allows one to predict a discrete outcome such as group membership from a set of variables that may be continuous, discrete, dichotomous, or a mix.” (Tabachnick and Fidell, 1996, p575)
What is Discriminant Analysis?
“The goal of the discriminant function analysis is to predict group membership from a set of predictors” (Tabachnick and Fidell, 1996, p507)
When/How to use Logistic Regression and Discriminant Analysis?
From the above definitions, it appears that both methods can answer the same research questions. Logistic regression may be better suited to cases where the dependent variable is dichotomous, such as Yes/No, Pass/Fail, Healthy/Ill, or Life/Death, while the independent variables can be nominal, ordinal, interval, or ratio. Discriminant analysis might be better suited when the dependent variable has more than two groups or categories. However, the real basis for choosing between them is the set of assumptions each method makes about the distribution of the independent variables and the relationships among them, and about the distribution of the dependent variable.
So, what is the difference?
For both methods, the categories of the outcome (i.e., the dependent variable) must be mutually exclusive. One way to decide between logistic regression and discriminant analysis when the dependent variable has more than two groups is to examine the assumptions of each method. Logistic regression is much more relaxed and flexible in its assumptions than discriminant analysis. Unlike discriminant analysis, logistic regression does not require the independent variables to be normally distributed, linearly related, or of equal variance within each group (Tabachnick and Fidell, 1996, p575). Being free of these assumptions makes logistic regression usable in many situations. However, “when [the] assumptions regarding the distribution of predictors are met, discriminant function analysis may be more powerful and efficient analytic strategy” (Tabachnick and Fidell, 1996, p579).
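To make the comparison concrete, here is a minimal sketch in plain Python. It is not taken from the cited books. It assumes the simplest possible case: one predictor, two groups, and simulated data that actually satisfies the discriminant analysis assumptions (normal predictor, equal variance in both groups). All function names, the group means of 0 and 2, and the sample size are illustrative choices. Under these assumptions both methods describe the log-odds of group membership as a straight line in the predictor, so their decision boundaries should land close together; they differ in how the line is estimated.

```python
import math
import random


def fit_lda_1d(x0, x1):
    """Two-group discriminant analysis on one predictor, assuming equal variances.

    Returns (intercept, slope) of the log-odds of belonging to group 1.
    """
    n0, n1 = len(x0), len(x1)
    m0, m1 = sum(x0) / n0, sum(x1) / n1
    # Pooled within-group variance: this is where the equal-variance assumption enters.
    pooled = (sum((v - m0) ** 2 for v in x0)
              + sum((v - m1) ** 2 for v in x1)) / (n0 + n1 - 2)
    slope = (m1 - m0) / pooled
    intercept = math.log(n1 / n0) - (m1 ** 2 - m0 ** 2) / (2 * pooled)
    return intercept, slope


def fit_logistic_1d(x, y, iterations=25):
    """Logistic regression on one predictor, fitted by Newton-Raphson.

    Makes no assumption about how x is distributed within each group.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to the intercept
            g1 += (yi - p) * xi     # gradient with respect to the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g by hand.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def simulate(n_per_group=500, seed=0):
    """Hypothetical data: group 0 ~ N(0, 1), group 1 ~ N(2, 1)."""
    rng = random.Random(seed)
    x0 = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    x1 = [rng.gauss(2.0, 1.0) for _ in range(n_per_group)]
    return x0, x1


x0, x1 = simulate()
lda_a, lda_b = fit_lda_1d(x0, x1)
log_a, log_b = fit_logistic_1d(x0 + x1, [0] * len(x0) + [1] * len(x1))

# The boundary is the predictor value at which the log-odds equal zero.
print("Discriminant analysis boundary:", -lda_a / lda_b)
print("Logistic regression boundary:  ", -log_a / log_b)
```

If the simulated groups are changed so that they no longer share a variance or are no longer normal, the discriminant analysis line rests on assumptions the data does not meet, while the logistic fit does not depend on them.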
Even though logistic regression makes fewer assumptions and is therefore usable in more situations, it does require a larger sample size: at least 50 cases per independent variable might be required for accurate hypothesis testing, especially when the dependent variable has many groups (Grimm and Yarnold, p. 221). However, given the same sample size, if the independent variables are multivariate normal within each group of the dependent variable, and each group has the same variances and covariances for the predictors, discriminant analysis might provide more accurate classification and hypothesis testing (Grimm and Yarnold, p. 241). The rule of thumb, though, is to use logistic regression when the dependent variable is dichotomous and there are enough cases.
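The rules of thumb above can be written down as a small helper. This is only a rough sketch of the heuristics in this post, not a substitute for checking the data: the function names and the "undetermined" fallback are made up for illustration, and the 50-cases-per-predictor figure is the guideline quoted above rather than a hard threshold.

```python
CASES_PER_PREDICTOR = 50  # guideline quoted in the post, not a strict cutoff


def min_cases_for_logistic(n_predictors, cases_per_predictor=CASES_PER_PREDICTOR):
    """Smallest sample size suggested by the cases-per-predictor guideline."""
    return n_predictors * cases_per_predictor


def suggest_method(n_groups, n_cases, n_predictors, lda_assumptions_met):
    """Apply the post's rules of thumb in order; returns a method name as a string.

    lda_assumptions_met: True if the predictors look multivariate normal within
    each group and the groups share the same variances and covariances.
    """
    if n_groups < 2:
        raise ValueError("the dependent variable needs at least two groups")
    enough_cases = n_cases >= min_cases_for_logistic(n_predictors)
    if n_groups == 2 and enough_cases:
        return "logistic regression"
    if lda_assumptions_met:
        return "discriminant analysis"
    if enough_cases:
        return "logistic regression"
    # Neither guideline is satisfied; the post gives no rule for this case.
    return "undetermined"


print(min_cases_for_logistic(4))               # 4 predictors * 50 cases each
print(suggest_method(2, 250, 4, False))
print(suggest_method(3, 100, 4, True))
```

For example, with 4 independent variables the guideline asks for at least 200 cases before relying on logistic regression for hypothesis testing.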