• Stepwise Procedures In Discriminant Analysis

    • CHAPTER ONE
      INTRODUCTION
      DISCRIMINANT ANALYSIS
Discriminant analysis (DA) is a multivariate technique used to classify cases into distinct groups. It separates distinct sets of objects (or observations) and allocates new objects (or observations) to previously defined groups. Discriminant analysis is concerned with the problem of classification, which arises when a researcher, having made a number of measurements on an individual, wishes to classify the individual into one of several categories on the basis of these multivariate measurements (Onyeagu, 2003).
Discriminant analysis will help us analyze the differences between groups and provide us with a means of assigning, or classifying, any case into the group it most closely resembles.

There are two aspects of discriminant analysis:
Predictive Discriminant Analysis (PDA), or classification, which is concerned with classifying objects into one of several groups; and
Descriptive Discriminant Analysis (DDA), which focuses on revealing major differences among the groups (Stevens, 1996).
According to Huberty (1994), descriptive discriminant analysis includes the collection of techniques involving two or more criterion variables and a set of one or more grouping variables, each with two or more levels. Whereas in predictive discriminant analysis (PDA) the multiple response variables play the role of predictor variables, in descriptive discriminant analysis (DDA) they are viewed as outcome variables and the grouping variable(s) as the explanatory variable(s). That is, the roles of the two types of variables involved in a multivariate multigroup setting in DDA are reversed from their roles in PDA.

      STEPWISE DISCRIMINANT ANALYSIS
When a large number of variables are available for group separation, a researcher may wish to discard variables that are redundant (in the presence of other variables). Here (in discriminant analysis) variables (say, the y's) are selected and the basic model does not change, unlike regression, where independent variables are selected and the model is consequently altered.
Stepwise selection is a combination of the forward and backward variable selection methods. In forward selection, the variable entered at each step is the one that maximizes the partial F-statistic based on Wilks' Λ. The maximal additional separation of the groups, above and beyond the separation already attained by the other variables, is thus obtained. Because of this selection, the proportion of these F's that exceed Fα is greater than α. In backward selection (elimination), the variable that contributes least, as shown by its partial F, is deleted at each step.
In stepwise selection, the variables are entered one at a time, and at each step the variables entered earlier are re-examined to see whether any has become redundant in the presence of the recently added variables. When the largest partial F among the variables available for entry fails to exceed a preset threshold value, the procedure stops.
Stepwise discriminant analysis is a form of discriminant analysis. During the selection process no discriminant functions are calculated; only after the subset selection is complete are discriminant functions calculated for the selected variables. These variables can also be used in the construction of classification functions.
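The forward pass of the selection procedure described above can be sketched numerically. The following is a minimal illustration, not a production implementation: the partial-F expression follows the standard multigroup form based on the ratio of successive Wilks' Λ values, while the function names, the entry threshold F_in, and the toy data are assumptions made for the example.

```python
import numpy as np

def wilks_lambda(X, y, cols):
    """Wilks' Lambda = det(W) / det(T) for the variables in `cols`."""
    Z = X[:, cols]
    D = Z - Z.mean(axis=0)
    T = D.T @ D                                   # total SSCP matrix
    W = np.zeros_like(T)
    for g in np.unique(y):
        Dg = Z[y == g] - Z[y == g].mean(axis=0)
        W += Dg.T @ Dg                            # pooled within-group SSCP
    return np.linalg.det(W) / np.linalg.det(T)

def forward_stepwise(X, y, F_in=4.0):
    """Enter, one at a time, the variable with the largest partial F;
    stop when no candidate's partial F exceeds the threshold F_in."""
    n, p = X.shape
    g = len(np.unique(y))
    selected, lam_old = [], 1.0
    while len(selected) < p:
        best = None
        for j in set(range(p)) - set(selected):
            lam_new = wilks_lambda(X, y, selected + [j])
            ratio = lam_new / lam_old             # partial Lambda for adding j
            q = len(selected)
            # Partial F for entering variable j given the q variables already in
            F = ((n - g - q) / (g - 1)) * (1 - ratio) / ratio
            if best is None or F > best[0]:
                best = (F, j, lam_new)
        if best[0] < F_in:
            break                                 # no candidate is worth entering
        selected.append(best[1])
        lam_old = best[2]
    return selected

# Toy demo: variable 0 separates the two groups; variable 1 is pure noise.
rng = np.random.default_rng(1)
g0 = np.column_stack([rng.normal(0, 1, 40), rng.normal(0, 1, 40)])
g1 = np.column_stack([rng.normal(5, 1, 40), rng.normal(0, 1, 40)])
X = np.vstack([g0, g1])
y = np.array([0] * 40 + [1] * 40)
print(forward_stepwise(X, y))  # the separating variable (index 0) enters first
```

A full stepwise pass would additionally re-test the entered variables for removal at each step, as described above; the backward check is omitted here for brevity.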

      STEPS INVOLVED IN DISCRIMINANT ANALYSIS
1. Construct the discriminant function Y = L1X1 + L2X2 + ... + LpXp.
2. Evaluate the discriminant function for population one (1) by substituting the mean values of X1, X2, ..., Xp into Y = L1X1 + L2X2 + ... + LpXp; label the value obtained Y1.
3. Repeat step 2 for population two (2) and label the value obtained Y2.
4. Since one value is usually greater than the other, assume Y2 > Y1.
5. Compute the critical value YC = (Y1 + Y2)/2.
6. State the discriminating procedure as: assign the new individual to population one (1) if Y < YC and to population two (2) if Y > YC.
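The steps above can be sketched numerically with NumPy. This is an illustrative sketch only: the pooled within-group covariance estimate for the coefficients L, the function names, and the simulated data are assumptions for the example, not prescribed by the text.

```python
import numpy as np

def fit_discriminant(X1, X2):
    """Build the linear discriminant Y = L1X1 + ... + LpXp for two
    populations and compute the cutoff YC = (Y1 + Y2) / 2."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    # Pooled within-group covariance matrix (a standard estimate of S)
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    L = np.linalg.solve(S, m1 - m2)   # coefficients L1, ..., Lp
    Y1, Y2 = L @ m1, L @ m2           # discriminant scores at the group means
    Yc = (Y1 + Y2) / 2                # critical (cutoff) value
    return L, Y1, Yc

def classify(x, L, Y1, Yc):
    """Assign the case to the population whose mean score lies on the
    same side of the cutoff Yc as the case's own score Y."""
    return 1 if ((L @ x) > Yc) == (Y1 > Yc) else 2

# Toy demo with two well-separated bivariate populations.
rng = np.random.default_rng(0)
X1 = rng.normal([0, 0], 1.0, size=(50, 2))
X2 = rng.normal([3, 3], 1.0, size=(50, 2))
L, Y1, Yc = fit_discriminant(X1, X2)
print(classify(np.array([0.2, -0.1]), L, Y1, Yc))  # a case near population 1
```

The sign convention is handled by comparing against Y1's side of the cutoff, so the sketch works whether Y1 or Y2 turns out larger.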
      GOALS FOR DISCRIMINANT ANALYSIS
      Johnson and Wichern (1992) defined two goals of discriminant analysis as:
      To describe either graphically (in at most three dimensions) or algebraically the differential features of objects (or observations) from several known collections (populations). We try to find discriminants such that the collections are separated as much as possible.
To sort objects (observations) into two or more labeled classes. The emphasis is on deriving a rule that can be used to optimally assign a new object to the labeled classes. Johnson and Wichern (1992) used the term discrimination to refer to Goal 1 and classification or allocation to refer to Goal 2.
      The goals of discriminant analysis include identifying the relative contribution of the p variables to separation of the groups and finding the optimal plane on which the points can be projected to illustrate the configuration of the groups.

      EXAMPLES OF DISCRIMINANT ANALYSIS PROBLEMS
A geologist might wish to classify fossils into their respective fossil groups on the basis of measurements of the sizes, shapes and ages of the fossils.
A doctor may intend to classify newborn babies into different blood groups, based on measurements obtained from the babies' blood samples.
Students applying for admission into a university sit a Common Entrance Examination (CEE); the vector of their scores in the entrance examination is a set of measurements, X. The problem is to classify a student on the basis of his scores in the entrance examination.
An automobile engineer might decide to classify an automobile engine into one of several categories of engine on the basis of measurements of its power output, size and shape.
A nutritionist might classify food substances into categories of food nutrient, such as carbohydrates, minerals, water, proteins, fats and oils, and vitamins, on the basis of measurements of the comparative amounts of the different nutrients in the food.
      As we have seen in the examples above, individuals are assigned to groups taking cognizance of data related to the groups.

      AIMS AND OBJECTIVES OF THE STUDY
      This study is necessary for the following purposes:
      For classification of cases into groups using the stepwise methodologies of discriminant analysis;
      To identify and discard or remove redundant variables or variables which are little related to group distinction;
To compare the probabilities of misclassification and the hit ratios obtained with discriminant analysis (using all independent variables) with those obtained with the stepwise procedures.
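The third objective turns on the hit ratio. As a small illustrative sketch (the function name and the toy labels here are assumptions), the hit ratio is simply the proportion of cases whose predicted group matches their actual group, i.e. one minus the apparent misclassification rate:

```python
def hit_ratio(actual, predicted):
    """Hit ratio: the fraction of correctly classified cases
    (1 minus the apparent misclassification rate)."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)

# 3 of the 5 cases below are classified into their actual group.
print(hit_ratio([1, 1, 2, 2, 2], [1, 2, 2, 2, 1]))  # 3/5 = 0.6
```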

