Data Analysis using Weka

Description:

Preprocessed the dataset in Weka using visualization, normalization, and discretization.
Used wrapper and filter methods for attribute selection.
Compared Naive Bayes, BayesNet, and a support vector machine (SMO) on the test dataset.
Used AdaBoost, bagging, and stacking on top of a decision tree to improve the model.
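
As a rough illustration of the bagging idea mentioned above, here is a minimal Python sketch. It is not the Weka implementation: the one-feature threshold rule stands in for a real decision tree, and the data, function names, and model count are made up for the example.

```python
import random
from collections import Counter

def train_stump(rows):
    """Fit a one-feature threshold rule: predict 1 when x > threshold.

    This is a stand-in for a decision tree, chosen only to keep the sketch short.
    """
    best = None
    for threshold in sorted({x for x, _ in rows}):
        errors = sum((x > threshold) != bool(y) for x, y in rows)
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return best[1]

def bagging(rows, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]
        thresholds.append(train_stump(sample))
    return thresholds

def predict(thresholds, x):
    """Majority vote over all stumps (n_models is odd, so no ties)."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Made-up (feature, label) pairs, not the project's dataset.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
model = bagging(data)
print(predict(model, 0.5), predict(model, 6.5))
```

Each model sees a slightly different resample, and the vote averages out the variance of the individual learners. AdaBoost and stacking combine models differently and are not covered by this sketch.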

Key Results:

  • Data Preprocessing Using Weka

Visualization

[Figure: abip11-Visualization]

Normalization

[Figure: abip12-Normalization]

Discretization

[Figure: abip3-Discretization]
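
For readers without the screenshots, the two transformations can be sketched in plain Python. This assumes min-max scaling to [0, 1] and unsupervised equal-width binning; the source does not say which filter settings were used, and the sample values and function names below are made up.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant attribute: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bins(values, bins=10):
    """Assign each value the index of its equal-width interval (0 .. bins - 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # The maximum value falls exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]

# Made-up attribute values, not taken from the project's dataset.
ages = [18, 25, 40, 62, 90]
normalized = min_max_normalize(ages)
binned = equal_width_bins(ages, bins=3)
print(normalized)
print(binned)
```

Normalization keeps the attribute numeric, while discretization turns it into a nominal attribute with one value per bin.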

  • Advanced Modeling in Weka

Naive Bayes Algorithm:

   a    b   <-- classified as
1599  228 |    a = 0
 162  122 |    b = 1

Correctly Classified Instances        1721               81.5253 %
Incorrectly Classified Instances       390               18.4747 %
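
The accuracy figure follows directly from the confusion matrix: the diagonal holds the correctly classified instances. A short Python check, using the Naive Bayes numbers reported above (the function name is only illustrative):

```python
def accuracy(matrix):
    """Share of instances on the diagonal of a confusion matrix (rows = actual class)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Rows: actual a = 0, actual b = 1. Columns: classified as a, classified as b.
naive_bayes = [[1599, 228],
               [162, 122]]

acc = accuracy(naive_bayes)
print(f"{acc:.4%}")  # (1599 + 122) / 2111
```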

BayesNet Algorithm:

   a    b   <-- classified as
1622  205 |    a = 0
 156  128 |    b = 1

Correctly Classified Instances        1750               82.8991 %
Incorrectly Classified Instances       361               17.1009 %

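SMO Algorithm:

   a    b   <-- classified as
1827    0 |    a = 0
 284    0 |    b = 1

Correctly Classified Instances        1827               86.5467 %
Incorrectly Classified Instances       284               13.4533 %

Note: SMO assigns every instance to class a = 0, so its accuracy only reflects the class imbalance (1827 of 2111 instances are a = 0). It detects none of the 284 instances of b = 1.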

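Among the three, the BayesNet algorithm has the lowest misclassification cost: it makes fewer errors than Naive Bayes (361 vs. 390), whereas SMO's smaller error count (284) comes only from never predicting class b = 1.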
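
How the ranking depends on the cost of each error type can be sketched in Python. The cost weights below are an assumption for illustration (the source does not state a cost matrix); the confusion matrices are the ones reported above, with b = 1 treated as the positive class.

```python
# Rows: actual a = 0, actual b = 1. Columns: classified as a, classified as b.
matrices = {
    "NaiveBayes": [[1599, 228], [162, 122]],
    "BayesNet": [[1622, 205], [156, 128]],
    "SMO": [[1827, 0], [284, 0]],
}

def weighted_cost(matrix, fp_cost=1.0, fn_cost=2.0):
    """Total cost with hypothetical weights; by default a missed b = 1 costs twice a false alarm."""
    false_positives = matrix[0][1]  # actual a, classified as b
    false_negatives = matrix[1][0]  # actual b, classified as a
    return fp_cost * false_positives + fn_cost * false_negatives

def recall_positive(matrix):
    """Fraction of actual b = 1 instances that were classified as b."""
    false_negatives, true_positives = matrix[1]
    return true_positives / (true_positives + false_negatives)

costs = {name: weighted_cost(m) for name, m in matrices.items()}
best = min(costs, key=costs.get)

for name, m in matrices.items():
    print(f"{name}: cost={costs[name]:.0f}, recall(b=1)={recall_positive(m):.3f}")
print("lowest cost:", best)
```

With these assumed weights the costs are 552 (Naive Bayes), 517 (BayesNet), and 568 (SMO). With equal weights SMO would score lowest purely by error count, so BayesNet comes out ahead only once a missed b = 1 is weighted at roughly 1.6 times a false alarm or more.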