Data Analysis using Weka

Description:

Preprocessed the dataset in Weka using visualization, normalization, and discretization.
Used wrapper and filter methods for attribute selection.
Used Naive Bayes, a Bayesian network (BayesNet), and a support vector machine (SMO) to compare results across the three methods on the test dataset.
Used AdaBoost, bagging, and stacking on top of a decision tree to improve the model.

Key Results:

Data Preprocessing Using Weka

Visualization Normalization

Discretization
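
The two preprocessing steps named above can be illustrated outside Weka with a minimal Python sketch. It assumes min-max scaling to [0, 1] for normalization and equal-width binning for discretization; the exact filter settings used in this project are not stated, and the function names and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins=10):
    """Return, for each value, the index (0..n_bins-1) of its equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric attribute, for illustration only.
ages = [18, 25, 32, 47, 51, 60]
normalized = min_max_normalize(ages)
binned = equal_width_bins(ages, n_bins=3)
print(normalized)
print(binned)
```

Normalization keeps the attribute numeric but puts all attributes on a common scale, while discretization replaces the numeric value with a small set of interval labels, which some classifiers handle more easily.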

Advanced Modeling in Weka

Naive Bayes Algorithm:

Correctly Classified Instances        1721               81.5253 %
Incorrectly Classified Instances       390               18.4747 %

    a    b   <-- classified as
 1599  228 |    a = 0
  162  122 |    b = 1

BayesNet Algorithm:

Correctly Classified Instances        1750               82.8991 %
Incorrectly Classified Instances       361               17.1009 %

    a    b   <-- classified as
 1622  205 |    a = 0
  156  128 |    b = 1

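SMO Algorithm:

Correctly Classified Instances        1827               86.5467 %
Incorrectly Classified Instances       284               13.4533 %

    a    b   <-- classified as
 1827    0 |    a = 0
  284    0 |    b = 1

Possible overfitting: the model predicts class 0 for every instance, so its accuracy equals the share of class 0 in the test set (1827 of 2111) and no class 1 instance is detected.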
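
The accuracy figures above follow directly from the confusion matrices, and per-class recall shows why the SMO accuracy is misleading. The following is a minimal Python sketch, not Weka code; the matrices are copied from the results above (rows are actual classes, columns are predicted classes), and the helper names are hypothetical.

```python
# Confusion matrices from the Weka output: matrix[actual][predicted].
matrices = {
    "NaiveBayes": [[1599, 228], [162, 122]],
    "BayesNet": [[1622, 205], [156, 128]],
    "SMO": [[1827, 0], [284, 0]],
}


def accuracy(cm):
    """Fraction of instances on the diagonal of the confusion matrix."""
    correct = cm[0][0] + cm[1][1]
    total = sum(sum(row) for row in cm)
    return correct / total


def recall(cm, cls):
    """Fraction of instances of class `cls` that were predicted as `cls`."""
    row_total = sum(cm[cls])
    return cm[cls][cls] / row_total if row_total else 0.0


for name, cm in matrices.items():
    print(f"{name:10s} accuracy={accuracy(cm):.4%}  "
          f"recall(class 1)={recall(cm, 1):.4f}")
```

SMO reaches the highest accuracy only because class 0 dominates the test set; its recall on class 1 is zero, whereas Naive Bayes and BayesNet each recover a little under half of the class 1 instances.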

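Among the three, BayesNet gives the lowest misclassification cost: it makes fewer errors than Naive Bayes (361 vs. 390) and, unlike SMO, which has fewer raw errors (284) only because it never predicts class 1, it correctly identifies 128 of the 284 class 1 instances.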
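
Which classifier has the lowest misclassification cost depends on how the two error types are weighted. The cost matrix used for this comparison is not given, so the sketch below assumes a hypothetical one: a false positive (actual 0, predicted 1) costs `fp_cost` and a false negative (actual 1, predicted 0) costs `fn_cost`. Treating class 1 as the positive class is also an assumption.

```python
# Confusion matrices from the Weka output: matrix[actual][predicted].
matrices = {
    "NaiveBayes": [[1599, 228], [162, 122]],
    "BayesNet": [[1622, 205], [156, 128]],
    "SMO": [[1827, 0], [284, 0]],
}


def total_cost(cm, fp_cost, fn_cost):
    """Total cost of all errors under the assumed (hypothetical) costs."""
    false_positives = cm[0][1]  # actual 0, predicted 1
    false_negatives = cm[1][0]  # actual 1, predicted 0
    return false_positives * fp_cost + false_negatives * fn_cost


def cheapest(fp_cost, fn_cost):
    """Name of the classifier with the lowest total cost."""
    return min(matrices,
               key=lambda name: total_cost(matrices[name], fp_cost, fn_cost))


# Equal costs versus a missed class 1 instance costing twice as much.
for fp_cost, fn_cost in [(1, 1), (1, 2)]:
    costs = {name: total_cost(cm, fp_cost, fn_cost)
             for name, cm in matrices.items()}
    print(fp_cost, fn_cost, costs, "->", cheapest(fp_cost, fn_cost))
```

With equal costs SMO has the lowest total simply because it makes the fewest errors. Under these assumptions BayesNet becomes the cheapest once a false negative costs more than about 1.6 times a false positive (205 + 156c < 284c), and it is cheaper than Naive Bayes for any positive costs.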