Data Analysis of Bird Strikes in the USA Using R

Description:

Association rules using the Apriori algorithm to find recurring patterns across sets of attributes such as airline company and height above sea level.
Partition clustering using the k-means algorithm to group similar cases into the same cluster.
Classification using decision tree induction to classify cases of aircraft casualties with a classifier built from training data whose case categories are already known.

Key Results:

Apriori Algorithm:

Cause of Damage
Wildlife Species: Turkey Vulture and Canada Goose
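
A minimal sketch of how rules of this kind could be mined in R, assuming the arules package is installed. The rows, the column names and the "Caused damage" / "No damage" labels below are invented for illustration and are not the project's data; the support and confidence thresholds are also arbitrary choices.

```r
# Toy data only: invented rows, not taken from the bird strike dataset.
library(arules)

strikes <- data.frame(
  Wildlife_Species = c("Canada goose", "Turkey vulture", "Mourning dove",
                       "Canada goose", "Mourning dove", "Turkey vulture",
                       "Mourning dove", "Canada goose"),
  Phase_of_Flight  = c("Climb", "Approach", "Landing Roll", "Climb",
                       "Approach", "Climb", "Landing Roll", "Approach"),
  Damage           = c("Caused damage", "Caused damage", "No damage",
                       "Caused damage", "No damage", "Caused damage",
                       "No damage", "Caused damage"),
  stringsAsFactors = TRUE
)

# Each row becomes one transaction; each column=value pair becomes one item.
trans <- as(strikes, "transactions")

# Keep only rules whose right-hand side is the damage outcome.
rules <- apriori(
  trans,
  parameter  = list(supp = 0.2, conf = 0.8, minlen = 2),
  appearance = list(rhs = "Damage=Caused damage", default = "lhs"),
  control    = list(verbose = FALSE)
)

# Show the rules, strongest lift first.
inspect(sort(rules, by = "lift"))
```

Restricting the right-hand side to the damage item keeps the output focused on what co-occurs with damage, which is how a species-level finding like the one above would surface.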

Logistic Regression:

Significant Variables
Airport_Name
Wildlife Species
Phase of Flight
Pilot Warned
Feet above ground
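
As a rough sketch of how variable significance can be read off a logistic regression in base R, the example below fits `glm` with a binomial family on simulated data. The data, the effect sizes, the 0/1 `Damage` column and the 0.05 cutoff are all assumptions made for illustration, and only a subset of the variables listed above is used.

```r
# Simulated data only: the relationship below is invented for illustration.
set.seed(42)
n <- 500
toy <- data.frame(
  Feet_Above_Ground = round(runif(n, 0, 5000)),
  Pilot_Warned      = factor(sample(c("N", "Y"), n, replace = TRUE)),
  Phase_of_Flight   = factor(sample(c("Approach", "Climb", "Take-off run"),
                                    n, replace = TRUE))
)

# Hypothetical log-odds of damage: higher with altitude, lower when warned.
log_odds   <- -2 + 0.0006 * toy$Feet_Above_Ground -
  0.8 * (toy$Pilot_Warned == "Y")
toy$Damage <- rbinom(n, size = 1, prob = plogis(log_odds))

# Logistic regression: binomial family with the default logit link.
fit <- glm(Damage ~ Feet_Above_Ground + Pilot_Warned + Phase_of_Flight,
           data = toy, family = binomial)

# Coefficient table; keep the rows with p-value below 0.05.
coefs <- summary(fit)$coefficients
coefs[coefs[, "Pr(>|z|)"] < 0.05, , drop = FALSE]
```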

Decision Tree:

We tried to predict whether a strike caused damage or no damage, using Effect_Indicated_Damage as the dependent variable.
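
A minimal sketch of decision tree induction for this kind of two-class target, assuming the rpart package (shipped with standard R installations). The data are simulated, the predictors and the 70/30 train/test split are assumptions, and the class labels are hypothetical stand-ins for the real values of Effect_Indicated_Damage.

```r
# Simulated data only: species effects below are invented for illustration.
library(rpart)

set.seed(1)
n <- 400
toy <- data.frame(
  Feet_Above_Ground = round(runif(n, 0, 5000)),
  Wildlife_Species  = factor(sample(
    c("Canada goose", "Turkey vulture", "Mourning dove"), n, replace = TRUE))
)

# Hypothetical rule: the two larger birds cause damage more often.
large_bird <- toy$Wildlife_Species %in% c("Canada goose", "Turkey vulture")
p_damage   <- ifelse(large_bird, 0.7, 0.1)
toy$Effect_Indicated_Damage <- factor(
  ifelse(runif(n) < p_damage, "Caused damage", "No damage"))

# Hold out 30% of the rows to check the tree on unseen cases.
train_idx <- sample(n, size = round(0.7 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Grow a classification tree on the training rows.
tree <- rpart(Effect_Indicated_Damage ~ Feet_Above_Ground + Wildlife_Species,
              data = train, method = "class")

# Predict classes for the held-out rows and tabulate against the truth.
pred      <- predict(tree, newdata = test, type = "class")
confusion <- table(actual = test$Effect_Indicated_Damage, predicted = pred)
confusion
```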

Clustering:

We used k-means partition clustering with 4 clusters. Instances within a cluster are similar to each other and different from instances in other clusters, based on the “Feet_Above_Ground” and “Wildlife_Species” variables.
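
A rough sketch of a 4-cluster k-means run in base R on simulated data. Because k-means works on numeric distances, the example one-hot encodes Wildlife_Species and standardises all columns; that encoding is an assumption of this sketch, not necessarily how the species variable was handled in the project.

```r
# Simulated data only: invented rows, not the bird strike dataset.
set.seed(7)
n <- 200
toy <- data.frame(
  Feet_Above_Ground = round(runif(n, 0, 5000)),
  Wildlife_Species  = factor(sample(
    c("Canada goose", "Turkey vulture", "Mourning dove"), n, replace = TRUE))
)

# Assumed encoding: one indicator column per species, no intercept column.
x <- model.matrix(~ Feet_Above_Ground + Wildlife_Species - 1, data = toy)

# Standardise so altitude (thousands of feet) does not dominate the 0/1 columns.
x <- scale(x)

# k-means with 4 clusters; several random starts reduce the risk of a poor fit.
km <- kmeans(x, centers = 4, nstart = 25)

# Cluster sizes, then mean altitude per cluster on the original scale.
table(km$cluster)
aggregate(Feet_Above_Ground ~ cluster,
          data = cbind(toy, cluster = km$cluster), FUN = mean)
```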

Tools used: RStudio, MS Excel 2013

Collaborators: Rohan Ashok Patil, Valay Raval, Ishan Dindorkar, Aditi Saluja, Divya Vanacharla
