Heart Analysis: Data Science Project
For our project, we hypothesized that people with high cholesterol have a greater chance of having a heart attack. A heart attack occurs when a blockage prevents blood and oxygen from reaching the heart. People with higher cholesterol levels are more at risk because high cholesterol can lead to fatty deposits in the blood vessels, which can then cause blood clots. Using our data, we wanted to test whether cholesterol, chest pain, and age are correlated, and to measure how accurately a heart attack can be predicted from these three variables. One of the most common warning signs of a heart attack is pain or discomfort in the center of the chest, so we also wanted to see whether cholesterol levels and chest pain are related.
GitHub:
https://github.com/elikemk/Heart-Attack-Data-Predict/tree/main
Chart 1 shows the confusion matrix for our testing data, where H A stands for heart attack. The top row covers the people who did not have a heart attack: 29 + 6 = 35 test subjects. Of these 35, 29 (83%) were correctly classified. In the bottom row, 28 of the 41 people who had a heart attack (68%) were correctly classified as having one. After seeing this, we pruned our classification tree to improve its accuracy.
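The percentages above can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the project repository: the variable names are made up for illustration, the top row is taken to be the 35 people without a heart attack and the bottom row the 41 people with one (as in the later comparison), and the 13 missed heart attacks are derived as 41 - 28 rather than read from the chart.

```python
# Counts reported for the unpruned tree on the testing data.
no_ha_correct = 29            # no heart attack, classified correctly
no_ha_wrong = 6               # no heart attack, classified incorrectly
ha_correct = 28               # heart attack, classified correctly
ha_total = 41                 # people who had a heart attack
ha_wrong = ha_total - ha_correct  # derived (13), not read from the chart

no_ha_total = no_ha_correct + no_ha_wrong  # 35

# Per-class rates: share of each actual class that was classified correctly.
no_ha_rate = no_ha_correct / no_ha_total
ha_rate = ha_correct / ha_total

# Overall accuracy: all correct classifications over all test subjects.
overall_accuracy = (no_ha_correct + ha_correct) / (no_ha_total + ha_total)

print(f"No heart attack, correct: {no_ha_rate:.1%}")   # 82.9%
print(f"Heart attack, correct:    {ha_rate:.1%}")      # 68.3%
print(f"Overall accuracy:         {overall_accuracy:.1%}")  # 75.0%
```

The overall figure of 75% is the baseline that the pruned tree is compared against.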
Pruning the tree improved accuracy by about 5 percentage points. By plotting candidate values of alpha, we found the best fit for our decision tree. In Chart 2, 85% of the 35 people who did not have a heart attack were correctly classified as not having one, and 75% of the 41 people who had a heart attack were correctly classified as having one.
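One common way to choose alpha for a decision tree is cost-complexity pruning in scikit-learn. The sketch below assumes that approach, which the text does not confirm. It uses synthetic stand-in data with three features rather than the project's heart dataset, and the split sizes and random seeds are arbitrary choices, so the numbers it prints will not match the charts.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with three features (NOT the project's dataset).
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Candidate alphas come from the pruning path of a fully grown tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train)
alphas = np.clip(path.ccp_alphas, 0.0, None)  # guard against tiny negative round-off

# Fit one tree per alpha and record its accuracy on the held-out split.
scores = []
for alpha in alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    scores.append(tree.score(X_val, y_val))

# The smallest alpha is effectively the unpruned tree, so scores[0] is the baseline.
best_index = int(np.argmax(scores))
best_alpha = float(alphas[best_index])
best_score = scores[best_index]

print(f"unpruned accuracy: {scores[0]:.3f}")
print(f"best alpha: {best_alpha:.5f}, pruned accuracy: {best_score:.3f}")
```

Plotting `scores` against `alphas` (for example with matplotlib) gives the kind of curve from which an alpha can be read off. Picking alpha on a held-out split or by cross-validation, rather than on the final testing data, keeps the reported test accuracy honest.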