Continuing from my previous blog, I analyzed the latest sheet, which includes latitude and longitude data. Looking through the data, I found that some information is missing: there are 5,362 empty cells, which is almost 4% of all the data. The “race” column has the most missing values, with 1,517 empty cells. Other columns such as “flee”, “latitude”, and “longitude” also have many missing values. This could make analyzing the data or building predictions from it trickier, so we may need to fill in the gaps carefully.
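As a rough illustration of how missing-value counts like these can be produced, here is a minimal sketch that assumes the sheet has been loaded into a pandas DataFrame. The helper name `missing_summary` and the tiny sample rows are made up for the example; only the column names “race”, “flee”, “latitude”, and “longitude” come from the data described above.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count missing cells per column, sorted from most to fewest."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    return summary.sort_values("missing", ascending=False)


# Made-up sample rows, only to show the shape of the result.
sample = pd.DataFrame({
    "race": ["W", None, "B", None],
    "flee": ["Not fleeing", "Car", None, "Not fleeing"],
    "latitude": [34.05, None, 41.88, 29.76],
    "longitude": [-118.24, None, -87.63, -95.37],
})

summary = missing_summary(sample)

# Overall share of empty cells across the whole sheet.
total_missing = int(sample.isna().sum().sum())
share_missing = total_missing / sample.size * 100

print(summary)
print(f"{total_missing} empty cells ({share_missing:.2f}% of all cells)")
```

On the real sheet, the same two numbers (total empty cells and their share of all cells) would correspond to the 5,362 and roughly 4% figures, provided empty cells are read in as missing values.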
In this data, people’s ages range from 2 to 92 years, with a mean age of 37.29 years. The latitude and longitude values show that events happened all over the U.S.; a few points fall outside the country and should be removed. The data covers 2,963 distinct days, and the day with the most events was February 1, 2018, with 9. For the type of threat, “shoot” was the most common value, appearing in 2,461 incidents. As for whether people were fleeing during the incidents, in 4,703 cases they were not. Lastly, a gun was involved in 5,082 incidents.
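The summary figures and the clean-up of out-of-country points could be sketched as follows, again assuming a pandas DataFrame. The column names `date`, `age`, and `threat_type`, the helper `drop_points_outside_us`, and the sample rows are hypothetical. The bounding box is a deliberately rough assumption that only catches points far from the U.S.; it is not an exact border check.

```python
import pandas as pd

# Rough bounding box around the 50 states (an assumption, not an exact border).
LAT_MIN, LAT_MAX = 18.0, 72.0
LON_MIN, LON_MAX = -180.0, -66.0


def drop_points_outside_us(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows whose coordinates fall outside the box; keep rows with missing coordinates."""
    in_box = (
        df["latitude"].between(LAT_MIN, LAT_MAX)
        & df["longitude"].between(LON_MIN, LON_MAX)
    )
    no_coords = df["latitude"].isna() | df["longitude"].isna()
    return df[in_box | no_coords]


# Made-up sample rows; the third point is in London and should be dropped.
sample = pd.DataFrame({
    "date": ["2018-02-01", "2018-02-01", "2018-03-15", "2018-04-02"],
    "age": [25, 40, 31, 52],
    "threat_type": ["shoot", "shoot", "threat", "shoot"],
    "latitude": [34.05, 41.88, 51.51, 29.76],
    "longitude": [-118.24, -87.63, -0.13, -95.37],
})

cleaned = drop_points_outside_us(sample)

# Age range and mean.
age_min, age_max, age_mean = cleaned["age"].min(), cleaned["age"].max(), cleaned["age"].mean()

# Day with the most events.
day_counts = pd.to_datetime(cleaned["date"]).value_counts()
busiest_day, busiest_count = day_counts.idxmax(), int(day_counts.max())

# Most common threat type.
threat_counts = cleaned["threat_type"].value_counts()
top_threat, top_threat_count = threat_counts.idxmax(), int(threat_counts.max())

print(age_min, age_max, age_mean)
print(busiest_day.date(), busiest_count)
print(top_threat, top_threat_count)
```

Rows with missing coordinates are kept here on purpose, so that only points clearly outside the box are removed; whether to drop or fill those rows is a separate decision.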