Using Data-Mining Techniques for the Prediction of the Severity of Road Crashes in Cartagena, Colombia


Authors:
Resource type:
Publication date:
2019
Institution:
Universidad Tecnológica de Bolívar
Repository:
Repositorio Institucional UTB
Language:
eng
OAI Identifier:
oai:repositorio.utb.edu.co:20.500.12585/9195
Online access:
https://hdl.handle.net/20.500.12585/9195
Keywords:
Data mining
Prediction
Road crashes
Severity
Decision trees
Forecasting
Highway accidents
Motor transportation
Motorcycles
Roads and streets
Support vector machines
Area under the ROC curve
Classification algorithm
Knowledge analysis
Machine learning techniques
Multi layer perceptron
Road crash
Support vector machine (SVMs)
Rights
restrictedAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
Description
Summary: Objective: To analyze road crashes in Cartagena (Colombia) and the factors associated with collision and severity, with the aim of establishing a set of rules for defining countermeasures to improve road safety. Methods: Data mining and machine learning techniques were applied to 7894 traffic accidents recorded from 2016 to 2017. Severity was classified as low (84%) or high (16%). Five classification algorithms were applied to predict accident severity using WEKA (Waikato Environment for Knowledge Analysis): Decision Tree (DT-J48), Rule Induction (PART), Support Vector Machines (SVMs), Naïve Bayes (NB), and Multilayer Perceptron (MLP). The effectiveness of each algorithm was evaluated using 10-fold cross-validation. Decision rules were defined from the results of the different methods. Results: The methods applied give consistent and similar overall results for precision, accuracy, recall, and area under the ROC curve. Conclusions: Twelve decision rules were defined based on the methods applied. The rules identify motorcyclists, cyclists, and pedestrians as the most vulnerable road users. Male and female motorcyclists aged 20–39 years are prone to high-severity accidents. When no motorcycle or cyclist is involved in the accident, the probable severity is low. © 2019, Springer Nature Switzerland AG.
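
The evaluation procedure named in the summary (10-fold cross-validation scored with accuracy, precision, and recall) can be illustrated with a minimal standard-library sketch. This is not the authors' WEKA workflow: the labels are synthetic and mirror only the reported 84%/16% low/high split, the classifier is a trivial majority-class baseline, and the names `stratified_k_folds`, `binary_metrics`, and `cross_validate_majority_baseline` are hypothetical helpers introduced for illustration.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=10, seed=0):
    """Split indices into k folds that keep each class's share roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a similar class mix.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred, positive="high"):
    """Accuracy, precision, and recall with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        # Report 0.0 when the denominator is empty instead of dividing by zero.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def cross_validate_majority_baseline(labels, k=10):
    """Average per-fold metrics for a model that always predicts the majority class."""
    per_fold = []
    for test_idx in stratified_k_folds(labels, k):
        held_out = set(test_idx)
        train_labels = [lab for i, lab in enumerate(labels) if i not in held_out]
        majority = max(set(train_labels), key=train_labels.count)
        y_true = [labels[i] for i in test_idx]
        y_pred = [majority] * len(y_true)
        per_fold.append(binary_metrics(y_true, y_pred))
    return {name: sum(m[name] for m in per_fold) / k for name in per_fold[0]}


# Synthetic labels (assumption): 1000 records with the 84%/16% split from the summary.
labels = ["low"] * 840 + ["high"] * 160
scores = cross_validate_majority_baseline(labels)
print(scores)
```

On these synthetic labels the baseline scores about 0.84 accuracy while its recall for the high-severity class is 0, which suggests why a severity study with an imbalanced class split would report precision, recall, and area under the ROC curve alongside accuracy.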