Detection and tracking of motorcycles in urban environments by using video sequences with high level of oclussion

This thesis presents an investigation into detection, classification and tracking of occluded motorcycles from urban traffic scenes. The final aim is to develop an accurate system that allows automatic detection and tracking of motorcycles, which are the most frequent vulnerable users of urban traffic in e...


Authors:
Espinosa Oviedo, Jorge Ernesto
Resource type:
Doctoral thesis
Publication date:
2019
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa
OAI Identifier:
oai:repositorio.unal.edu.co:unal/76455
Online access:
https://repositorio.unal.edu.co/handle/unal/76455
http://bdigital.unal.edu.co/72867/
Keywords:
Object detection
motorcycle detection
motorcycle tracking
multiple object tracking
object detection under high level of occlusion
object tracking under high level of occlusion
Deep Learning, Convolutional Neural Networks
Faster R-CNN
Rights
openAccess
License
Attribution-NonCommercial 4.0 International
id UNACIONAL2_ad6e26a6ed967d3c93ea2d0dbf0d6cf3
oai_identifier_str oai:repositorio.unal.edu.co:unal/76455
network_acronym_str UNACIONAL2
network_name_str Universidad Nacional de Colombia
repository_id_str
dc.title.spa.fl_str_mv Detection and tracking of motorcycles in urban environments by using video sequences with high level of oclussion
dc.creator.fl_str_mv Espinosa Oviedo, Jorge Ernesto
dc.contributor.advisor.spa.fl_str_mv Velastín Carroza, Sergio Alejandro (Thesis advisor)
dc.contributor.author.spa.fl_str_mv Espinosa Oviedo, Jorge Ernesto
dc.contributor.spa.fl_str_mv Branch Bedoya, John William
dc.subject.proposal.spa.fl_str_mv Object detection
motorcycle detection
motorcycle tracking
multiple object tracking
object detection under high level of occlusion
object tracking under high level of occlusion
Deep Learning, Convolutional Neural Networks
Faster R-CNN
description This thesis investigates the detection, classification and tracking of occluded motorcycles in urban traffic scenes. The final aim is to develop an accurate system for the automatic detection and tracking of motorcycles, which are the most frequent vulnerable users of urban traffic in emerging countries. Operators of urban traffic surveillance systems could improve the monitoring of these users and even help reduce their high accident rate. Initially, a motorcycle classifier for urban scenarios is implemented using a pre-trained convolutional neural network (CNN) for feature extraction. Motorcycles and cars are classified by passing the features extracted by the CNN to an SVM. The strategy is evaluated on an urban traffic dataset, achieving 99.4% accuracy with three classes and 99.3% accuracy with five classes. Given the good classification results, we move to the detection and classification of vehicles in an urban dataset. A hybrid strategy, which combines GMM for object detection with a CNN for feature extraction and subsequent classification, is considered first. Then a two-stage detector, Faster R-CNN, is used for object detection and classification. The pre-trained Faster R-CNN model achieves an F1 score of 68%, outperforming the hybrid model, which achieves 58%. Based on the good results obtained by a two-stage detector such as Faster R-CNN, we propose EspiNet, a more compact network able to detect and classify motorcycles under high occlusion in congested urban traffic environments. The method detects and classifies motorcycles even under camera movement, overlapping objects and stationary objects. Due to the absence of annotated urban motorcycle datasets, we introduce a new dataset of 7500 and 10,000 annotated images, captured in real traffic scenes using a drone-mounted camera. The proposed model achieves an F1 score of 95.3% with an AP of 89.32%, surpassing state-of-the-art detectors trained end to end on the introduced Urban Motorbike Dataset (UMD). For benchmarking purposes, we compare with a single-stage detector, Yolo v3, and a two-stage detector, Faster R-CNN (VGG16 based). The proposed model is used to improve tracking in a Multiple Object Tracking (MOT) implementation based on a Markov Decision Process and in a deep learning MOT tracking mechanism. The detection results, which provide high-confidence hypotheses, improve the tracking processes, achieving a Multiple Object Tracking Accuracy (MOTA) of 86.1% and 87.6% respectively, surpassing the state-of-the-art results reported in tracking benchmarks such as the one based on the KITTI dataset. The thesis concludes with a critical analysis of the presented work and a general outlook for future research.
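For readers unfamiliar with the two headline metrics in the abstract, the following is a minimal sketch of how an F1 score and a MOTA value are conventionally computed from raw counts. It uses the standard textbook definitions, not code from the thesis; the function names and all counts shown are hypothetical and are not results from the thesis.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 score: harmonic mean of precision and recall (standard definition)."""
    detected = true_pos + false_pos      # everything the detector reported
    actual = true_pos + false_neg        # everything that was really there
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def mota(false_neg: int, false_pos: int, id_switches: int, ground_truth: int) -> float:
    """MOTA = 1 - (misses + false positives + identity switches) / ground-truth objects.

    All counts are summed over every frame of the sequence.
    """
    if ground_truth <= 0:
        raise ValueError("ground_truth must be a positive object count")
    return 1.0 - (false_neg + false_pos + id_switches) / ground_truth


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(f"F1   = {f1_score(true_pos=90, false_pos=10, false_neg=10):.3f}")
    print(f"MOTA = {mota(false_neg=10, false_pos=5, id_switches=5, ground_truth=100):.3f}")
```

Note that MOTA can be negative when the total number of errors exceeds the number of ground-truth objects, whereas F1 always lies between 0 and 1.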
publishDate 2019
dc.date.issued.spa.fl_str_mv 2019-07-05
dc.date.accessioned.spa.fl_str_mv 2020-03-30T06:20:17Z
dc.date.available.spa.fl_str_mv 2020-03-30T06:20:17Z
dc.type.spa.fl_str_mv Trabajo de grado - Doctorado
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/doctoralThesis
dc.type.version.spa.fl_str_mv info:eu-repo/semantics/acceptedVersion
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_db06
dc.type.content.spa.fl_str_mv Text
dc.type.redcol.spa.fl_str_mv http://purl.org/redcol/resource_type/TD
format http://purl.org/coar/resource_type/c_db06
status_str acceptedVersion
dc.identifier.uri.none.fl_str_mv https://repositorio.unal.edu.co/handle/unal/76455
dc.identifier.eprints.spa.fl_str_mv http://bdigital.unal.edu.co/72867/
dc.language.iso.spa.fl_str_mv spa
language spa
dc.relation.ispartof.spa.fl_str_mv Universidad Nacional de Colombia Sede Medellín Facultad de Minas Escuela de Sistemas
Escuela de Sistemas
dc.relation.haspart.spa.fl_str_mv 62 Ingeniería y operaciones afines / Engineering
dc.relation.references.spa.fl_str_mv Espinosa Oviedo, Jorge Ernesto (2019) Detection and tracking of motorcycles in urban environments by using video sequences with high level of oclussion. Doctorado thesis, Universidad Nacional de Colombia - Sede Medellín.
dc.rights.spa.fl_str_mv Derechos reservados - Universidad Nacional de Colombia
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.license.spa.fl_str_mv Atribución-NoComercial 4.0 Internacional
dc.rights.uri.spa.fl_str_mv http://creativecommons.org/licenses/by-nc/4.0/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.mimetype.spa.fl_str_mv application/pdf
institution Universidad Nacional de Colombia
bitstream.url.fl_str_mv https://repositorio.unal.edu.co/bitstream/unal/76455/1/93390022.2019.pdf
https://repositorio.unal.edu.co/bitstream/unal/76455/2/93390022.2019.pdf.jpg
bitstream.checksum.fl_str_mv e44f7f55743892ca52d052d9ab55a8b6
31642d27fe5a85d7b57a59ef9d784dec
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositorio Institucional Universidad Nacional de Colombia
repository.mail.fl_str_mv repositorio_nal@unal.edu.co
_version_ 1814089347513712640
spelling Resumen (translated from Spanish): This thesis presents research on the detection, classification and tracking of motorcycles in urban traffic-jam scenarios with a high level of occlusion. The goal is to develop an accurate system for the automatic detection and tracking of motorcycles, which turn out to be the most vulnerable users, constantly exposed to accidents in urban traffic in emerging countries. Operators of urban traffic surveillance systems could improve the monitoring of these users and even prevent the high accident rate they present. Initially, a motorcycle classifier for urban scenarios is implemented using a pre-trained convolutional neural network model for feature extraction. This model classifies motorcycles, cars and the urban environment using the features extracted from the CNN and evaluated by a support vector machine (SVM). The strategy is evaluated on an urban traffic dataset, achieving 99.4% accuracy with a dataset of three classes and 99.3% accuracy when the dataset is extended to five classes. Given the good classification results, we then focus on the detection and classification of vehicles in an urban dataset. First, a hybrid strategy is compared that combines GMM for object detection with a CNN for feature extraction, evaluated for subsequent classification. Second, a two-stage detector called Faster R-CNN is used for object detection and classification. The pre-trained Faster R-CNN model reaches an F1 score of 68%, outperforming the hybrid model, which reaches only 58%. Based on the good results obtained by the two-stage detector (Faster R-CNN), we develop "EspiNet", a compact network able to detect and classify motorcycles in images with a high level of occlusion in congested urban traffic environments. The method detects and classifies motorcycles even in images captured with camera movement, overlapping objects and stationary objects. Because, at the date of this research, there are no properly annotated datasets of motorcycles in urban environments, we present a new set of 7500 and 10,000 images, captured in real urban traffic scenes using a drone-mounted camera and properly annotated to generate ground truth. Applied to this dataset, EspiNet reaches an F1 score of 95.3% with an Average Precision (AP) of 89.32%. This model surpasses the results of state-of-the-art detectors that were trained entirely for this research using the dataset mentioned. As a reference, two examples of state-of-the-art detectors are used: a single-stage detector, Yolo v3, and a two-stage detector, Faster R-CNN (VGG16 based). Finally, the proposed model is used to improve tracking in a Multiple Object Tracking implementation based on a Markov Decision Process and in a deep learning based MOT implementation. The detection results, with a high-confidence hypothesis (provided by EspiNet), notably improve the tracking process, achieving a Multiple Object Tracking Accuracy (MOTA) of 86.1% and 87.6% respectively, surpassing state-of-the-art results such as those reported in the tracking benchmark that uses the KITTI dataset. The thesis concludes with a critical analysis of the presented work and a general outlook for future research.
ORIGINAL 93390022.2019.pdf Tesis de Doctorado en Ingeniería - Sistemas application/pdf 13635491 https://repositorio.unal.edu.co/bitstream/unal/76455/1/93390022.2019.pdf e44f7f55743892ca52d052d9ab55a8b6 MD5 1
THUMBNAIL 93390022.2019.pdf.jpg Generated Thumbnail image/jpeg 4739 https://repositorio.unal.edu.co/bitstream/unal/76455/2/93390022.2019.pdf.jpg 31642d27fe5a85d7b57a59ef9d784dec MD5 2
unal/76455 oai:repositorio.unal.edu.co:unal/76455 2024-07-12 23:33:00.285