Exploration of a ViT-based multimodal approach to Vehicle Accident Detection
Multimodal Deep Learning (MMDL) has emerged as a potent framework for synthesizing information from diverse data sources, enhancing the capability of models to understand and predict complex phenomena. Particularly, Vision Transformers (ViT) have shown promising results in processing visual data alo...
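The full abstract (in the record's description field) proposes fusing video, audio, and metadata through a ViT. As a rough illustration of what a ViT-style multimodal input can look like, the minimal Python sketch below splits one video frame into patch tokens and appends one audio token and one metadata token to form a single sequence. Everything in it is an assumption made for illustration: this record does not describe the thesis's actual architecture. The names `patchify`, `pad_token`, and `build_token_sequence` are hypothetical, as are the toy feature vectors, and the zero-padding stands in for learned linear projections.

```python
from typing import List, Sequence

Token = List[float]


def patchify(frame: Sequence[Sequence[float]], patch: int) -> List[Token]:
    """Split a 2-D grayscale frame into non-overlapping patch x patch tiles
    and flatten each tile into one token (the ViT-style patch step)."""
    height, width = len(frame), len(frame[0])
    if height % patch or width % patch:
        raise ValueError("frame height and width must be multiples of the patch size")
    tokens: List[Token] = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                float(frame[row][col])
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ])
    return tokens


def pad_token(values: Sequence[float], dim: int) -> Token:
    """Zero-pad a feature vector to the shared token width.
    Assumption: a real model would use a learned projection instead."""
    if len(values) > dim:
        raise ValueError("token is wider than the shared dimension")
    return [float(v) for v in values] + [0.0] * (dim - len(values))


def build_token_sequence(
    frame: Sequence[Sequence[float]],
    audio_features: Sequence[float],
    metadata: Sequence[float],
    patch: int = 2,
) -> List[Token]:
    """Concatenate video patch tokens, one audio token, and one metadata token
    into a single sequence of equal-width tokens."""
    video_tokens = patchify(frame, patch)
    dim = max(patch * patch, len(audio_features), len(metadata))
    sequence = [pad_token(token, dim) for token in video_tokens]
    sequence.append(pad_token(audio_features, dim))
    sequence.append(pad_token(metadata, dim))
    return sequence
```

In a complete model, a sequence like this would be passed to a Transformer encoder and a classification head. Those steps are omitted here.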
- Authors:
- Ríos Pérez, Jesús David
- Resource type:
- https://vocabularies.coar-repositories.org/resource_types/c_7a1f/
- Publication date:
- 2024
- Institution:
- Universidad del Magdalena
- Repository:
- Repositorio Unimagdalena
- Language:
- eng
- OAI Identifier:
- oai:repositorio.unimagdalena.edu.co:123456789/21215
- Online access:
- https://repositorio.unimagdalena.edu.co/handle/123456789/21215
- Keywords:
- Multimodal, Machine Learning, Data Fusion, Deep Learning.
Multimodalidad, Aprendizaje de máquinas, Fusión de datos, Aprendizaje profundo.
- Rights:
- openAccess
- License:
- Open Access
| Field | Value |
|---|---|
| id | UNIMAGDALE_c117eeb271af41a8cc250b22f92c3520 |
| oai_identifier_str | oai:repositorio.unimagdalena.edu.co:123456789/21215 |
| network_acronym_str | UNIMAGDALE |
| network_name_str | Repositorio Unimagdalena |
| repository_id_str | |
| dc.title.spa.fl_str_mv | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| dc.title.alternative.none.fl_str_mv | Exploración de un enfoque multimodal basado en ViT para la Detección de Accidentes Vehiculares |
| title | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| spellingShingle | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection Multimodal, Machine Learning, Data Fusion, Deep Learning. Multimodalidad, Aprendizaje de máquinas, Fusión de datos, Aprendizaje profundo. |
| title_short | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| title_full | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| title_fullStr | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| title_full_unstemmed | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| title_sort | Exploration of a ViT-based multimodal approach to Vehicle Accident Detection |
| dc.creator.fl_str_mv | Ríos Pérez, Jesús David |
| dc.contributor.advisor.none.fl_str_mv | Sánchez Torres, Germán; Henriquez Miranda, Carlos Nelson |
| dc.contributor.author.none.fl_str_mv | Ríos Pérez, Jesús David |
| dc.contributor.sponsor.spa.fl_str_mv | Grupo de investigación y Desarrollo en Sistemas y Computación (GIDSYC) |
| dc.subject.proposal.spa.fl_str_mv | Multimodal, Machine Learning, Data Fusion, Deep Learning. Multimodalidad, Aprendizaje de máquinas, Fusión de datos, Aprendizaje profundo. |
| topic | Multimodal, Machine Learning, Data Fusion, Deep Learning. Multimodalidad, Aprendizaje de máquinas, Fusión de datos, Aprendizaje profundo. |
| description | Multimodal Deep Learning (MMDL) has emerged as a potent framework for synthesizing information from diverse data sources, enhancing the capability of models to understand and predict complex phenomena. In particular, Vision Transformers (ViT) have shown promising results in processing visual data alongside other modalities for comprehensive analysis. This study aims to investigate the integration of MMDL and ViT in the context of traffic accident detection, addressing the critical need for advanced predictive models in this domain. Through a literature review, we assess the current landscape of MMDL applications and highlight the evolution and challenges of multimodal learning. Building on these insights, we propose a novel MMDL architecture designed to leverage video, audio, and metadata for accurate and timely accident detection. Our methodology combines a structured review of recent MMDL research with a theoretical approach to architecture design, emphasizing the fusion of multimodal data through ViT. The review adheres to established guidelines for systematic reviews, focusing on advancements from 2019 to 2023, while the architecture design is grounded in a thorough analysis of modalities relevant to traffic incidents. The main contributions include a taxonomy of MMDL methods and a ViT-based architecture for enhancing traffic safety systems. Integrating multimodal data through advanced deep learning models can improve the prediction accuracy of traffic accident detection. This research underscores the potential of MMDL and ViT in developing robust, real-time monitoring systems, marking a step forward in the application of artificial intelligence for public safety and smart city initiatives. |
| publishDate | 2024 |
| dc.date.accessioned.none.fl_str_mv | 2024-07-11T13:43:30Z |
| dc.date.available.none.fl_str_mv | 2024-07-11T13:43:30Z |
| dc.date.issued.none.fl_str_mv | 2024 |
| dc.date.submitted.none.fl_str_mv | 2024 |
| dc.type.spa.fl_str_mv | bachelorThesis |
| dc.type.coar.fl_str_mv | http://purl.org/coar/resource_type/c_7a1f |
| dc.type.coar.none.fl_str_mv | https://vocabularies.coar-repositories.org/resource_types/c_7a1f/ |
| dc.type.driver.none.fl_str_mv | info:eu-repo/semantics/bachelorThesis |
| dc.type.local.spa.fl_str_mv | Trabajo de Grado de Pregrado |
| format | https://vocabularies.coar-repositories.org/resource_types/c_7a1f/ |
| dc.identifier.uri.none.fl_str_mv | https://repositorio.unimagdalena.edu.co/handle/123456789/21215 |
| url | https://repositorio.unimagdalena.edu.co/handle/123456789/21215 |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.rights.none.fl_str_mv | Acceso Abierto; info:eu-repo/semantics/openAccess |
| dc.rights.coar.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
| dc.rights.cc.spa.fl_str_mv | Acceso Abierto |
| dc.rights.accessrights.none.fl_str_mv | info:eu-repo/semantics/openAccess |
| dc.rights.creativecommons.none.fl_str_mv | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
| dc.rights.creativecommons.spa.fl_str_mv | atribucionnocomercialcompartir |
| rights_invalid_str_mv | Acceso Abierto; https://creativecommons.org/licenses/by-nc-sa/4.0/; atribucionnocomercialcompartir; http://purl.org/coar/access_right/c_abf2 |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | text |
| dc.publisher.none.fl_str_mv | Universidad del Magdalena |
| dc.publisher.spa.fl_str_mv | Universidad del Magdalena |
| dc.publisher.department.spa.fl_str_mv | Facultad de Ingeniería |
| dc.publisher.program.spa.fl_str_mv | Ingeniería de Sistemas |
| dc.publisher.place.spa.fl_str_mv | Santa Marta |
| publisher.none.fl_str_mv | Universidad del Magdalena |
| institution | Universidad del Magdalena |
| bitstream.url.fl_str_mv | http://localhost:4000/bitstreams/22c6baa0-56cd-4bb7-a9cc-d52044820d9c/download; http://localhost:4000/bitstreams/89856b75-2e7a-4d0e-aa14-9eb0039fb813/download; http://localhost:4000/bitstreams/72e9e210-fbac-4e2d-9c39-e529cb3f88ff/download |
| bitstream.checksum.fl_str_mv | 6742e99f1bfa1457ec7e607813b4ddee; 55a0e8f56af35c7d44385ed7d87efd81; 03de826a7ba30b30f95ba9233c6ed790 |
| bitstream.checksumAlgorithm.fl_str_mv | MD5; MD5; MD5 |
| repository.name.fl_str_mv | DSpace Started with Docker Compose |
| repository.mail.fl_str_mv | dspace-help@myu.edu |
| _version_ | 1855544909342179328 |
| spelling | Sánchez Torres, Germán; Henriquez Miranda, Carlos Nelson; Ríos Pérez, Jesús David; Ingeniero (a) de Sistemas; Grupo de investigación y Desarrollo en Sistemas y Computación (GIDSYC); 2024-07-11T13:43:30Z; 2024-07-11T13:43:30Z; 2024; 2024; https://repositorio.unimagdalena.edu.co/handle/123456789/21215; text; Universidad del Magdalena; Universidad del Magdalena; Facultad de Ingeniería; Ingeniería de Sistemas; Santa Marta; Acceso Abierto; info:eu-repo/semantics/openAccess; Acceso Abierto; info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/4.0/; atribucionnocomercialcompartir; http://purl.org/coar/access_right/c_abf2; Exploration of a ViT-based multimodal approach to Vehicle Accident Detection; Exploración de un enfoque multimodal basado en ViT para la Detección de Accidentes Vehiculares; bachelorThesis; https://vocabularies.coar-repositories.org/resource_types/c_7a1f/; http://purl.org/coar/resource_type/c_7a1f; info:eu-repo/semantics/bachelorThesis; Trabajo de Grado de Pregrado; Multimodal, Machine Learning, Data Fusion, Deep Learning.; Multimodalidad, Aprendizaje de máquinas, Fusión de datos, Aprendizaje profundo.; eng; Pregrado; ORIGINAL; Exploration of a ViT-based multimodal approach.pdf; Exploration of a ViT-based multimodal approach.pdf; application/pdf; 1872021; http://localhost:4000/bitstreams/22c6baa0-56cd-4bb7-a9cc-d52044820d9c/download; 6742e99f1bfa1457ec7e607813b4ddee; MD5; 1; true; Anonymous; READ; BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado jesus.pdf; BI_F12_Formato_Licencia_Publicacion_Trabajos_Grado jesus.pdf; application/pdf; 549657; http://localhost:4000/bitstreams/89856b75-2e7a-4d0e-aa14-9eb0039fb813/download; 55a0e8f56af35c7d44385ed7d87efd81; MD5; 3; false; Anonymous; READ; LICENSE; license.txt; license.txt; text/plain; charset=utf-8; 2484; http://localhost:4000/bitstreams/72e9e210-fbac-4e2d-9c39-e529cb3f88ff/download; 03de826a7ba30b30f95ba9233c6ed790; MD5; 2; false; Anonymous; READ; 123456789/21215; oai:localhost:123456789/21215; 2024-11-22 07:27:57.468; open.access; http://localhost:4000; DSpace Started with Docker Compose; dspace-help@myu.edu |
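
The record lists an MD5 checksum for each bitstream, so a downloaded file can be checked against it. The following is a minimal sketch using Python's standard `hashlib`. The function names `md5_of_file` and `matches_record` and the local file path in the usage note are hypothetical. Which URL corresponds to which checksum is assumed from the order in which the values are listed.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks
    so that large PDFs do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path: str, expected_md5: str) -> bool:
    """Compare a file's MD5 digest with the checksum listed in the record,
    ignoring surrounding whitespace and letter case."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, assuming the first listed bitstream was saved locally as `thesis.pdf` (a hypothetical path), `matches_record("thesis.pdf", "6742e99f1bfa1457ec7e607813b4ddee")` would report whether the download matches the record. MD5 is adequate for detecting accidental corruption but is not a defense against deliberate tampering.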
