Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural
illustrations, graphs
- Authors:
- López Solano, Juan Camilo
- Resource type:
- Trabajo de grado - Maestría
- Publication date:
- 2022
- Institution:
- Universidad Nacional de Colombia
- Repository:
- Universidad Nacional de Colombia
- Language:
- spa
- OAI Identifier:
- oai:repositorio.unal.edu.co:unal/81747
- Keywords:
- 000 - Ciencias de la computación, información y obras generales::005 - Programación, programas, datos de computación
social engineering
ingeniería social
Cybersecurity
Social Engineering
Natural Language Processing
Machine Learning
Ciberseguridad
Ingeniería Social
Procesamiento de Lenguaje Natural
Aprendizaje de máquina
Modelo de simulación
Simulation models
- Rights
- openAccess
- License
- Reconocimiento 4.0 Internacional
id |
UNACIONAL2_d20dbaac9897a827720706f9ef234cde |
---|---|
oai_identifier_str |
oai:repositorio.unal.edu.co:unal/81747 |
network_acronym_str |
UNACIONAL2 |
network_name_str |
Universidad Nacional de Colombia |
repository_id_str |
|
dc.title.spa.fl_str_mv |
Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural |
dc.title.translated.eng.fl_str_mv |
Implementation of computational model for social engineering detection based on machine learning and natural language processing |
title |
Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural |
dc.creator.fl_str_mv |
López Solano, Juan Camilo |
dc.contributor.advisor.none.fl_str_mv |
Camargo Mendoza, Jorge Eliecer |
dc.contributor.author.none.fl_str_mv |
López Solano, Juan Camilo |
dc.contributor.researchgroup.spa.fl_str_mv |
Unsecurelab Cybersecurity Research Group |
dc.subject.ddc.spa.fl_str_mv |
000 - Ciencias de la computación, información y obras generales::005 - Programación, programas, datos de computación |
dc.subject.other.eng.fl_str_mv |
social engineering |
dc.subject.other.spa.fl_str_mv |
ingeniería social |
dc.subject.proposal.eng.fl_str_mv |
Cybersecurity Social Engineering Natural Language Processing Machine Learning |
dc.subject.proposal.spa.fl_str_mv |
Ciberseguridad Ingeniería Social Procesamiento de Lenguaje Natural Aprendizaje de máquina |
dc.subject.unesco.spa.fl_str_mv |
Modelo de simulación |
dc.subject.unesco.eng.fl_str_mv |
Simulation models |
description |
illustrations, graphs |
publishDate |
2022 |
dc.date.accessioned.none.fl_str_mv |
2022-07-25T20:26:15Z |
dc.date.available.none.fl_str_mv |
2022-07-25T20:26:15Z |
dc.date.issued.none.fl_str_mv |
2022 |
dc.type.spa.fl_str_mv |
Trabajo de grado - Maestría |
dc.type.driver.spa.fl_str_mv |
info:eu-repo/semantics/masterThesis |
dc.type.version.spa.fl_str_mv |
info:eu-repo/semantics/acceptedVersion |
dc.type.content.spa.fl_str_mv |
Text |
dc.type.redcol.spa.fl_str_mv |
http://purl.org/redcol/resource_type/TM |
status_str |
acceptedVersion |
dc.identifier.uri.none.fl_str_mv |
https://repositorio.unal.edu.co/handle/unal/81747 |
dc.identifier.instname.spa.fl_str_mv |
Universidad Nacional de Colombia |
dc.identifier.reponame.spa.fl_str_mv |
Repositorio Institucional Universidad Nacional de Colombia |
dc.identifier.repourl.spa.fl_str_mv |
https://repositorio.unal.edu.co/ |
url |
https://repositorio.unal.edu.co/handle/unal/81747 https://repositorio.unal.edu.co/ |
identifier_str_mv |
Universidad Nacional de Colombia Repositorio Institucional Universidad Nacional de Colombia |
dc.language.iso.spa.fl_str_mv |
spa |
language |
spa |
dc.relation.indexed.spa.fl_str_mv |
RedCol LaReferencia |
dc.relation.references.spa.fl_str_mv |
Amat, J. (April 2017). Máquinas de Vector Soporte (Support Vector Machines, SVMs). https://www.cienciadedatos.net/documentos/34_maquinas_de_vector_soporte_support_vector_machines
Balim, C., & Gunal, E. S. (November 2019). Automatic Detection of Smishing Attacks by Machine Learning Methods. In 2019 1st International Informatics and Software Engineering Conference (UBMYK) (pp. 1-3). IEEE.
Bezuidenhout, M., Mouton, F., & Venter, H. S. (2010). Social engineering attack detection model: SEADM. Proceedings of the 2010 Information Security for South Africa Conference, ISSA 2010.
Bhakta, R., & Harris, I. G. (2015). Semantic analysis of dialogs to detect social engineering attacks. Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015).
Bhardwaj, T., Sharma, T. K., & Pandit, M. R. (2014). Social engineering prevention by detecting malicious URLs using artificial bee colony algorithm. 355-363.
Bueno, F. (2019). Redes neuronales: entrenamiento y comportamiento.
Cialdini, Robert. (1993). Influence: Science and Practice.
Coulombe, C. (2018). Text data augmentation made simple by leveraging nlp cloud apis. arXiv preprint arXiv:1812.04718.
Craigen, D., Diakun-Thibault, N., & Purse, R. (2014). Defining cybersecurity. Technology Innovation Management Review, 4(10).
Dan, A., & Gupta, S. (2019). Social engineering attack detection and data protection model (SEADDPM). In Advances in Intelligent Systems and Computing (Vol. 811, pp. 15-24). https://doi.org/10.1007/978-981-13-1544-2
Del Pozo, I. (2018). Social engineering: Application of psychology to information security. 2018 6th International Conference on Future Internet of Things and Cloud Workshops.
Denning, T., Lerner, A., Shostack, A., & Kohno, T. (2013, November). Control-Alt-Hack: the design and evaluation of a card game for computer security awareness and education. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security (pp. 915-928).
Feng, S. Y., Gangal, V., Wei, J., Chandar, S., Vosoughi, S., Mitamura, T., & Hovy, E. (2021). A survey of data augmentation approaches for nlp. arXiv preprint arXiv:2105.03075.
Proofpoint (2021). 2021 State of the Phish. An In-Depth Look at User Awareness, Vulnerability and Resilience. https://www.proofpoint.com/sites/default/files/threat-reports/pfpt-us-tr-state-of-the-phish-2021.pdf
Gatlan, S. (September 3, 2020). FBI: Thousands of orgs targeted by RDoS extortion campaign. BleepingComputer. https://www.bleepingcomputer.com/news/security/fbi-thousands-of-orgs-targeted-by-rdos-extortion-campaign/
Googletrans (January 11, 2022). Googletrans 3.0.0 documentation. https://py-googletrans.readthedocs.io/en/latest/
Gragg, D. (2003). A multi-level defense against social engineering. SANS Reading Room, 13, 15.
Gregar, J. (1994). Research Design (Qualitative, Quantitative and Mixed Methods Approaches). Book published by SAGE Publications, 228.
Hadnagy, C. (2010). Social Engineering: The Art of Human Hacking.
Hernández-Sampieri, R., & Torres, C. P. M. (2018). Metodología de la investigación (Vol. 4). México, D. F.: McGraw-Hill Interamericana.
Infoblox. (2020). Cyberthreat Intelligence Report. The Infoblox Q3 2020.
Ivaturi, K., & Janczewski, L. (June 2011). A taxonomy for social engineering attacks. In International Conference on Information Resources Management (pp. 1-12). Centre for Information Technology, Organizations, and People.
Janczewski, L., & Colarik, A. (Eds.). (2007). Cyber warfare and cyber terrorism. IGI Global.
Junger, M., Montoya, L., & Overink, F. J. (2017). Priming and warnings are not effective to prevent social engineering attacks. Computers in human behavior, 66, 75-87.
Kaspersky. (2022). ¿Qué es la ciberseguridad?. Retrieved January 2, 2022, from https://latam.kaspersky.com/resource-center/definitions/what-is-cyber-security
Khonji, M., Iraqi, Y., & Jones, A. (2013). Phishing detection: a literature survey. IEEE Communications Surveys & Tutorials, 15(4), 2091-2121.
Khorshed, M. T., Ali, A. S., & Wasimi, S. A. (2014). Combating Cyber Attacks in Cloud Systems Using Machine Learning. In Security, Privacy and Trust in Cloud Systems (pp. 407-431). Springer, Berlin, Heidelberg.
Krombholz, K., Hobel, H., Huber, M., & Weippl, E. (2015). Advanced social engineering attacks. Journal of Information Security and applications, 22, 113-122.
Lansley, M., Mouton, F., Kapetanakis, S., & Polatidis, N. (2020). SEADer++: social engineering attack detection in online environments using machine learning. Journal of Information and Telecommunication, 4(3), 346-362.
Lansley, M., Polatidis, N., & Kapetanakis, S. (September 2019). Seader: A social engineering attack detection method based on natural language processing and artificial neural networks. In International Conference on Computational Collective Intelligence (pp. 686-696). Springer, Cham.
Lansley, M., Polatidis, N., Kapetanakis, S., Amin, K., Samakovitis, G., & Petridis, M. (2019). Seen the villains: Detecting Social Engineering Attacks using Case-based Reasoning and Deep Learning. In ICCBR Workshops (pp. 39-48).
Long, J. (2011). No tech hacking: A guide to social engineering, dumpster diving, and shoulder surfing. Syngress.
Malwarefox. (2021). How to Spot Fake Facebook Profile. https://www.malwarefox.com/spot-fake-facebook-profile/
Matplotlib. (January 14, 2022). Matplotlib: Visualization with Python. https://matplotlib.org
López, J., & Camargo, J. (To be presented in March 2022). Social Engineering Detection Using Natural Language Processing and Machine Learning. The 5th International Conference on Information and Computer Technologies (ICICT), 2022.
Merino, R. F. M., & Chacón, C. I. Ñ. (2017). Bosques aleatorios como extensión de los árboles de clasificación con los programas R y Python. Interfases, (10), 165-189.
Mitnick, K. D., & Simon, W. L. (2003). The art of deception: Controlling the human element of security. John Wiley & Sons.
Mokhor, V. V, Tsurkan, O. V, Tsurkan, V. V, & Herasymov, R. P. (2017). Information security assessment of computer systems by socio-engineering approach. CEUR Workshop Proceedings, 2067, 92-98.
Mouton, F., Leenen, L., & Venter, H. S. (2016). Social Engineering Attack Detection Model: SEADMv2. Proceedings - 2015 International Conference on Cyberworlds, CW 2015, 216-223.
Mouton, F., Malan, M. M., Leenen, L., & Venter, H. S. (August 2014). Social engineering attack framework. In 2014 Information Security for South Africa (pp. 1-9). IEEE.
Mouton, F., Teixeira, M., & Meyer, T. (August 2017). Benchmarking a mobile implementation of the social engineering prevention training tool. In 2017 Information Security for South Africa (ISSA) (pp. 106-116). IEEE.
NLTK. (January 14, 2022). Documentation - Natural Language Toolkit. https://www.nltk.org
Numpy. (January 14, 2022). Numpy documentation. https://numpy.org/doc/stable/
Olabe, X. B. (1998). Redes neuronales artificiales y sus aplicaciones. Publicaciones de la Escuela de Ingenieros.
OWASP. (2021). OWASP Top 10 - 2021. https://owasp.org/Top10/
Python. (January 14, 2022). History and License. https://docs.python.org/3/license.html
Python. (January 14, 2022). os - Interfaces misceláneas del sistema operativo. https://docs.python.org/es/3.9/library/os.html?highlight=#module-os
Python. (January 14, 2022). re - Operaciones con expresiones regulares. https://docs.python.org/es/3.9/library/re.html
Python. (January 14, 2022). time - Tiempo de acceso y conversiones. https://docs.python.org/es/3.9/library/time.html?highlight=time#module-time
Sahingoz, O. K., Buber, E., Demir, O., & Diri, B. (2019). Machine learning based phishing detection from URLs. Expert Systems with Applications, 117, 345-357.
Sawa, Y., Bhakta, R., Harris, I. G., & Hadnagy, C. (2016). Detection of Social Engineering Attacks Through Natural Language Processing of Conversations. Proceedings - 2016 IEEE 10th International Conference on Semantic Computing, ICSC 2016, 262-265. https://doi.org/10.1109/ICSC.2016.95
Scikit-Learn (January 14, 2022). Inicio - scikit-learn - Machine Learning in Python. https://scikit-learn.org/dev/index.html
Scikit-Learn (January 15, 2022). 4.2. Permutation feature importance. https://scikit-learn.org/stable/modules/permutation_importance.html
Shorten, C., Khoshgoftaar, T. M., & Furht, B. (2021). Text data augmentation for deep learning. Journal of big Data, 8(1), 1-34.
Simmons, M., & Lee, J. S. (July 2020). Catfishing: A Look into Online Dating and Impersonation. In International Conference on Human-Computer Interaction (pp. 349-358). Springer, Cham.
SonicWall. (2021). SonicWall 2021 Cyber Threat Report.
Spacy. (January 11, 2022). Repositorio de código en Github de Spacy. https://github.com/explosion/spaCy
Srivalli, & Prasanna, L. (2019). Cyber attacks. International Journal of Engineering and Advanced Technology, 8(6 Special Issue 3), 1934-1936. https://doi.org/10.35940/ijeat.F1372.0986S319
Stajano, F., & Wilson, P. (2011). Understanding scam victims: Seven principles for systems security. Communications of the ACM, 54(3), 70-75. https://doi.org/10.1145/1897852.1897872
The Python Package Index. (January 14, 2022). Powerful data structures for data analysis, time series, and statistics - Pandas. https://pypi.org/project/pandas/
The Python Package Index. (January 14, 2022). Pure python spell checker based on work by Peter Norvig - Pyspellchecker. https://pypi.org/project/pyspellchecker/
The Python Package Index. (January 14, 2022). Python HTTP for Humans - Requests. https://pypi.org/project/requests/
TIOBE. (January 14, 2022). TIOBE Index for January 2022. https://www.tiobe.com/tiobe-index/
Tweepy. (January 8, 2022). Tweepy. https://www.tweepy.org
Wirth, R., & Hipp, J. (April 2000). CRISP-DM: Towards a standard process model for data mining. In Proceedings of the 4th international conference on the practical applications of knowledge discovery and data mining (Vol. 1, pp. 29-39). London, UK: Springer-Verlag. |
dc.rights.coar.fl_str_mv |
http://purl.org/coar/access_right/c_abf2 |
dc.rights.license.spa.fl_str_mv |
Reconocimiento 4.0 Internacional |
dc.rights.uri.spa.fl_str_mv |
http://creativecommons.org/licenses/by/4.0/ |
dc.rights.accessrights.spa.fl_str_mv |
info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
Reconocimiento 4.0 Internacional http://creativecommons.org/licenses/by/4.0/ http://purl.org/coar/access_right/c_abf2 |
eu_rights_str_mv |
openAccess |
dc.format.extent.spa.fl_str_mv |
xiv, 79 páginas |
dc.format.mimetype.spa.fl_str_mv |
application/pdf |
dc.publisher.spa.fl_str_mv |
Universidad Nacional de Colombia |
dc.publisher.program.spa.fl_str_mv |
Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación |
dc.publisher.department.spa.fl_str_mv |
Departamento de Ingeniería de Sistemas e Industrial |
dc.publisher.faculty.spa.fl_str_mv |
Facultad de Ingeniería |
dc.publisher.place.spa.fl_str_mv |
Bogotá, Colombia |
dc.publisher.branch.spa.fl_str_mv |
Universidad Nacional de Colombia - Sede Bogotá |
institution |
Universidad Nacional de Colombia |
bitstream.url.fl_str_mv |
https://repositorio.unal.edu.co/bitstream/unal/81747/1/1020798860.2022.pdf https://repositorio.unal.edu.co/bitstream/unal/81747/2/license.txt https://repositorio.unal.edu.co/bitstream/unal/81747/3/1020798860.2022.pdf.jpg |
bitstream.checksum.fl_str_mv |
90803b20502c98beeb8d0fccb14751a5 8153f7789df02f0a4c9e079953658ab2 242c794c175a5cb5219b0b66d69e1e33 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositorio Institucional Universidad Nacional de Colombia |
repository.mail.fl_str_mv |
repositorio_nal@unal.edu.co |
_version_ |
1814089573378031616 |
spelling |
Resumen (spa): La seguridad informática o ciberseguridad se encarga de la protección de datos y servicios ante individuos no autorizados y protege características de la información como la integridad, la confidencialidad y la disponibilidad. Existen múltiples amenazas y ataques que ponen en riesgo la seguridad informática, como el ransomware, el malware o programas malignos, los ataques de denegación de servicios, las fallas de inyección y la ingeniería social, entre otros. En muchas ocasiones la parte más vulnerable de los sistemas son los usuarios; por este motivo los ciberdelincuentes usan la ingeniería social para adquirir información de forma ilícita de los usuarios. La ingeniería social consiste en la manipulación de los individuos mediante el engaño para que divulguen información privada o confidencial. Este tipo de ciberataque es muy difícil de detectar, ya que puede ser ejecutado por cualquier individuo en cualquier momento y explota aspectos psicológicos de los humanos para engañarlos. Este trabajo presenta la implementación de un modelo computacional basado en técnicas de Procesamiento de Lenguaje Natural para extraer características en textos y alimentar tres algoritmos de Aprendizaje de Máquina (redes neuronales, máquinas de vector de soporte y bosques aleatorios) para detectar posibles ataques de ingeniería social en textos. Los tres algoritmos fueron entrenados y evaluados, mostrando resultados que superan el 80% de exactitud en la detección de ataques de ingeniería social. (Texto tomado de la fuente)
Abstract (eng): Computer security, or cybersecurity, protects data and services from unauthorized individuals and preserves properties of information such as integrity, confidentiality, and availability. Many threats and attacks put computer security at risk, including ransomware, malware, denial-of-service attacks, injection flaws, and social engineering. Users are often the most vulnerable part of a system, which is why cybercriminals use social engineering to obtain information from them illicitly. Social engineering consists of manipulating people through deception so that they disclose private or confidential information. This type of cyberattack is very difficult to detect because it can be carried out by anyone at any time and exploits psychological aspects of humans to deceive them. This work presents the implementation of a computational model that uses Natural Language Processing techniques to extract features from texts and feed three Machine Learning algorithms (neural networks, support vector machines, and random forests) in order to detect possible social engineering attacks in texts. The three algorithms were trained and evaluated, and showed an accuracy above 80% in detecting social engineering attacks.
Maestría; Magíster en Ingeniería - Ingeniería de Sistemas y Computación; Sistemas inteligentes
Students; Researchers; School support staff; General public
ORIGINAL: 1020798860.2022.pdf; Tesis de Maestría en Sistemas y Computación; application/pdf; 1714829; https://repositorio.unal.edu.co/bitstream/unal/81747/1/1020798860.2022.pdf; MD5 90803b20502c98beeb8d0fccb14751a5
LICENSE: license.txt; text/plain; charset=utf-8; 4074; https://repositorio.unal.edu.co/bitstream/unal/81747/2/license.txt; MD5 8153f7789df02f0a4c9e079953658ab2
THUMBNAIL: 1020798860.2022.pdf.jpg; Generated Thumbnail; image/jpeg; 6122; https://repositorio.unal.edu.co/bitstream/unal/81747/3/1020798860.2022.pdf.jpg; MD5 242c794c175a5cb5219b0b66d69e1e33
unal/81747; oai:repositorio.unal.edu.co:unal/81747; 2024-08-05 23:10:48.379; Repositorio Institucional Universidad Nacional de Colombia; repositorio_nal@unal.edu.co
Y2nDs24uIGUpIExvcyBhdXRvcmVzIGF1dG9yaXphbiBhIGxhIFVuaXZlcnNpZGFkIHBhcmEgaW5jbHVpciBsYSBvYnJhIGVuIGxvcyBhZ3JlZ2Fkb3JlcywgaW5kaWNlc3MgeSBidXNjYWRvcmVzIHF1ZSBzZSBlc3RpbWVuIG5lY2VzYXJpb3MgcGFyYSBwcm9tb3ZlciBzdSBkaWZ1c2nDs24uIGYpIExvcyBhdXRvcmVzIGFjZXB0YW4gcXVlIGxhIFVuaXZlcnNpZGFkIE5hY2lvbmFsIGRlIENvbG9tYmlhIHB1ZWRhIGNvbnZlcnRpciBlbCBkb2N1bWVudG8gYSBjdWFscXVpZXIgbWVkaW8gbyBmb3JtYXRvIHBhcmEgcHJvcMOzc2l0b3MgZGUgcHJlc2VydmFjacOzbiBkaWdpdGFsLiBTSSBFTCBET0NVTUVOVE8gU0UgQkFTQSBFTiBVTiBUUkFCQUpPIFFVRSBIQSBTSURPIFBBVFJPQ0lOQURPIE8gQVBPWUFETyBQT1IgVU5BIEFHRU5DSUEgTyBVTkEgT1JHQU5JWkFDScOTTiwgQ09OIEVYQ0VQQ0nDk04gREUgTEEgVU5JVkVSU0lEQUQgTkFDSU9OQUwgREUgQ09MT01CSUEsIExPUyBBVVRPUkVTIEdBUkFOVElaQU4gUVVFIFNFIEhBIENVTVBMSURPIENPTiBMT1MgREVSRUNIT1MgWSBPQkxJR0FDSU9ORVMgUkVRVUVSSURPUyBQT1IgRUwgUkVTUEVDVElWTyBDT05UUkFUTyBPIEFDVUVSRE8uIAoKUGFyYSB0cmFiYWpvcyBkZXBvc2l0YWRvcyBwb3Igb3RyYXMgcGVyc29uYXMgZGlzdGludGFzIGEgc3UgYXV0b3I6IAoKRGVjbGFybyBxdWUgZWwgZ3J1cG8gZGUgYXJjaGl2b3MgZGlnaXRhbGVzIHkgbWV0YWRhdG9zIGFzb2NpYWRvcyBxdWUgZXN0b3kgYXJjaGl2YW5kbyBlbiBlbCBSZXBvc2l0b3JpbyBJbnN0aXR1Y2lvbmFsIFVOKSBlcyBkZSBkb21pbmlvIHDDumJsaWNvLiBTaSBubyBmdWVzZSBlbCBjYXNvLCBhY2VwdG8gdG9kYSBsYSByZXNwb25zYWJpbGlkYWQgcG9yIGN1YWxxdWllciBpbmZyYWNjacOzbiBkZSBkZXJlY2hvcyBkZSBhdXRvciBxdWUgY29ubGxldmUgbGEgZGlzdHJpYnVjacOzbiBkZSBlc3RvcyBhcmNoaXZvcyB5IG1ldGFkYXRvcy4KTk9UQTogU0kgTEEgVEVTSVMgQSBQVUJMSUNBUiBBRFFVSVJJw5MgQ09NUFJPTUlTT1MgREUgQ09ORklERU5DSUFMSURBRCBFTiBFTCBERVNBUlJPTExPIE8gUEFSVEVTIERFTCBET0NVTUVOVE8uIFNJR0EgTEEgRElSRUNUUklaIERFIExBIFJFU09MVUNJw5NOIDAyMyBERSAyMDE1LCBQT1IgTEEgQ1VBTCBTRSBFU1RBQkxFQ0UgRUwgUFJPQ0VESU1JRU5UTyBQQVJBIExBIFBVQkxJQ0FDScOTTiBERSBURVNJUyBERSBNQUVTVFLDjUEgWSBET0NUT1JBRE8gREUgTE9TIEVTVFVESUFOVEVTIERFIExBIFVOSVZFUlNJREFEIE5BQ0lPTkFMIERFIENPTE9NQklBIEVOIEVMIFJFUE9TSVRPUklPIElOU1RJVFVDSU9OQUwgVU4sIEVYUEVESURBIFBPUiBMQSBTRUNSRVRBUsONQSBHRU5FUkFMLiAqTEEgVEVTSVMgQSBQVUJMSUNBUiBERUJFIFNFUiBMQSBWRVJTScOTTiBGSU5BTCBBUFJPQkFEQS4gCgpBbCBoYWNlciBjbGljIGVuIGVsIHNpZ3VpZW50ZSBib3TDs24sIHVzdGVkIGluZGljYSBxdWUgZXN0
w6EgZGUgYWN1ZXJkbyBjb24gZXN0b3MgdMOpcm1pbm9zLiBTaSB0aWVuZSBhbGd1bmEgZHVkYSBzb2JyZSBsYSBsaWNlbmNpYSwgcG9yIGZhdm9yLCBjb250YWN0ZSBjb24gZWwgYWRtaW5pc3RyYWRvciBkZWwgc2lzdGVtYS4KClVOSVZFUlNJREFEIE5BQ0lPTkFMIERFIENPTE9NQklBIC0gw5psdGltYSBtb2RpZmljYWNpw7NuIDE5LzEwLzIwMjEK |