FCTNLP: Fighting cyberterrorism with natural language processing

Social networks are a rich source of data and have been used to promote or organize cybercrimes that affect the real world. For this reason, law enforcement agencies are interested in the crucial information that can be obtained from these sources. The amount of information and the informal language that...

Authors: Zapata Rozo, Andrés Felipe

Resource type: Trabajo de grado (bachelorThesis)

Publication date: 2021

Institution: Universidad del Rosario

Repository: Repositorio EdocUR - U. Rosario

Language: eng

id	EDOCUR2_22053906f36b1778f0f75990ce591818
oai_identifier_str	oai:repository.urosario.edu.co:10336/34736
network_acronym_str	EDOCUR2
network_name_str	Repositorio EdocUR - U. Rosario
repository_id_str
dc.title.es.fl_str_mv	FCTNLP: Fighting cyberterrorism with natural language processing
dc.title.TranslatedTitle.es.fl_str_mv	FCTNLP: Luchando contra el ciberterrorismo con procesamiento de lenguaje natural
title	FCTNLP: Fighting cyberterrorism with natural language processing
spellingShingle	FCTNLP: Fighting cyberterrorism with natural language processing OSINT NER Ciberterrorismo Procesamiento de Lenguaje Natural Similitud semántica Análisis de sentimientos Matemáticas Cyberterrorism OSINT NLP NER Natural Language Processing Sentiment Analysis Semantic Similarity
title_short	FCTNLP: Fighting cyberterrorism with natural language processing
title_full	FCTNLP: Fighting cyberterrorism with natural language processing
title_fullStr	FCTNLP: Fighting cyberterrorism with natural language processing
title_full_unstemmed	FCTNLP: Fighting cyberterrorism with natural language processing
title_sort	FCTNLP: Fighting cyberterrorism with natural language processing
dc.contributor.advisor.none.fl_str_mv	Díaz López, Daniel Orlando
dc.subject.es.fl_str_mv	OSINT NER Ciberterrorismo Procesamiento de Lenguaje Natural Similitud semántica Análisis de sentimientos
topic	OSINT NER Ciberterrorismo Procesamiento de Lenguaje Natural Similitud semántica Análisis de sentimientos Matemáticas Cyberterrorism OSINT NLP NER Natural Language Processing Sentiment Analysis Semantic Similarity
dc.subject.ddc.es.fl_str_mv	Matemáticas
dc.subject.keyword.es.fl_str_mv	Cyberterrorism OSINT NLP NER Natural Language Processing Sentiment Analysis Semantic Similarity
description	Social networks are a rich source of data and have been used to promote or organize cybercrimes that affect the real world. For this reason, law enforcement agencies are interested in the crucial information that can be obtained from these sources. The amount of information and the informal language used to spread it make Natural Language Processing (NLP) an excellent tool for analyzing social media posts. This proposal therefore integrates an architecture with three NLP models to provide an exhaustive analysis of open sources such as social media. The analysis extracts entities from the text and identifies clusters of users and their respective polarity; finally, all of the results are related in a graph database. The architecture was tested on data from a real scenario to determine its feasibility.
publishDate	2021
dc.date.created.none.fl_str_mv	2021-11-26
dc.date.accessioned.none.fl_str_mv	2022-08-22T19:11:40Z
dc.date.available.none.fl_str_mv	2022-08-22T19:11:40Z
dc.type.es.fl_str_mv	bachelorThesis
dc.type.coar.fl_str_mv	http://purl.org/coar/resource_type/c_7a1f
dc.type.document.es.fl_str_mv	Trabajo de grado
dc.type.spa.es.fl_str_mv	Trabajo de grado
dc.identifier.doi.none.fl_str_mv	https://doi.org/10.48713/10336_34736
dc.identifier.uri.none.fl_str_mv	https://repository.urosario.edu.co/handle/10336/34736
url	https://doi.org/10.48713/10336_34736 https://repository.urosario.edu.co/handle/10336/34736
dc.language.iso.es.fl_str_mv	eng
language	eng
dc.rights.coar.fl_str_mv	http://purl.org/coar/access_right/c_abf2
dc.rights.acceso.es.fl_str_mv	Abierto (Texto Completo)
rights_invalid_str_mv	Abierto (Texto Completo) http://purl.org/coar/access_right/c_abf2
dc.format.extent.es.fl_str_mv	26 pp
dc.format.mimetype.es.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidad del Rosario
dc.publisher.department.none.fl_str_mv	Escuela de Ingeniería, Ciencia y Tecnología
dc.publisher.program.none.fl_str_mv	Programa de Matemáticas Aplicadas y Ciencias de la Computación - MACC
publisher.none.fl_str_mv	Universidad del Rosario
institution	Universidad del Rosario
dc.source.bibliographicCitation.es.fl_str_mv	Council of Europe. Explanatory Report to the Convention on Cybercrime. https://rm.coe.int/CoERMPublicCommonSearchServices/DisplayDCTMContent?documentId=09000016800cce5b. 2001. Akhilesh Chandra and Melissa J. Snowe. “A taxonomy of cybercrime: Theory and design”. In: International Journal of Accounting Information Systems 38 (2020). 2019 UW CISA Symposium, p. 100467. issn: 1467-0895. doi: 10.1016/j.accinf.2020.100467. url: https://www.sciencedirect.com/science/article/pii/S1467089520300348. João Rafael Gonçalves Evangelista et al. “Systematic literature review to investigate the application of open source intelligence (osint) with artificial intelligence”. In: Journal of Applied Security Research (2020), pp. 1–25. Heather J Williams and Ilana Blum. Defining second generation open source intelligence (OSINT) for the defense enterprise. Tech. rep. RAND Corporation Santa Monica United States, 2018. Javier Pastor-Galindo et al. “The Not Yet Exploited Goldmine of OSINT: Opportunities, Open Challenges and Future Trends”. In: IEEE Access 8 (2020), pp. 10282–10304. doi: 10.1109/ACCESS.2020.2965257. A. Thomas. Natural Language Processing with Spark NLP: Learning to Understand Text at Scale. O’Reilly Media, 2020. isbn: 9781492047766. url: https://books.google.com.co/books?id=sJw6zQEACAAJ. Leigh Clark et al. “What Makes a Good Conversation? Challenges in Designing Truly Conversational Agents”. In: New York, NY, USA: Association for Computing Machinery, 2019, 1–12. isbn: 9781450359702. url: 10.1145.3290605.3300705. Swati Kumari, Zia Saquib, and Sanjay Pawar. Machine Learning Approach for Text Classification in Cybercrime. 2018. doi: 10.1109/ICCUBEA.2018.8697442. C. Sánchez-Rebollo et al. “Detection of Jihadism in Social Networks Using Big Data Techniques Supported by Graphs and Fuzzy Clustering”. In: Hindawi 2019.1238780 (2019), p. 13. doi: 10.1155.2019.1238780 Ibrahim Aljarah et al. “Intelligent detection of hate speech in Arabic social network: A machine learning approach”. In: Journal of Information Science 47.4 (2021), pp. 483–501. doi: 10.1177/0165551520917651. eprint: https://doi.org/10.1177/0165551520917651. url: https://doi.org/10.1177/0165551520917651. Iván Castillo-Zúñiga et al. “Internet Data Analysis Methodology for Cyberterrorism Vocabulary Detection, Combining Techniques of Big Data Analytics, NLP and Semantic Web”. In: International Journal on Semantic Web and Information Systems 16 (Jan. 2020), pp. 69–86. doi: 10.4018/IJSWIS.2020010104. C Oleji et al. “Big data Analitic of Boko Haram insurgency attacks menace in nigeria using DynamicK-reference clustering algorithm”. In: 7 (Apr. 2020), pp. 1099–1107. V. N. Uzel, E. Saraç Eşsiz, and S. Ayşe Özel. “Using Fuzzy Sets for Detecting Cyber Terrorism and Extremism in the Text”. In: 2018 Innovations in Intelligent Systems and Applications Conference (ASYU). 2018, pp. 1–4. doi: 10.1109/ASYU.2018.8554017. Sanjeev J Wagh, Manisha S Bhende, and Anuradha D Thakare. Fundamentals of Data Science. Chapman and Hall/CRC, 2021, p. 14.
dc.source.instname.none.fl_str_mv	instname:Universidad del Rosario
dc.source.reponame.none.fl_str_mv	reponame:Repositorio Institucional EdocUR
bitstream.url.fl_str_mv	https://repository.urosario.edu.co/bitstreams/a11db3ac-50ce-4855-982c-5b5b9ee1dd0b/download https://repository.urosario.edu.co/bitstreams/66d8ec5e-4a2b-45a2-95a7-a808c501a9b1/download https://repository.urosario.edu.co/bitstreams/33c03e4c-42f4-4ae5-a9ac-b23bcf5373d9/download https://repository.urosario.edu.co/bitstreams/c8036ac3-4e9f-4a00-b58b-28d6cf9c6d3a/download
bitstream.checksum.fl_str_mv	9a89f0963c2d666d4d72a92fa2bcd972 fab9d9ed61d64f6ac005dee3306ae77e 28156dec299671386880e223c484b537 f2b4a3325165039f9094e3685eaa5759
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositorio institucional EdocUR
repository.mail.fl_str_mv	edocur@urosario.edu.co
_version_	1837007366185287680
spelling	Díaz López, Daniel Orlando | 1061695713600 | Zapata Rozo, Andrés Felipe | Profesional en Matemáticas Aplicadas y Ciencias de la Computación | Pregrado | Full time | 3e71ceeb-e70b-4f66-b11d-79c57e843d4c | 600 | Social networks are a rich source of data and have been used to promote or organize cybercrimes that affect the real world. Because of this, law enforcement agencies are interested in the crucial information that can be obtained from these sources. The amount of information and the informal language used to spread it make Natural Language Processing (NLP) an excellent tool for analyzing posts on social media. For that reason, this proposal integrates an architecture with three NLP models to provide an exhaustive analysis of open sources such as social media. The analysis extracts entities from the text and identifies clusters of users and their respective polarity; finally, all of the results are related in a graph database. The architecture was tested using data from a real scenario in order to determine its feasibility. | 510600 | ORIGINAL ZapataRozo-AndresFelipe-2021.pdf application/pdf 4171930 | LICENSE license.txt text/plain 1475 | TEXT ZapataRozo-AndresFelipe-2021.pdf.txt Extracted text text/plain 70274 | THUMBNAIL ZapataRozo-AndresFelipe-2021.pdf.jpg Generated Thumbnail image/jpeg 2956 | 2022-08-31 07:52:15.88
