FCTNLP: Fighting cyberterrorism with natural language processing
Social networks are a rich source of data and have been used to promote or organize cybercrimes that affect the real world. For this reason, law enforcement agencies are interested in the crucial information that can be obtained from these sources. The amount of information and the informal language that...
- Authors:
- Resource type:
- Publication date:
- 2021
- Institution:
- Universidad del Rosario
- Repository:
- Repositorio EdocUR - U. Rosario
- Language:
- eng
- OAI Identifier:
- oai:repository.urosario.edu.co:10336/34736
- Online access:
- https://doi.org/10.48713/10336_34736
https://repository.urosario.edu.co/handle/10336/34736
- Keywords:
- OSINT
NER
NLP
Cyberterrorism
Natural Language Processing
Semantic Similarity
Sentiment Analysis
Mathematics
- Rights
- License
- Open (Full Text)
id |
EDOCUR2_22053906f36b1778f0f75990ce591818 |
---|---|
oai_identifier_str |
oai:repository.urosario.edu.co:10336/34736 |
network_acronym_str |
EDOCUR2 |
network_name_str |
Repositorio EdocUR - U. Rosario |
repository_id_str |
|
dc.title.es.fl_str_mv |
FCTNLP: Fighting cyberterrorism with natural language processing |
dc.title.TranslatedTitle.es.fl_str_mv |
FCTNLP: Luchando contra el ciberterrorismo con procesamiento de lenguaje natural |
dc.contributor.advisor.none.fl_str_mv |
Díaz López, Daniel Orlando |
dc.subject.es.fl_str_mv |
OSINT, NER, Cyberterrorism, Natural Language Processing, Semantic Similarity, Sentiment Analysis |
dc.subject.ddc.es.fl_str_mv |
Mathematics |
dc.subject.keyword.es.fl_str_mv |
Cyberterrorism OSINT NLP NER Natural Language Processing Sentiment Analysis Semantic Similarity |
description |
Social networks are a rich source of data and have been used to promote or organize cybercrimes that affect the real world. For this reason, law enforcement agencies are interested in the crucial information that can be obtained from these sources. The amount of information, and the informal language used to spread it, make Natural Language Processing (NLP) an excellent tool for analyzing social media posts. This proposal therefore integrates three NLP models into one architecture to provide an exhaustive analysis of open sources such as social media. The analysis extracts entities from the text and identifies clusters of users and their respective polarity; finally, all results are related in a graph database. The architecture was tested using data from a real scenario to determine its feasibility. |
publishDate |
2021 |
dc.date.created.none.fl_str_mv |
2021-11-26 |
dc.date.accessioned.none.fl_str_mv |
2022-08-22T19:11:40Z |
dc.date.available.none.fl_str_mv |
2022-08-22T19:11:40Z |
dc.type.es.fl_str_mv |
bachelorThesis |
dc.type.coar.fl_str_mv |
http://purl.org/coar/resource_type/c_7a1f |
dc.type.document.es.fl_str_mv |
Degree thesis |
dc.type.spa.es.fl_str_mv |
Degree thesis |
dc.identifier.doi.none.fl_str_mv |
https://doi.org/10.48713/10336_34736 |
dc.identifier.uri.none.fl_str_mv |
https://repository.urosario.edu.co/handle/10336/34736 |
url |
https://doi.org/10.48713/10336_34736 https://repository.urosario.edu.co/handle/10336/34736 |
dc.language.iso.es.fl_str_mv |
eng |
language |
eng |
dc.rights.coar.fl_str_mv |
http://purl.org/coar/access_right/c_abf2 |
dc.rights.acceso.es.fl_str_mv |
Open (Full Text) |
rights_invalid_str_mv |
Open (Full Text) http://purl.org/coar/access_right/c_abf2 |
dc.format.extent.es.fl_str_mv |
26 pp |
dc.format.mimetype.es.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Universidad del Rosario |
dc.publisher.department.none.fl_str_mv |
Escuela de Ingeniería, Ciencia y Tecnología |
dc.publisher.program.none.fl_str_mv |
Programa de Matemáticas Aplicadas y Ciencias de la Computación - MACC |
publisher.none.fl_str_mv |
Universidad del Rosario |
institution |
Universidad del Rosario |
dc.source.bibliographicCitation.es.fl_str_mv |
Council of Europe. Explanatory Report to the Convention on Cybercrime. https://rm.coe.int/CoERMPublicCommonSearchServices/DisplayDCTMContent?documentId=09000016800cce5b. 2001.
Akhilesh Chandra and Melissa J. Snowe. “A taxonomy of cybercrime: Theory and design”. In: International Journal of Accounting Information Systems 38 (2020). 2019 UW CISA Symposium, p. 100467. issn: 1467-0895. doi: 10.1016/j.accinf.2020.100467. url: https://www.sciencedirect.com/science/article/pii/S1467089520300348.
João Rafael Gonçalves Evangelista et al. “Systematic literature review to investigate the application of open source intelligence (osint) with artificial intelligence”. In: Journal of Applied Security Research (2020), pp. 1–25.
Heather J Williams and Ilana Blum. Defining second generation open source intelligence (OSINT) for the defense enterprise. Tech. rep. RAND Corporation Santa Monica United States, 2018.
Javier Pastor-Galindo et al. “The Not Yet Exploited Goldmine of OSINT: Opportunities, Open Challenges and Future Trends”. In: IEEE Access 8 (2020), pp. 10282–10304. doi: 10.1109/ACCESS.2020.2965257.
A. Thomas. Natural Language Processing with Spark NLP: Learning to Understand Text at Scale. O’Reilly Media, 2020. isbn: 9781492047766. url: https://books.google.com.co/books?id=sJw6zQEACAAJ.
Leigh Clark et al. “What Makes a Good Conversation? Challenges in Designing Truly Conversational Agents”. In: New York, NY, USA: Association for Computing Machinery, 2019, 1–12. isbn: 9781450359702. url: 10.1145.3290605.3300705.
Swati Kumari, Zia Saquib, and Sanjay Pawar. Machine Learning Approach for Text Classification in Cybercrime. 2018. doi: 10.1109/ICCUBEA.2018.8697442.
C. Sánchez-Rebollo et al. “Detection of Jihadism in Social Networks Using Big Data Techniques Supported by Graphs and Fuzzy Clustering”. In: Hindawi 2019.1238780 (2019), p. 13. doi: 10.1155.2019.1238780.
Ibrahim Aljarah et al. “Intelligent detection of hate speech in Arabic social network: A machine learning approach”. In: Journal of Information Science 47.4 (2021), pp. 483–501. doi: 10.1177/0165551520917651. eprint: https://doi.org/10.1177/0165551520917651. url: https://doi.org/10.1177/0165551520917651.
Iván Castillo-Zúñiga et al. “Internet Data Analysis Methodology for Cyberterrorism Vocabulary Detection, Combining Techniques of Big Data Analytics, NLP and Semantic Web”. In: International Journal on Semantic Web and Information Systems 16 (Jan. 2020), pp. 69–86. doi: 10.4018/IJSWIS.2020010104.
C Oleji et al. “Big data Analitic of Boko Haram insurgency attacks menace in nigeria using DynamicK-reference clustering algorithm”. In: 7 (Apr. 2020), pp. 1099–1107.
V. N. Uzel, E. Saraç Eşsiz, and S. Ayşe Özel. “Using Fuzzy Sets for Detecting Cyber Terrorism and Extremism in the Text”. In: 2018 Innovations in Intelligent Systems and Applications Conference (ASYU). 2018, pp. 1–4. doi: 10.1109/ASYU.2018.8554017.
Sanjeev J Wagh, Manisha S Bhende, and Anuradha D Thakare. Fundamentals of Data Science. Chapman and Hall/CRC, 2021, p. 14. |
dc.source.instname.none.fl_str_mv |
instname:Universidad del Rosario |
dc.source.reponame.none.fl_str_mv |
reponame:Repositorio Institucional EdocUR |
bitstream.url.fl_str_mv |
https://repository.urosario.edu.co/bitstreams/a11db3ac-50ce-4855-982c-5b5b9ee1dd0b/download https://repository.urosario.edu.co/bitstreams/66d8ec5e-4a2b-45a2-95a7-a808c501a9b1/download https://repository.urosario.edu.co/bitstreams/33c03e4c-42f4-4ae5-a9ac-b23bcf5373d9/download https://repository.urosario.edu.co/bitstreams/c8036ac3-4e9f-4a00-b58b-28d6cf9c6d3a/download |
bitstream.checksum.fl_str_mv |
9a89f0963c2d666d4d72a92fa2bcd972 fab9d9ed61d64f6ac005dee3306ae77e 28156dec299671386880e223c484b537 f2b4a3325165039f9094e3685eaa5759 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositorio institucional EdocUR |
repository.mail.fl_str_mv |
edocur@urosario.edu.co |
_version_ |
1814167428213506048 |
spelling |
Advisor: Díaz López, Daniel Orlando. Author: Zapata Rozo, Andrés Felipe. Degree: Profesional en Matemáticas Aplicadas y Ciencias de la Computación (undergraduate, full time).
Abstract: Social networks are a rich source of data and have been used to promote or organize cybercrimes that affect the real world. Because of this, law enforcement agencies are interested in the crucial information that can be obtained from these sources. The amount of information, and the informal language used to spread it, make Natural Language Processing (NLP) an excellent tool for analyzing social media posts. This proposal therefore integrates three NLP models into one architecture to provide an exhaustive analysis of open sources such as social media. The analysis extracts entities from the text and identifies clusters of users and their respective polarity; finally, all results are related in a graph database. The architecture was tested using data from a real scenario to determine its feasibility.
Files: ORIGINAL ZapataRozo-AndresFelipe-2021.pdf (application/pdf, size 4171930); LICENSE license.txt (text/plain, size 1475); TEXT ZapataRozo-AndresFelipe-2021.pdf.txt (Extracted text, text/plain, size 70274); THUMBNAIL ZapataRozo-AndresFelipe-2021.pdf.jpg (Generated Thumbnail, image/jpeg, size 2956). |