Development of a model for measuring logical implication in elementary mathematics problems

Today there are language models embedded in systems that can outperform humans on a variety of tests. How, though, can we measure the coherence of these models? In this work we propose an approach that uses the transformer architecture to address the problem of logical implication (LI), that is, determining which sentences follow from others within a text. This is achieved through the transformer's attention mechanism and next-token prediction. We found that, with a very simple transformer-based model, LI can be identified in counting and probability problems with 60% accuracy on a sample of 95 mathematical exercises on various topics. This method could help improve the precision with which the coherence of language models is evaluated, providing the data needed for a detailed analysis of their errors and for examining the logical validity of their correct answers.
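The thesis itself is not part of this record, but the mechanism the abstract describes — using a language model's next-token predictions to judge whether one sentence follows from another — can be sketched in a few lines. The snippet below is only an illustrative assumption, not the author's implementation: the "gpt2" checkpoint, the neutral baseline context, and the difference-of-log-probabilities scoring rule are all choices made for the example.

# Hedged sketch (not the thesis's model): score whether a hypothesis sentence
# "follows from" a premise using a causal LM's next-token log-probabilities.
# The "gpt2" checkpoint, baseline context, and scoring rule are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def avg_logprob(context: str, continuation: str) -> float:
    """Average log-probability the model assigns to `continuation`, token by token, given `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Position i predicts token i+1, so shift logits and targets by one.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = input_ids[:, 1:]
    token_logprobs = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    n_cont = cont_ids.shape[1]
    # Keep only the positions that score the continuation tokens.
    return token_logprobs[0, -n_cont:].mean().item()


def implication_score(premise: str, hypothesis: str) -> float:
    """How much more likely the hypothesis becomes once the premise is given (PMI-style, assumed rule)."""
    baseline = tokenizer.bos_token  # neutral context; an assumption, not the thesis's choice
    return avg_logprob(premise + " ", hypothesis) - avg_logprob(baseline, hypothesis)


premise = "Una bolsa contiene 3 bolas rojas y 2 azules; se extrae una al azar."
hypothesis = "La probabilidad de sacar una bola roja es 3/5."
print(implication_score(premise, hypothesis))  # higher values suggest the sentence follows from the premise

In practice a decision threshold would be calibrated on labeled premise/hypothesis pairs; the 60% accuracy on 95 exercises reported in the abstract refers to the author's own model, which this sketch does not reproduce.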

Full description

Authors:
Sánchez Tovar, Edwin Alejandro
Resource type:
https://purl.org/coar/resource_type/c_7a1f (bachelor thesis)
Publication date:
2024
Institution:
Universidad El Bosque
Repository:
Repositorio U. El Bosque
Language:
spa
OAI Identifier:
oai:repositorio.unbosque.edu.co:20.500.12495/13595
Online access:
https://hdl.handle.net/20.500.12495/13595
Keywords:
Axiomas e IA
Implicación lógica
IA en matemáticas
Aprendizaje automático
Aprendizaje profundo
Inteligencia artificial
Modelos de lenguaje
510
Axioms and AI
Logical implication
AI in mathematics
Machine learning
Deep learning
Artificial intelligence
Language model
Rights
openAccess
License
Attribution 4.0 International
dc.title.none.fl_str_mv Desarrollo de un modelo para la medición de la implicación lógica en problemas de matemática elemental
dc.title.translated.none.fl_str_mv Development of a model for measuring logical implication in elementary mathematics problems
dc.creator.fl_str_mv Sánchez Tovar, Edwin Alejandro
dc.contributor.advisor.none.fl_str_mv González Galeano, Andrei Alain
dc.contributor.author.none.fl_str_mv Sánchez Tovar, Edwin Alejandro
dc.subject.none.fl_str_mv Axiomas e IA
Implicación lógica
IA en matemáticas
Aprendizaje automático
Aprendizaje profundo
Inteligencia artificial
Modelos de lenguaje
dc.subject.ddc.none.fl_str_mv 510
dc.subject.keywords.none.fl_str_mv Axioms and AI
Logical implication
AI in mathematics
Machine learning
Deep learning
Artificial intelligence
Language model
description Actualmente, existen modelos de lenguaje integrados en sistemas que pueden superar las capacidades humanas en una variedad de pruebas. Sin embargo, ¿cómo podemos medir la coherencia de estos modelos? En este trabajo, proponemos un enfoque que utiliza la arquitectura de transformers para abordar el problema de la implicación lógica (IL), es decir, determinar qué oraciones se derivan de otras dentro de un texto. Esto se logra mediante el uso de su mecanismo de atención y predicción del siguiente token. Se encontró que, con un modelo muy simple basado en la arquitectura del transformer, es posible la identificación de la IL en problemas de conteo y probabilidad con una precisión del 60 % en una muestra de 95 ejercicios matemáticos de diversos temas. Este método podría contribuir a mejorar la precisión con la que se evalúa la coherencia de los modelos de lenguaje, proporcionando los datos necesarios para realizar un análisis detallado de sus errores y examinar la validez lógica de sus respuestas correctas.
dc.date.accessioned.none.fl_str_mv 2024-12-05T14:25:31Z
dc.date.available.none.fl_str_mv 2024-12-05T14:25:31Z
dc.date.issued.none.fl_str_mv 2024-11
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_7a1f
dc.type.local.spa.fl_str_mv Tesis/Trabajo de grado - Monografía - Pregrado
dc.type.coar.none.fl_str_mv https://purl.org/coar/resource_type/c_7a1f
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/bachelorThesis
dc.type.coarversion.none.fl_str_mv https://purl.org/coar/version/c_ab4af688f83e57aa
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12495/13595
dc.identifier.instname.spa.fl_str_mv instname:Universidad El Bosque
dc.identifier.reponame.spa.fl_str_mv reponame:Repositorio Institucional Universidad El Bosque
dc.identifier.repourl.none.fl_str_mv repourl:https://repositorio.unbosque.edu.co
dc.language.iso.fl_str_mv spa
dc.rights.en.fl_str_mv Attribution 4.0 International
dc.rights.uri.none.fl_str_mv http://creativecommons.org/licenses/by/4.0/
dc.rights.local.spa.fl_str_mv Acceso abierto
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://purl.org/coar/access_right/c_abf2
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.program.spa.fl_str_mv Matemáticas
dc.publisher.grantor.spa.fl_str_mv Universidad El Bosque
dc.publisher.faculty.spa.fl_str_mv Facultad de Ciencias