Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis

Tensors, an extension of arrays, are widely used in a variety of programming tasks. Tensors are the building blocks of many modern machine learning frameworks and are fundamental in the definition of deep learning models. Linters are indispensable tools for today's developers, as they help the...

Full description

Autores:: Cruz Camacho, Elkin Alejandro

Tipo de recurso:

Fecha de publicación:: 2019

Institución:: Universidad Nacional de Colombia

Repositorio:: Universidad Nacional de Colombia

Idioma:: spa

id	UNACIONAL2_7fcde0c2049999d95b18c5f19fbcfdee
oai_identifier_str	oai:repositorio.unal.edu.co:unal/76335
network_acronym_str	UNACIONAL2
network_name_str	Universidad Nacional de Colombia
repository_id_str
dc.title.spa.fl_str_mv	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
title	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
spellingShingle	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis Tensors Lenguage Python Programming Tensores Lenguaje Python Programación
title_short	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
title_full	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
title_fullStr	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
title_full_unstemmed	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
title_sort	Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis
dc.creator.fl_str_mv	Cruz Camacho, Elkin Alejandro
dc.contributor.author.spa.fl_str_mv	Cruz Camacho, Elkin Alejandro
dc.contributor.spa.fl_str_mv	González Osorio, Fabio Augusto Restrepo Calle, Felipe
dc.subject.proposal.spa.fl_str_mv	Tensors Lenguage Python Programming Tensores Lenguaje Python Programación
topic	Tensors Lenguage Python Programming Tensores Lenguaje Python Programación
description	Tensors, an extension of arrays, are widely used in a variety of programming tasks. Tensors are the building blocks of many modern machine learning frameworks and are fundamental in the definition of deep learning models. Linters are indispensable tools for today's developers, as they help the developers to check code before executing it. Despite the popularity of tensors, linters for Python that check and flag code with tensors are nonexistent. Given the tremendous amount of work done in Python with (and without) tensors, it is quite baffling that little work has been done in this regard. Abstract Interpretation is a methodology/framework for statically analysing code. The idea of Abstract Interpretation is to soundly overapproximate the result of running a piece of code over all possible inputs of the program. A sound overapproximation ensures that the Abstract Interpreter will never omit a true negative, i.e.~if a piece of code is not flagged by the Abstract Interpreter, then it can be safely assumed that the code will not fail. The Abstract Interpreter can be modified so that it only outputs true positives, although losing soundness, i.e.~the interpreter can flag which parts of the code are going to fail no matter how the code is run. In this work, we specify a subset of Python with emphasis on tensor operations. Our operational Python semantics is based on The Python Language Reference. We define an Abstract Interpreter and present its implementation. We show how each part of the Abstract Interpreter was built: the Abstract Domains defined and the abstract semantics. We present the structure of Pytropos, the Abstract Interpreter implemented. Pytropos is able to check NumPy array operations taking into account broadcasting and complex NumPy functions as \texttt{array} and \texttt{dot}. We constructed 74 unit test cases checking the capabilities of Pytropos and 20 property test cases checking compliance with the official Python implementation. We show what and how many bugs the Abstract Interpreter was able to find.
publishDate	2019
dc.date.issued.spa.fl_str_mv	2019
dc.date.accessioned.spa.fl_str_mv	2020-03-30T06:18:19Z
dc.date.available.spa.fl_str_mv	2020-03-30T06:18:19Z
dc.type.spa.fl_str_mv	Trabajo de grado - Maestría
dc.type.driver.spa.fl_str_mv	info:eu-repo/semantics/masterThesis
dc.type.version.spa.fl_str_mv	info:eu-repo/semantics/acceptedVersion
dc.type.content.spa.fl_str_mv	Text
dc.type.redcol.spa.fl_str_mv	http://purl.org/redcol/resource_type/TM
status_str	acceptedVersion
dc.identifier.uri.none.fl_str_mv	https://repositorio.unal.edu.co/handle/unal/76335
dc.identifier.eprints.spa.fl_str_mv	http://bdigital.unal.edu.co/72620/
url	https://repositorio.unal.edu.co/handle/unal/76335 http://bdigital.unal.edu.co/72620/
dc.language.iso.spa.fl_str_mv	spa
language	spa
dc.relation.ispartof.spa.fl_str_mv	Universidad Nacional de Colombia Sede Bogotá Facultad de Ingeniería Departamento de Ingeniería de Sistemas e Industrial Ingeniería de Sistemas Ingeniería de Sistemas
dc.relation.haspart.spa.fl_str_mv	0 Generalidades / Computer science, information and general works 51 Matemáticas / Mathematics 62 Ingeniería y operaciones afines / Engineering
dc.relation.references.spa.fl_str_mv	Cruz Camacho, Elkin Alejandro (2019) Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis. Maestría thesis, Universidad Nacional de Colombia - Sede Bogotá.
dc.rights.spa.fl_str_mv	Derechos reservados - Universidad Nacional de Colombia
dc.rights.coar.fl_str_mv	http://purl.org/coar/access_right/c_abf2
dc.rights.license.spa.fl_str_mv	Atribución-NoComercial 4.0 Internacional
dc.rights.uri.spa.fl_str_mv	http://creativecommons.org/licenses/by-nc/4.0/
dc.rights.accessrights.spa.fl_str_mv	info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Atribución-NoComercial 4.0 Internacional Derechos reservados - Universidad Nacional de Colombia http://creativecommons.org/licenses/by-nc/4.0/ http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv	openAccess
dc.format.mimetype.spa.fl_str_mv	application/pdf
institution	Universidad Nacional de Colombia
bitstream.url.fl_str_mv	https://repositorio.unal.edu.co/bitstream/unal/76335/1/ElkinCruz.2019.pdf https://repositorio.unal.edu.co/bitstream/unal/76335/2/pytropos.zip https://repositorio.unal.edu.co/bitstream/unal/76335/3/ElkinCruz.2019.pdf.jpg
bitstream.checksum.fl_str_mv	8e99ad38089d82738643e96f0ca2d5df 4adffa7f6a6b355ddccae96a75c5e25f f8441b1bfa3b51e5797fca41dd6c72a6
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5
repository.name.fl_str_mv	Repositorio Institucional Universidad Nacional de Colombia
repository.mail.fl_str_mv	repositorio_nal@unal.edu.co
_version_	1814090196149338112
spelling	Atribución-NoComercial 4.0 InternacionalDerechos reservados - Universidad Nacional de Colombiahttp://creativecommons.org/licenses/by-nc/4.0/info:eu-repo/semantics/openAccesshttp://purl.org/coar/access_right/c_abf2González Osorio, Fabio AugustoRestrepo Calle, FelipeCruz Camacho, Elkin Alejandro9b89c809-e46d-465e-9c8b-2672f0860eca3002020-03-30T06:18:19Z2020-03-30T06:18:19Z2019https://repositorio.unal.edu.co/handle/unal/76335http://bdigital.unal.edu.co/72620/Tensors, an extension of arrays, are widely used in a variety of programming tasks. Tensors are the building blocks of many modern machine learning frameworks and are fundamental in the definition of deep learning models. Linters are indispensable tools for today's developers, as they help the developers to check code before executing it. Despite the popularity of tensors, linters for Python that check and flag code with tensors are nonexistent. Given the tremendous amount of work done in Python with (and without) tensors, it is quite baffling that little work has been done in this regard. Abstract Interpretation is a methodology/framework for statically analysing code. The idea of Abstract Interpretation is to soundly overapproximate the result of running a piece of code over all possible inputs of the program. A sound overapproximation ensures that the Abstract Interpreter will never omit a true negative, i.e.~if a piece of code is not flagged by the Abstract Interpreter, then it can be safely assumed that the code will not fail. The Abstract Interpreter can be modified so that it only outputs true positives, although losing soundness, i.e.~the interpreter can flag which parts of the code are going to fail no matter how the code is run. In this work, we specify a subset of Python with emphasis on tensor operations. Our operational Python semantics is based on The Python Language Reference. We define an Abstract Interpreter and present its implementation. We show how each part of the Abstract Interpreter was built: the Abstract Domains defined and the abstract semantics. We present the structure of Pytropos, the Abstract Interpreter implemented. Pytropos is able to check NumPy array operations taking into account broadcasting and complex NumPy functions as \texttt{array} and \texttt{dot}. We constructed 74 unit test cases checking the capabilities of Pytropos and 20 property test cases checking compliance with the official Python implementation. We show what and how many bugs the Abstract Interpreter was able to find.Resumen: Los tensores, una extensión de los arrays, se usan de manera extensiva en una gran variedad de problemas en programación. Los tensores son los bloques básicos de construcción de múltiples frameworks de Aprendizaje de Máquina y son fundamentales en la definición de modelos de Aprendizaje Profundo. Los linters son herramientas indispensables para los programadores de hoy en día, ya que estos ayudan a los desarrolladores a revisar el código antes de ejecutarlo. Aunque los tensores son muy populares no existen tensores que revisen código con operaciones tensoriales. Dada la gran cantidad de trabajo hecho en Python con (y sin) tensores, es sorprendente el poco trabajo que se ha hecho en esta área. La Interpretación Abstracta es una metodología/framework diseñada para analizar código de forma estática. La idea de Interpretación Abstracta es sobreaproximar de manera "sound" el resultado de ejecutar una pieza de código sobre todos las posibles entradas del programa. Una sobreaproximación "sound" asegura que el Interprete Abstracto nunca omitirá un verdadero negativo, es decir, si una pieza de código no es señalada como incorrecta por el Interprete Abstracto entonces se puede asumir con seguridad que el código nunca fallará. El Interprete Abstracto puede ser modificado para que sólo informe acerca de verdaderos positivos, aunque se pierda la propiedad de "soundness", es decir, el interprete sólo informa acerca de las partes de código que fallarán sin importar que suceda. En este trabajo, formalizamos un subconjunto de Python con énfasis en operaciones con tensores. Nuestra formalización de la semántica de Python está basada en la Referencia oficial del Lenguage Python. Definimos un Interprete Abstracto y presentamos su implementación. Mostramos como cada parte del Interprete Abstracto fué definido: su Dominio Abstracto y semántica abstracta. Presentamos la estructura de Pytropos, la implementación del Interprete Abstracto. Pytropos es capaz de revisar las operaciones de arreglos de NumPy teniendo en cuenta broadcasting y algunas funciones complejas de NumPy como \texttt{array} y \texttt{dot}. Construimos 74 casos de prueba unitaros, los cuales chequean la capacidad de Pytropos, además de 20 casos de prueba de propiedades, los cuales chequean que las reglas semánticas de Pytropos correspondan con la forma en la que Python es ejecutado. Mostramos las capacidades del Interprete Abstracto por medio de ejemplos de los test unitarios.Maestríaapplication/pdfspaUniversidad Nacional de Colombia Sede Bogotá Facultad de Ingeniería Departamento de Ingeniería de Sistemas e Industrial Ingeniería de SistemasIngeniería de Sistemas0 Generalidades / Computer science, information and general works51 Matemáticas / Mathematics62 Ingeniería y operaciones afines / EngineeringCruz Camacho, Elkin Alejandro (2019) Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis. Maestría thesis, Universidad Nacional de Colombia - Sede Bogotá.Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape AnalysisTrabajo de grado - Maestríainfo:eu-repo/semantics/masterThesisinfo:eu-repo/semantics/acceptedVersionTexthttp://purl.org/redcol/resource_type/TMTensorsLenguage PythonProgrammingTensoresLenguaje PythonProgramaciónORIGINALElkinCruz.2019.pdfapplication/pdf840142https://repositorio.unal.edu.co/bitstream/unal/76335/1/ElkinCruz.2019.pdf8e99ad38089d82738643e96f0ca2d5dfMD51pytropos.zipapplication/octet-stream605345https://repositorio.unal.edu.co/bitstream/unal/76335/2/pytropos.zip4adffa7f6a6b355ddccae96a75c5e25fMD52THUMBNAILElkinCruz.2019.pdf.jpgElkinCruz.2019.pdf.jpgGenerated Thumbnailimage/jpeg4413https://repositorio.unal.edu.co/bitstream/unal/76335/3/ElkinCruz.2019.pdf.jpgf8441b1bfa3b51e5797fca41dd6c72a6MD53unal/76335oai:repositorio.unal.edu.co:unal/763352024-07-12 23:31:49.578Repositorio Institucional Universidad Nacional de Colombiarepositorio_nal@unal.edu.co

Static Analysis of Python Programs using Abstract Interpretation: An Application to Tensor Shape Analysis

Publicaciones similares