Descripción de escenas por medio de aprendizaje profundo

Authors:
Rincón Núñez, Adalberto
Resource type:
Undergraduate degree project
Publication date:
2018
Institution:
Universidad Autónoma de Occidente
Repository:
RED: Repositorio Educativo Digital UAO
Language:
spa
OAI Identifier:
oai:red.uao.edu.co:10614/10557
Online access:
http://hdl.handle.net/10614/10557
Keywords:
Mechatronics Engineering
Neural networks (Computers)
Image description
Rights:
openAccess
License:
All rights reserved - Universidad Autónoma de Occidente
Description:
Summary: This document describes a project that reproduces published results in image captioning (image description), building on earlier work by experts in the field. The work was carried out in Python, a free software platform, using the TensorFlow library, and culminated in the training of a recurrent neural network that receives an image as input and outputs a copy of that image annotated with a written description of its content. Training required two kinds of datasets: one covering a wide range of images and another providing their descriptions. Both were taken from the COCO (Common Objects in Context) platform, a large database for detection, segmentation and captioning; specifically, the datasets released for the "COCO Captioning Challenge", a contest held in 2015, were used. These datasets were pre-processed: for the images, this means extracting their features with a convolutional neural network (CNN), in this case the Inception network, and storing those features in a file that was later used to train the RNN. The document also explains the procedure followed in the project so that the same code can be used to train a network that describes images in Spanish; the project then concludes with the training of a network able to describe selected environments, provided that new datasets are created for them. Finally, tests were carried out on the several trained networks. The results were satisfactory, with descriptions matching the supplied images, although in some tests the descriptions contained errors, such as the gender of a person or the color of objects in the picture.
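As a rough illustration of the pipeline outlined in the summary (Inception-based feature extraction followed by an RNN caption decoder trained on COCO-style image/caption pairs), a minimal TensorFlow 2 / Keras sketch could look like the code below. All file handling, vocabulary size, dimensions and hyperparameters are illustrative assumptions, not the configuration used in the thesis.

```python
# Hypothetical sketch: extract image features with InceptionV3 and define a
# small RNN decoder over (feature, caption) pairs. Values are illustrative.
import tensorflow as tf

# 1) Feature extractor: InceptionV3 without its classification head,
#    global-average-pooled to a 2048-dimensional vector per image.
cnn = tf.keras.applications.InceptionV3(
    include_top=False, pooling="avg", weights="imagenet")

def extract_features(image_path):
    """Return a (1, 2048) feature vector for one image file."""
    img = tf.keras.utils.load_img(image_path, target_size=(299, 299))
    x = tf.keras.utils.img_to_array(img)[tf.newaxis, ...]
    x = tf.keras.applications.inception_v3.preprocess_input(x)
    return cnn(x)

# 2) Minimal RNN decoder: the image feature initializes the LSTM state,
#    and the network predicts the next caption token at each step.
VOCAB_SIZE, EMBED_DIM, UNITS, MAX_LEN = 5000, 256, 512, 20  # assumed values

image_input = tf.keras.Input(shape=(2048,))
caption_input = tf.keras.Input(shape=(MAX_LEN,))
img_state = tf.keras.layers.Dense(UNITS, activation="relu")(image_input)
emb = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM,
                                mask_zero=True)(caption_input)
rnn_out = tf.keras.layers.LSTM(UNITS, return_sequences=True)(
    emb, initial_state=[img_state, img_state])
logits = tf.keras.layers.Dense(VOCAB_SIZE)(rnn_out)

model = tf.keras.Model([image_input, caption_input], logits)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# Training would then call model.fit with the pre-extracted COCO features
# and tokenized captions shifted by one position (teacher forcing).
```

The same structure applies regardless of caption language: training a Spanish-language describer, as mentioned in the summary, would only require tokenized Spanish captions and a correspondingly built vocabulary.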