Multimodal information spaces for content-based image retrieval

Authors:
Caicedo Rueda, Juan Carlos
Resource type:
Doctoral thesis
Publication date:
2012
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa (Spanish)
OAI identifier:
oai:repositorio.unal.edu.co:unal/20154
Online access:
https://repositorio.unal.edu.co/handle/unal/20154
http://bdigital.unal.edu.co/10591/
Keywords:
0 Computer science, information and general works
51 Mathematics
62 Engineering
Image databases
Indexing methods
Image search
Multimodal data analysis
Machine learning
Pattern recognition
Matrix factorization
Rights
openAccess
License
Attribution-NonCommercial 4.0 International
Description
Summary: Abstract. Image collections today are larger than ever and continue to grow constantly. Without the help of image search systems, these abundant visual records, collected in many different fields and domains, may remain unused and inaccessible. Many available image databases contain complementary modalities, such as attached text resources, which can be used to build an index for querying with keywords. However, users sometimes do not have, or do not know, the right words to express what they need, and keywords cannot capture all the visual variations that an image may contain. Using example images as queries is an alternative in scenarios such as searching for images with a camera-equipped mobile phone, or supporting medical diagnosis by searching a large medical image collection. Still, matching only visual features between the query and the image database may lead to results that are undesirable from the user's perspective. These conditions make finding relevant images for a specific information need very challenging, time consuming, or even frustrating.

Instead of building image search indexes from a single data modality, the simultaneous use of both visual and text modalities has been suggested: non-visual modalities can provide complementary information that enriches the image representation. The goal of this research work is to study the relationships between visual contents and text terms in order to build useful indexes for image search. A family of algorithms based on matrix factorization is proposed for extracting the multimodal aspects of an image collection. Using this knowledge of how visual features and text terms correlate, a search index is constructed that can be queried with keywords, example images, or combinations of both. Systematic experiments were conducted on several data sets to evaluate the proposed indexing algorithms. The experimental results showed that multimodal indexing is an effective strategy for designing image search systems.
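To make the general idea concrete, the following is a minimal sketch of multimodal indexing via matrix factorization, not the thesis's actual algorithms: each image is represented by concatenating visual features with text-term weights, a plain multiplicative-update NMF factorizes the collection matrix into latent multimodal factors, and a query (image features, keywords, or both, with zeros for a missing modality) is projected into the same latent space and matched by cosine similarity. All array sizes and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(X, k, iters=200, eps=1e-9):
    """Factorize a nonnegative matrix X (images x features) as X ~ W @ H
    using standard multiplicative updates. Rows of H are latent
    multimodal factors; rows of W encode each image in that space."""
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(x, H, iters=100, eps=1e-9):
    """Project a query row x into the latent space by solving x ~ w @ H
    with H held fixed (same multiplicative update, applied to w only).
    A text-only or image-only query just has zeros in the other block."""
    w = rng.random((1, H.shape[0])) + eps
    for _ in range(iters):
        w *= (x @ H.T) / (w @ H @ H.T + eps)
    return w

# Toy collection: 6 images, each with 4 visual features + 3 text terms.
X = rng.random((6, 7))
W, H = nmf(X, k=3)

# Query by example: encode one image's features, rank the collection
# by cosine similarity in the shared latent space.
q = encode(X[2:3], H)
sims = (W @ q.T).ravel() / (np.linalg.norm(W, axis=1) * np.linalg.norm(q) + eps_ if (eps_ := 1e-9) else 1)
ranking = np.argsort(-sims)
print(ranking)
```

Because visual features and text terms share the factor matrix `H`, a single index supports querying by keywords, by example images, or by combinations of both, which is the property the abstract highlights.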