Robust unsupervised learning using kernels


Authors:
Gallego Mejia, Joseph Alejandro
Resource type:
Publication date:
2017
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa
OAI Identifier:
oai:repositorio.unal.edu.co:unal/59937
Online access:
https://repositorio.unal.edu.co/handle/unal/59937
http://bdigital.unal.edu.co/57770/
Keywords:
62 Engineering and allied operations / Engineering
Machine Learning
Dimensionality Reduction
Unsupervised Learning
Kernel Learning Approach
Robust Statistics
Welsch Estimator
Aprendizaje de máquina
Reducción de la dimensionalidad
Aprendizaje con métodos de Kernel
Estadística robusta
Estimador de Welsch
Rights
openAccess
License
Attribution-NonCommercial 4.0 International
Description
Summary: This thesis studies the connections between statistical robustness and machine learning techniques, in particular the relationship between one specific kernel (the Gaussian kernel) and the robustness of the kernel-based learning methods that use it. It also shows that estimating the mean in the feature space with the RBF kernel is analogous to robust estimation of the mean in the data space with the Welsch M-estimator. Based on these ideas, new robust kernels for machine learning algorithms are designed and implemented: Tukey’s, Andrews’ and Huber’s robust kernels, which correspond to Tukey’s, Andrews’ and Huber’s robust M-estimators, respectively. Kernel-based algorithms are an important tool, widely applied to machine learning and information retrieval problems including clustering, latent topic analysis, recommender systems, image annotation, and content-based image retrieval, among others. Robustness is the ability of a statistical estimation method or machine learning method to deal with noise and outliers. There is a strong theory of robustness in statistics; however, it has received little attention in machine learning. A systematic evaluation of the robustness of kernel-based clustering algorithms shows that some robust kernels, including Tukey’s and Andrews’ robust kernels, perform on par with state-of-the-art algorithms.
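One way to read the link between the RBF kernel and the Welsch M-estimator is sketched below. It is an illustration under stated assumptions, not the implementation from the thesis. With a Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), every mapped point has unit norm in feature space, so the data-space point whose image is closest to the feature-space mean is the one that maximizes the sum of kernel values to the data. That is the same as minimizing the Welsch loss, and the resulting fixed-point update is a Gaussian-weighted average. The function name `welsch_mean`, the bandwidth `sigma`, the median starting point, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def welsch_mean(X, sigma=1.0, n_iter=100, tol=1e-9):
    """Welsch M-estimate of location via iteratively reweighted averaging.

    Hypothetical helper for illustration; sigma is an assumed bandwidth.
    """
    X = np.asarray(X, dtype=float)
    # Start from the coordinate-wise median so the first weights
    # are not dominated by outliers.
    m = np.median(X, axis=0)
    for _ in range(n_iter):
        # Gaussian (RBF) weights: points far from the current estimate
        # receive weights close to zero.
        sq_dist = np.sum((X - m) ** 2, axis=1)
        w = np.exp(-sq_dist / (2.0 * sigma ** 2))
        m_new = w @ X / w.sum()
        converged = np.linalg.norm(m_new - m) < tol
        m = m_new
        if converged:
            break
    return m

# Synthetic data (an assumption for this sketch): a cluster near the
# origin plus a small group of distant outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
outliers = rng.normal(loc=20.0, scale=0.5, size=(20, 2))
X = np.vstack([inliers, outliers])

plain_mean = X.mean(axis=0)            # pulled toward the outliers
robust_mean = welsch_mean(X, sigma=1.0)  # stays near the main cluster

print("sample mean:", plain_mean)
print("Welsch mean:", robust_mean)
```

In this setup the sample mean is dragged away from the main cluster, while the Welsch estimate stays close to it because the outliers receive negligible weight. The choice of `sigma` controls how quickly distant points are discounted, so results depend on it.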