Linear methods of dimension reduction for classification

Authors:
Ramírez Garrido, Diego Alejandro
Resource type:
Undergraduate thesis
Publication date:
2024
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/75195
Online access:
https://hdl.handle.net/1992/75195
Keywords:
Dimension Reduction
Dimensionality Reduction
Wasserstein Distance
Sinkhorn Divergence
Subgradient Descent
Binary Classification
Optimal Transport
Mathematics
Rights:
openAccess
License:
Attribution 4.0 International
Description
Summary: For classification problems, traditional dimension reduction methods often take into account only the feature information while ignoring the class labels, which leaves room for improvement. In this thesis, we explore new methods that aim to find linear orthogonal projections that maximize optimal transport distances (Wasserstein and Sinkhorn) between the feature subsamples corresponding to the classes. These methods employ subgradient ascent and stochastic subgradient ascent algorithms. We detail the calculation of the subgradient of these distances with respect to the projection and implement the methods in Python. To validate our approach, we test the methods on several datasets. Our results demonstrate that the proposed methods effectively enhance classification performance by incorporating class information into the dimension reduction process.
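
The following is a minimal sketch of how such a procedure can be organized in Python; it is not the thesis implementation. The helper names sinkhorn_plan and projected_ot_ascent, the uniform sample weights, the entropic regularization reg, the step size lr, and the QR retraction are all assumptions made for illustration. The subgradient follows a Danskin-style argument: the optimal transport plan is held fixed while the squared-Euclidean transport cost is differentiated with respect to the projection.

    import numpy as np

    def sinkhorn_plan(X, Y, reg=0.1, n_iter=200):
        # Entropic optimal transport plan between two point clouds with
        # uniform weights and squared Euclidean ground cost.
        C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # (n, m) cost matrix
        K = np.exp(-C / reg)                                # Gibbs kernel
        a = np.full(len(X), 1.0 / len(X))                   # uniform source weights
        b = np.full(len(Y), 1.0 / len(Y))                   # uniform target weights
        v = np.ones(len(Y))
        for _ in range(n_iter):                             # Sinkhorn fixed-point updates
            u = a / (K @ v)
            v = b / (K.T @ u)
        return u[:, None] * K * v[None, :]                  # transport plan P

    def projected_ot_ascent(X0, X1, k=2, reg=0.1, lr=0.05, n_steps=100, seed=0):
        # Subgradient ascent over orthonormal projections A (d x k) that
        # maximizes the entropic OT cost between the projected class subsamples.
        rng = np.random.default_rng(seed)
        d = X0.shape[1]
        A, _ = np.linalg.qr(rng.standard_normal((d, k)))    # random orthonormal start
        D = X0[:, None, :] - X1[None, :, :]                 # (n, m, d) pairwise differences
        for _ in range(n_steps):
            P = sinkhorn_plan(X0 @ A, X1 @ A, reg=reg)
            # Hold the plan fixed; then d/dA sum_ij P_ij ||A^T (x_i - y_j)||^2
            # equals 2 * (sum_ij P_ij d_ij d_ij^T) A.
            M = np.einsum('ij,ijd,ije->de', P, D, D)
            A, _ = np.linalg.qr(A + lr * (2.0 * M @ A))     # ascent step + QR retraction
        return A

    # Hypothetical usage on two toy class samples in 10 dimensions:
    X0 = np.random.default_rng(1).standard_normal((60, 10))
    X1 = np.random.default_rng(2).standard_normal((60, 10)) + 1.0
    A = projected_ot_ascent(X0, X1, k=2)
    print(A.T @ A)  # close to the identity: the columns remain orthonormal

Under these assumptions, the QR factorization serves as a retraction that keeps the projection orthonormal after every ascent step, and the stochastic variant mentioned in the summary would simply draw fresh subsamples of X0 and X1 at each iteration before computing the plan.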