Inter-Battery Factor Analysis via PLS: The Missing Data Case

Authors:
Gonzalez Rojas, Victor Manuel
Resource type:
Journal article
Publication date:
2016
Institution:
Universidad Nacional de Colombia
Repository:
Universidad Nacional de Colombia
Language:
spa
OAI Identifier:
oai:repositorio.unal.edu.co:unal/66513
Online access:
https://repositorio.unal.edu.co/handle/unal/66513
http://bdigital.unal.edu.co/67541/
Keywords:
51 Mathematics
31 General statistics collections
Interbattery
IBA
PLS2
NIPALS
algorithm
convergence
missing data
Partial least squares regression
Rights
openAccess
License
Attribution-NonCommercial 4.0 International
Description
Summary: In this article we develop Inter-battery Factor Analysis (IBA) using PLS (Partial Least Squares) methods. Because PLS methods are algorithms that iterate until convergence, a suitable intervention in some of their stages provides a solution to problems such as missing data. Specifically, we take the iterative stage of PLS regression and implement the "available data" principle from the NIPALS (Non-linear estimation by Iterative Partial Least Squares) algorithm, which allows an algorithmic development of IBA with missing data. We provide the basic elements needed to analyse and interpret the results correctly. This new algorithm for IBA, developed in the R programming environment, essentially executes convergent iterative sequences of orthogonal projections of vectors computed over the available data, and works adequately on data sets with or without missing data. To present the basic concepts of IBA and to cross-check the results of the algorithm, we use the complete Linnerud data set for the classical analysis; we then contaminate this data set with a random sample of non-available (NA) values, amounting to approximately 7% of the data, for the analysis with missing data. We verify that the results obtained from the algorithm running on complete data are exactly the same as those obtained from the classical method for IBA, and that the results with missing data are similar. However, this might not always be the case, since it depends on how much the original factorial covariance structure is affected by the absence of information. The interpretation is therefore valid only in relation to the available data.
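To make the "available data" idea in the summary concrete, the following is a minimal sketch in Python, not the author's R implementation. It assumes two column-centred batteries stored as lists of rows, with missing entries encoded as NaN. The function names (`project`, `first_interbattery_pair`) and the stopping rule are hypothetical choices made for illustration. Each projection is a regression slope through the origin computed only over the entries that are present, and the loop alternates such projections between the two batteries until the X weights stabilise. With complete data this reduces to a power iteration on the cross-product matrix of the two batteries.

```python
import math


def project(rows, v):
    """For each row, slope of the row regressed on v (through the origin),
    using only the positions where the row entry is available (not NaN)."""
    out = []
    for row in rows:
        num = den = 0.0
        for x, vk in zip(row, v):
            if not math.isnan(x):
                num += x * vk
                den += vk * vk
        out.append(num / den if den > 0 else 0.0)
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize(v):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def first_interbattery_pair(X, Y, tol=1e-10, max_iter=500):
    """First pair of weights (w, c) and scores (t, u) for batteries X and Y.
    Assumes centred columns and a first Y column that is not all zero."""
    Xt, Yt = transpose(X), transpose(Y)
    # Start from the first column of Y, with missing entries set to zero.
    u = [0.0 if math.isnan(row[0]) else row[0] for row in Y]
    w = None
    for _ in range(max_iter):
        w_new = normalize(project(Xt, u))  # X weights from the Y scores
        t = project(X, w_new)              # X scores
        c = normalize(project(Yt, t))      # Y weights from the X scores
        u = project(Y, c)                  # Y scores
        converged = w is not None and max(
            abs(a - b) for a, b in zip(w, w_new)) < tol
        w = w_new
        if converged:
            break
    return w, c, t, u


# Small centred toy batteries (invented values, not the Linnerud data).
X = [[1.0, 2.0], [-1.0, 0.0], [0.0, -2.0], [2.0, 1.0], [-2.0, -1.0]]
Y = [[1.0, 0.0], [-2.0, 1.0], [0.0, -1.0], [2.0, 1.0], [-1.0, -1.0]]
w_full, c_full, t_full, u_full = first_interbattery_pair(X, Y)

# Same batteries with one entry of X removed.
X_na = [row[:] for row in X]
X_na[1][1] = math.nan
w_na, c_na, t_na, u_na = first_interbattery_pair(X_na, Y)
```

Comparing `w_full` with `w_na` gives a feel for the caveat in the summary: the weights estimated with missing entries depend on how much the removed values alter the cross-covariance structure.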