Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer

Author: Salazar Barreto, Diego Armando

Resource type: Doctoral thesis

Publication date: 2022

Institution: Universidad de los Andes

Repository: Séneca: repositorio Uniandes

Language: English

Description
Summary:	Several data sources currently drive the understanding of biological and clinical processes. Although their purpose is to support optimal decision-making, they require strategies that facilitate their integration. For example, in the biological sciences, multi-omic data integration has improved the characterization of multiple types of cancer, which supports better diagnosis and treatment. Integrating data can therefore help identify new drug targets and biomarkers, predict phenotypes, or improve the design of observational clinical studies. This project aimed to contribute to the state of the art in multi-omic data integration methodologies by coupling various biological data sources (omic data and prior knowledge) using different machine learning algorithms. Our first contribution was a strategy to integrate data sources from two cancer projects, called Multi-project and Multi-profile joint Non-negative Matrix Factorization (M&M-jNMF), which has clustering and prediction properties. Second, we applied a non-linear solution using kernels to the jNMF algorithm, which resulted in a more appropriate biological representation. Third, we proposed a kernel-based M&M-jNMF to improve the properties of this method. Finally, our last goal was to incorporate different multi-omic integration strategies into the Targeted Learning methodology to improve causal estimation and generate new advances in observational studies.
