Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer
- Authors:
- Salazar Barreto, Diego Armando
- Resource type:
- Doctoral thesis
- Publication date:
- 2022
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: repositorio Uniandes
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/59247
- Online access:
- http://hdl.handle.net/1992/59247
- Keywords:
- Multi-omic integration
Kernel trick
Causal inference
Targeted Learning
Machine Learning
Glioma
Breast cancer
Lung adenocarcinoma
Drug repurposing
Precision medicine
co-clustering
Joint Non-negative Matrix Factorization
Superlearner
data fusion
Engineering
- Rights
- openAccess
- License
- Attribution-NonCommercial 4.0 International
Summary: Currently, several data sources drive the understanding of biological or clinical processes. Although their purpose is to assist in optimal decision-making, they require strategies that facilitate these data sources' integration. For example, in biological sciences, multi-omic data integration has improved the characterization of multiple types of cancers, which guarantees a better diagnosis and treatment. Therefore, integrating data can identify new drug targets and biomarkers, predict phenotypes or improve the design of observational clinical studies. This project aimed to contribute to the state of the art of multi-omics data integration methodologies by coupling various biological data sources (omic data and prior knowledge) using different machine learning algorithms. Our first contribution was to construct a strategy to integrate data sources from two cancer projects. We called this Multi-project and Multi-profile joint Non-negative Matrix Factorization (M&M-jNMF), which has clustering and predicting properties. Second, we applied a non-linear solution using kernels to the jNMF algorithm, which resulted in a more proper biological representation. Third, we proposed the M&M-jNMF based on kernels to improve the properties of this method. Finally, our last goal was to incorporate different multi-omic integration strategies into the Targeted Learning methodology to improve causal estimation and generate new advances in observational studies.