Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer
Currently, several data sources drive the understanding of biological or clinical processes. Although their purpose is to assist in optimal decision-making, they require strategies that facilitate these data sources' integration. For example, in biological sciences, multi-omic data integration has i...
- Authors:
- Salazar Barreto, Diego Armando
- Resource type:
- Doctoral thesis
- Publication date:
- 2022
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: repositorio Uniandes
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/59247
- Online access:
- http://hdl.handle.net/1992/59247
- Keywords:
- Multi-omic integration
Kernel trick
Causal inference
Targeted Learning
Machine Learning
Glioma
Breast cancer
Lung adenocarcinoma
Drug repurposing
Precision medicine
co-clustering
Joint Non-negative Matrix Factorization
Superlearner
data fusion
Ingeniería
- Rights
- openAccess
- License
- Attribution-NonCommercial 4.0 International
id | UNIANDES2_8367dfe26a6a4d74d2b4e1e47a2172e9 |
---|---|
oai_identifier_str | oai:repositorio.uniandes.edu.co:1992/59247 |
network_acronym_str | UNIANDES2 |
network_name_str | Séneca: repositorio Uniandes |
repository_id_str | |
dc.title.none.fl_str_mv | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
title | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
spellingShingle | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer; Multi-omic integration; Kernel trick; Causal inference; Targeted Learning; Machine Learning; Glioma; Breast cancer; Lung adenocarcinoma; Drug repurposing; Precision medicine; co-clustering; Joint Non-negative Matrix Factorization; Superlearner; data fusion; Ingeniería |
title_short | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
title_full | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
title_fullStr | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
title_full_unstemmed | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
title_sort | Multi-omic data integration using joint non-negative matrix and machine learning methods for clinical endpoints prediction and causal parameter estimation in cancer |
dc.creator.fl_str_mv | Salazar Barreto, Diego Armando |
dc.contributor.advisor.none.fl_str_mv | Valencia Arboleda, Carlos Felipe; Díaz Muñoz, Iván Leonardo |
dc.contributor.author.none.fl_str_mv | Salazar Barreto, Diego Armando |
dc.contributor.jury.none.fl_str_mv | Duitama Castellanos, Jorge Alexander; Przulj, Natasa; Vallejo Ardila, Dora Lucía; Flórez Vargas, Oscar |
dc.contributor.researchgroup.es_CO.fl_str_mv | Centro para la Optimización y la Probabilidad Aplicada |
dc.subject.keyword.none.fl_str_mv | Multi-omic integration; Kernel trick; Causal inference; Targeted Learning; Machine Learning; Glioma; Breast cancer; Lung adenocarcinoma; Drug repurposing; Precision medicine; co-clustering; Joint Non-negative Matrix Factorization; Superlearner; data fusion |
topic | Multi-omic integration; Kernel trick; Causal inference; Targeted Learning; Machine Learning; Glioma; Breast cancer; Lung adenocarcinoma; Drug repurposing; Precision medicine; co-clustering; Joint Non-negative Matrix Factorization; Superlearner; data fusion; Ingeniería |
dc.subject.themes.es_CO.fl_str_mv | Ingeniería |
description | Currently, several data sources drive the understanding of biological or clinical processes. Although their purpose is to assist in optimal decision-making, they require strategies that facilitate these data sources' integration. For example, in biological sciences, multi-omic data integration has improved the characterization of multiple types of cancers, which guarantees a better diagnosis and treatment. Therefore, integrating data can identify new drug targets and biomarkers, predict phenotypes or improve the design of observational clinical studies. This project aimed to contribute to the state of the art of multi-omics data integration methodologies by coupling various biological data sources (omic data and prior knowledge) using different machine learning algorithms. Our first contribution was to construct a strategy to integrate data sources from two cancer projects. We called this Multi-project and Multi-profile joint Non-negative Matrix Factorization (M&M-jNMF), which has clustering and predicting properties. Second, we applied a non-linear solution using kernels to the jNMF algorithm, which resulted in a more proper biological representation. Third, we proposed the M&M-jNMF based on kernels to improve the properties of this method. Finally, our last goal was to incorporate different multi-omic integration strategies into the Targeted Learning methodology to improve causal estimation and generate new advances in observational studies. |
publishDate | 2022 |
dc.date.accessioned.none.fl_str_mv | 2022-07-27T21:55:58Z |
dc.date.available.none.fl_str_mv | 2022-07-27T21:55:58Z |
dc.date.issued.none.fl_str_mv | 2022-06-30 |
dc.type.es_CO.fl_str_mv | Trabajo de grado - Doctorado |
dc.type.driver.none.fl_str_mv | info:eu-repo/semantics/doctoralThesis |
dc.type.version.none.fl_str_mv | info:eu-repo/semantics/acceptedVersion |
dc.type.coar.none.fl_str_mv | http://purl.org/coar/resource_type/c_db06 |
dc.type.content.es_CO.fl_str_mv | Text |
dc.type.redcol.none.fl_str_mv | https://purl.org/redcol/resource_type/TD |
format | http://purl.org/coar/resource_type/c_db06 |
status_str | acceptedVersion |
dc.identifier.uri.none.fl_str_mv | http://hdl.handle.net/1992/59247 |
dc.identifier.doi.none.fl_str_mv | 10.57784/1992/59247 |
dc.identifier.instname.es_CO.fl_str_mv | instname:Universidad de los Andes |
dc.identifier.reponame.es_CO.fl_str_mv | reponame:Repositorio Institucional Séneca |
dc.identifier.repourl.es_CO.fl_str_mv | repourl:https://repositorio.uniandes.edu.co/ |
url | http://hdl.handle.net/1992/59247 |
identifier_str_mv | 10.57784/1992/59247; instname:Universidad de los Andes; reponame:Repositorio Institucional Séneca; repourl:https://repositorio.uniandes.edu.co/ |
dc.language.iso.es_CO.fl_str_mv | eng |
language | eng |
dc.rights.license.spa.fl_str_mv | Atribución-NoComercial 4.0 Internacional |
dc.rights.uri.*.fl_str_mv | http://creativecommons.org/licenses/by-nc/4.0/ |
dc.rights.accessrights.spa.fl_str_mv | info:eu-repo/semantics/openAccess |
dc.rights.coar.spa.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
rights_invalid_str_mv | Atribución-NoComercial 4.0 Internacional; http://creativecommons.org/licenses/by-nc/4.0/; http://purl.org/coar/access_right/c_abf2 |
eu_rights_str_mv | openAccess |
dc.format.extent.es_CO.fl_str_mv | 129 pages |
dc.format.mimetype.es_CO.fl_str_mv | application/pdf |
dc.publisher.es_CO.fl_str_mv | Universidad de los Andes |
dc.publisher.program.es_CO.fl_str_mv | Doctorado en Ingeniería |
dc.publisher.faculty.es_CO.fl_str_mv | Facultad de Ingeniería |
dc.publisher.department.es_CO.fl_str_mv | Departamento de Ingeniería Industrial |
institution | Universidad de los Andes |
bitstream.url.fl_str_mv | https://repositorio.uniandes.edu.co/bitstreams/e55145ad-5bad-45fd-81a5-1303ce91e44c/download https://repositorio.uniandes.edu.co/bitstreams/fb3746a7-afdc-4fc5-8949-c24dfee3ced8/download https://repositorio.uniandes.edu.co/bitstreams/3c4867bd-aeec-422d-9b21-6c47168a781d/download https://repositorio.uniandes.edu.co/bitstreams/da809cfb-7c92-4a59-8779-80bec36a35c3/download https://repositorio.uniandes.edu.co/bitstreams/048755f2-2169-4d0a-af16-ac5717ae02b3/download https://repositorio.uniandes.edu.co/bitstreams/a9e01172-1a20-469c-82cc-a9994c89c66a/download https://repositorio.uniandes.edu.co/bitstreams/821d9553-5626-4db3-9fa7-ee03efc1cfde/download https://repositorio.uniandes.edu.co/bitstreams/2f500c7f-c19a-4d1d-b89a-f1ac8fac283b/download |
bitstream.checksum.fl_str_mv | 5aa5c691a1ffe97abd12c2966efcb8d6 a2f838efe1c9aae7a9dbe3378f48b26f f3eec2967285815caf7a5ce6e84c60ad 24013099e9e6abb1575dc6ce0855efd5 d80e45a330b9027fe495c5c6a4a6c9e0 4491fe1afb58beaaef41a73cf7ff2e27 b2c420a011f4b64ed2095e767b7ede98 9dc2d3e9d1529269d9060c4a21e34da5 |
bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv | Repositorio institucional Séneca |
repository.mail.fl_str_mv | adminrepositorio@uniandes.edu.co |
_version_ | 1812133921106690048 |
spelling | COLCIENCIAS convocatoria No. 785; Doctor en Ingeniería; Doctorado; Health systems |