Hierarchical multi-label classification methods for gene function prediction
This dissertation studies the problem of predicting gene functions using a computational approach. The goal is to predict associations between genes and functions, where genes can be associated with multiple biological functions and functions have a hierarchical organization. Four machi...
- Authors:
- Romero González, Miguel Ángel
- Resource type:
- Doctoral thesis
- Publication date:
- 2022
- Institution:
- Pontificia Universidad Javeriana Cali
- Repository:
- Vitela
- Language:
- eng
- OAI Identifier:
- oai:vitela.javerianacali.edu.co:11522/2088
- Online access:
- https://vitela.javerianacali.edu.co/handle/11522/2088
- Keywords:
- Rights
- License
- https://creativecommons.org/licenses/by-nc-nd/4.0/
id | Vitela2_583553f6cde6b4880434dc4709f20f5e |
---|---|
oai_identifier_str | oai:vitela.javerianacali.edu.co:11522/2088 |
network_acronym_str | Vitela2 |
network_name_str | Vitela |
repository_id_str | |
dc.title.eng.fl_str_mv | Hierarchical multi-label classification methods for gene function prediction |
title | Hierarchical multi-label classification methods for gene function prediction |
dc.creator.fl_str_mv | Romero González, Miguel Ángel |
dc.contributor.advisor.none.fl_str_mv | Rocha, Camilo; Finke, Jorge |
dc.contributor.author.none.fl_str_mv | Romero González, Miguel Ángel |
description | This dissertation studies the problem of predicting gene functions using a computational approach. The goal is to predict associations between genes and functions, where genes can be associated with multiple biological functions and functions have a hierarchical organization. Four machine learning methods are developed focusing on different aspects of the problem, which has been modeled as a classification task: (a) considering hierarchical relations between functions to produce consistent predictions; (b) creating new data representations to build predictive models; (c) exploiting paths of functions in the hierarchy to detect missing annotations of genes; and (d) integrating information available for multiple organisms into the classification task. The main contributions of this work include novel methods that (i) overcome the limitations of the combinatorial gene function prediction problem; (ii) can be used to effectively identify associations between genes and functions of different organisms, including those that do not have enough data available to train predictive models; and (iii) help to narrow down the search space for in vivo experiments. These methods have been tested in efforts to predict gene functions in rice and maize, but have been formulated more generally and are applicable to any multi-label classification problem where the classes are organized into a hierarchy. |
publishDate | 2022 |
dc.date.issued.none.fl_str_mv | 2022 |
dc.date.accessioned.none.fl_str_mv | 2024-06-09T15:32:31Z |
dc.date.available.none.fl_str_mv | 2024-06-09T15:32:31Z |
dc.type.coar.none.fl_str_mv | http://purl.org/coar/resource_type/c_db06 |
dc.type.local.none.fl_str_mv | Thesis/Degree work - Monograph - Doctorate |
dc.type.redcol.none.fl_str_mv | https://purl.org/redcol/resource_type/TD |
format | http://purl.org/coar/resource_type/c_db06 |
dc.identifier.uri.none.fl_str_mv | https://vitela.javerianacali.edu.co/handle/11522/2088 |
url | https://vitela.javerianacali.edu.co/handle/11522/2088 |
dc.language.iso.none.fl_str_mv | eng |
language | eng |
dc.rights.uri.none.fl_str_mv | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
dc.rights.creativecommons.none.fl_str_mv | https://creativecommons.org/licenses/by-nc-nd/4.0/ |
dc.rights.accessrights.none.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-nd/4.0/ http://purl.org/coar/access_right/c_abf2 |
dc.format.extent.none.fl_str_mv | 132 p. |
dc.format.mimetype.none.fl_str_mv | application/pdf |
dc.publisher.none.fl_str_mv | Pontificia Universidad Javeriana Cali |
publisher.none.fl_str_mv | Pontificia Universidad Javeriana Cali |
institution | Pontificia Universidad Javeriana Cali |
bitstream.url.fl_str_mv | https://vitela.javerianacali.edu.co/bitstreams/27ec399e-9537-4a23-9ed2-2b5baac433ae/download https://vitela.javerianacali.edu.co/bitstreams/9267409d-d6cd-46b0-b1dd-2ed121920932/download https://vitela.javerianacali.edu.co/bitstreams/3d255232-9a85-4a83-b6bb-82cc31d32b2d/download https://vitela.javerianacali.edu.co/bitstreams/3fb6f769-dcc6-4867-90a0-466804348292/download https://vitela.javerianacali.edu.co/bitstreams/77db32e7-a47b-4e07-b5aa-f41811897d7f/download https://vitela.javerianacali.edu.co/bitstreams/05cd92a5-f3cc-4e90-a435-5141407061e9/download https://vitela.javerianacali.edu.co/bitstreams/fcd75aee-6ff1-427b-83a0-947ab8d40c9a/download |
bitstream.checksum.fl_str_mv | 8a4605be74aa9ea9d79846c1fba20a33 707bb2e571e005aa5748acf38c7f7a1c 9bee94053383c448f8d6491140dc70e3 c69bfeb6aa70ab9b10b52cba0e88d46e d9e38ee46fb9c2ca6b0166a154f8a10b 5069dbf962fbfc09d7e4b1aeee07d6bc 3bef58f954a702760faa8b7b493c587e |
bitstream.checksumAlgorithm.fl_str_mv | MD5 MD5 MD5 MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv | Repositorio Vitela |
repository.mail.fl_str_mv | vitela.mail@javerianacali.edu.co |
_version_ | 1812095057343283200 |
spelling | Facultad de Ingeniería y Ciencias. Doctorado en Ingeniería y Ciencias Aplicadas; Doctorate; LICENSE: license.txt (text/plain; charset=utf-8, 1748); ORIGINAL: MiguelRomero_Tesis.pdf (application/pdf, 1802057), Licencia_autorizacion_biblioteca.docx(1).pdf (application/pdf, 119268); TEXT: MiguelRomero_Tesis.pdf.txt (Extracted text, text/plain, 100692), Licencia_autorizacion_biblioteca.docx(1).pdf.txt (Extracted text, text/plain, 4748); THUMBNAIL: MiguelRomero_Tesis.pdf.jpg (Generated Thumbnail, image/jpeg, 4041), Licencia_autorizacion_biblioteca.docx(1).pdf.jpg (Generated Thumbnail, image/jpeg, 5193); 2024-06-25 05:13:51.587; open.access |
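
Aim (a) in the abstract, producing predictions that are consistent with the function hierarchy, can be illustrated with a minimal Python sketch. This is not the dissertation's method, which this record does not describe. It shows one common post-processing step: capping each function's score by the scores of its ancestors, so that a gene is never assigned a function without also being assigned every more general function above it. The function name `enforce_hierarchy`, the identifiers `f_root`, `f_a`, `f_b`, the scores, and the 0.5 threshold are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def enforce_hierarchy(scores, parents):
    """Cap each function's score by the scores of its ancestors.

    scores:  dict mapping function id -> raw score for one gene
    parents: dict mapping function id -> iterable of direct parent ids
             (assumption: every parent id also appears in `scores`)
    """
    # TopologicalSorter expects node -> predecessors, so parents are visited first.
    graph = {f: set(parents.get(f, ())) for f in scores}
    consistent = {}
    for f in TopologicalSorter(graph).static_order():
        # A root has no parents, so its cap defaults to 1.0 (no restriction).
        cap = min((consistent[p] for p in graph[f]), default=1.0)
        consistent[f] = min(scores[f], cap)
    return consistent


# Hypothetical three-level chain: f_root -> f_a -> f_b.
parents = {"f_root": [], "f_a": ["f_root"], "f_b": ["f_a"]}
raw = {"f_root": 0.9, "f_a": 0.4, "f_b": 0.7}

fixed = enforce_hierarchy(raw, parents)
predicted = {f for f, s in fixed.items() if s >= 0.5}
print(fixed)
print(predicted)
```

In the raw scores, `f_b` would be predicted at a 0.5 threshold while its parent `f_a` would not, which violates the hierarchy. After capping, `f_b` inherits the lower score of `f_a`, so any threshold yields a label set that is closed under ancestors.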