Validating subcellular localization prediction tools with mycobacterial proteins

Background: The computational prediction of mycobacterial proteins' subcellular localization is of key importance for proteome annotation and for the identification of new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few yea...


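Authors:
Restrepo-Montoya, Daniel; Vizcaíno, Carolina; Niño, Luis F.; Ocampo, Marisol; Patarroyo, Manuel-Elkin; Patarroyo, Manuel A.
Resource type:
Article
Publication date: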
2009
Institution:
Universidad del Rosario
Repository:
Repositorio EdocUR - U. Rosario
Language:
eng
OAI Identifier:
oai:repository.urosario.edu.co:10336/21913
Online access:
https://doi.org/10.1186/1471-2105-10-134
https://repository.urosario.edu.co/handle/10336/21913
Keywords:
Microbiology
Computational predictions
Computational tools
Predictive performance
Bacterial Proteins
Mycobacterium
Bacteria (microorganisms)
Rights
License
Open (full text)
id EDOCUR2_b4df60c8bcbec0b99390abda7e677690
oai_identifier_str oai:repository.urosario.edu.co:10336/21913
network_acronym_str EDOCUR2
network_name_str Repositorio EdocUR - U. Rosario
repository_id_str
dc.title.spa.fl_str_mv Validating subcellular localization prediction tools with mycobacterial proteins
title Validating subcellular localization prediction tools with mycobacterial proteins
spellingShingle Validating subcellular localization prediction tools with mycobacterial proteins
Microbiología
Computational predictions
Computational tools
Predictive performance
Bacterial Proteins
Predictive performance
Mycobacterium
Bacteria (microorganisms)
title_short Validating subcellular localization prediction tools with mycobacterial proteins
title_full Validating subcellular localization prediction tools with mycobacterial proteins
title_fullStr Validating subcellular localization prediction tools with mycobacterial proteins
title_full_unstemmed Validating subcellular localization prediction tools with mycobacterial proteins
title_sort Validating subcellular localization prediction tools with mycobacterial proteins
dc.subject.ddc.spa.fl_str_mv Microbiología
topic Microbiología
Computational predictions
Computational tools
Predictive performance
Bacterial Proteins
Predictive performance
Mycobacterium
Bacteria (microorganisms)
dc.subject.keyword.spa.fl_str_mv Computational predictions
Computational tools
Predictive performance
Bacterial Proteins
Predictive performance
Mycobacterium
Bacteria (microorganisms)
description Background: Computational prediction of the subcellular localization of mycobacterial proteins is of key importance for proteome annotation and for identifying new drug targets and vaccine candidates. Several subcellular localization classifiers have been developed over the past few years, comprising both general localization and feature-based classifiers. Here, we validated the ability of different bioinformatics approaches to predict secreted bacterial proteins, using SignalP 2.0, TatP 1.0, LipoP 1.0, Phobius, PA-SUB 2.5, PSORTb v.2.0.4 and Gpos-PLoc. The tools were compared in terms of sensitivity, specificity and Matthews correlation coefficient (MCC) using a set of mycobacterial proteins sharing less than 40% identity, none of which are included in the training data sets of the validated tools and whose subcellular localization has been experimentally confirmed. These proteins were drawn from the training data set of TBpred, a computational tool designed specifically to predict the subcellular localization of mycobacterial proteins.
Results: A final validation set of 272 mycobacterial proteins was obtained from an initial set of 852 mycobacterial proteins. According to the validation metrics, all tools showed specificity above 0.90, while sensitivity and MCC values were more dispersed, all lying above 0.22. PA-SUB 2.5 showed the highest values; however, these results might be biased by the methodology this tool uses. PSORTb v.2.0.4 left 56 proteins unclassified, while Gpos-PLoc left just one protein unclassified.
Conclusion: Both subcellular localization approaches (general localization and feature-based) showed high specificity, that is, they reliably recognized true negatives in the tested data set. Among the tools whose predictions are not based on homology searches against SWISS-PROT, Gpos-PLoc was the general localization tool with the best predictive performance, while SignalP 2.0 was the best of the feature-based tools. Even though PA-SUB 2.5 showed the highest metrics, it should be taken into account that this tool uses all proteins reported in SWISS-PROT, which include the protein set tested in this study, either for its BLAST search or for training its model. © 2009 Restrepo-Montoya et al; licensee BioMed Central Ltd.
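The three validation metrics named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of sensitivity, specificity and MCC for a binary "secreted versus not secreted" call; the function name `confusion_metrics` and the counts in the usage lines are hypothetical and are not taken from the article's results.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and MCC for binary predictions.

    tp/fn: secreted proteins predicted as secreted / not secreted.
    tn/fp: non-secreted proteins predicted as not secreted / secreted.
    """
    # Sensitivity: fraction of truly secreted proteins that were recognized.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-secreted proteins correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and observed classes, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Hypothetical counts, for illustration only (not data from the study).
example = confusion_metrics(tp=30, fp=10, tn=90, fn=20)
print(example)
```

Because MCC combines all four cells of the confusion matrix, a tool can reach specificity above 0.90 and still score a low MCC when its sensitivity is poor, which is consistent with the spread of values the abstract describes. Proteins that a tool leaves unclassified would need an explicit convention (for example, counting them as negatives or excluding them) before these formulas are applied; the sketch does not assume which convention the authors used.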
publishDate 2009
dc.date.created.none.fl_str_mv 2009
dc.date.issued.none.fl_str_mv 2009
dc.date.accessioned.none.fl_str_mv 2020-05-08T03:41:27Z
dc.date.available.none.fl_str_mv 2020-05-08T03:41:27Z
dc.type.eng.fl_str_mv article
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_6501
dc.type.spa.spa.fl_str_mv Artículo
dc.identifier.doi.none.fl_str_mv https://doi.org/10.1186/1471-2105-10-134
dc.identifier.issn.none.fl_str_mv 1471-2105
dc.identifier.uri.none.fl_str_mv https://repository.urosario.edu.co/handle/10336/21913
url https://doi.org/10.1186/1471-2105-10-134
https://repository.urosario.edu.co/handle/10336/21913
identifier_str_mv 1471-2105
dc.language.iso.spa.fl_str_mv eng
language eng
dc.relation.citationTitle.none.fl_str_mv BMC Bioinformatics
dc.relation.citationVolume.none.fl_str_mv Vol. 10
dc.relation.ispartof.spa.fl_str_mv BMC Bioinformatics, ISSN: 1471-2105 Vol. 10, (2009)
dc.relation.uri.spa.fl_str_mv https://bmcbioinformatics.biomedcentral.com/track/pdf/10.1186/1471-2105-10-134
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.acceso.spa.fl_str_mv Abierto (Texto Completo)
rights_invalid_str_mv Abierto (Texto Completo)
http://purl.org/coar/access_right/c_abf2
dc.format.mimetype.none.fl_str_mv application/pdf
institution Universidad del Rosario
dc.source.instname.none.fl_str_mv instname:Universidad del Rosario
dc.source.reponame.none.fl_str_mv reponame:Repositorio Institucional EdocUR
bitstream.url.fl_str_mv https://repository.urosario.edu.co/bitstreams/68b64486-67b7-415d-ae5d-14a341c5e7c3/download
https://repository.urosario.edu.co/bitstreams/ddccfdd1-fcb4-4629-be02-2265c21fff9c/download
https://repository.urosario.edu.co/bitstreams/82e7c88a-7f76-47af-9b2d-0783e583bfe2/download
bitstream.checksum.fl_str_mv 12158a6ad7e5bfe0061657d29646cb05
e48392db17ecb302ee1a461c5884a6f9
b4f87f0bce423ef0fe0daa80af17efe3
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio institucional EdocUR
repository.mail.fl_str_mv edocur@urosario.edu.co
_version_ 1808390955438440448
spelling Validating subcellular localization prediction tools with mycobacterial proteins
Restrepo-Montoya, Daniel; Vizcaíno, Carolina; Niño, Luis F.; Ocampo, Marisol; Patarroyo, Manuel-Elkin; Patarroyo, Manuel A.
ORIGINAL Validating_subcellular_localization.pdf application/pdf 567152 MD5 12158a6ad7e5bfe0061657d29646cb05
TEXT Validating_subcellular_localization.pdf.txt (Extracted text) text/plain 42064 MD5 e48392db17ecb302ee1a461c5884a6f9
THUMBNAIL Validating_subcellular_localization.pdf.jpg (Generated Thumbnail) image/jpeg 4546 MD5 b4f87f0bce423ef0fe0daa80af17efe3
2020-05-13 14:49:05.306 https://repository.urosario.edu.co