NClassG+ : A classifier for non-classically secreted Gram-positive bacterial proteins

Background: Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes...


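Authors:
Restrepo-Montoya, Daniel; Pino, Camilo; Nino, Luis F; Patarroyo, Manuel-Elkin; Patarroyo, Manuel A.
Resource type:
Article
Publication date: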
2011
Institution:
Universidad del Rosario
Repository:
Repositorio EdocUR - U. Rosario
Language:
eng
OAI Identifier:
oai:repository.urosario.edu.co:10336/21891
Online access:
https://doi.org/10.1186/1471-2105-12-21
https://repository.urosario.edu.co/handle/10336/21891
Keywords:
Diseases
Microbiology
Support vector machine
Dipeptide
Matthews correlation coefficient
Gaussian Kernel function
Rights
License
Open (Full Text)
id EDOCUR2_b28ea664a282dcd97be58d58a340ad01
oai_identifier_str oai:repository.urosario.edu.co:10336/21891
network_acronym_str EDOCUR2
network_name_str Repositorio EdocUR - U. Rosario
repository_id_str
author Restrepo-Montoya, Daniel; Pino, Camilo; Nino, Luis F; Patarroyo, Manuel-Elkin; Patarroyo, Manuel A.
dc.title.spa.fl_str_mv NClassG+ : A classifier for non-classically secreted Gram-positive bacterial proteins
dc.subject.ddc.spa.fl_str_mv Diseases
Microbiology
dc.subject.keyword.spa.fl_str_mv Support vector machine
Dipeptide
Matthews correlation coefficient
Gaussian Kernel function
description Background: Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins. Results: Several feature-based classifiers were trained using different sequence transformation vectors (frequencies, dipeptides, physicochemical factors and PSSM) and Support Vector Machines (SVMs) with Linear, Polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV loop to compute the error. The parameters and kernel functions and the combinations between all possible feature vectors were optimized using grid search. Conclusions: The final model was tested against an independent set not previously seen by the model, obtaining better predictive performance compared to SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/. © 2011 Restrepo-Montoya et al; licensee BioMed Central Ltd.
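The abstract names three ingredients that a short sketch can make concrete: a dipeptide feature vector, the train/test structure of nested k-fold cross-validation, and the Matthews correlation coefficient listed among the keywords. The Python below is an illustrative sketch only, not the authors' implementation. All function names, the fold counts, and the choice to skip non-standard residues are assumptions made for the example, and the SVM training and grid search steps are deliberately left out.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and all 400 ordered pairs (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide relative frequencies."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def k_fold_indices(indices, k):
    """Split a list of indices into k folds, yielding (train, held_out)."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


def nested_cv_splits(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    The inner splits partition only the outer training indices, so the
    outer test fold is never used while tuning model parameters.
    """
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        inner_splits = list(k_fold_indices(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In a full pipeline, each inner split would be used to score one parameter combination from the grid (for example a kernel and its settings), and the best combination would then be refit on the outer training indices and scored once on the outer test fold.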
publishDate 2011
dc.date.created.none.fl_str_mv 2011
dc.date.issued.none.fl_str_mv 2011
dc.date.accessioned.none.fl_str_mv 2020-05-07T13:44:02Z
dc.date.available.none.fl_str_mv 2020-05-07T13:44:02Z
dc.type.eng.fl_str_mv article
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_6501
dc.type.spa.spa.fl_str_mv Artículo
dc.identifier.doi.none.fl_str_mv https://doi.org/10.1186/1471-2105-12-21
dc.identifier.issn.none.fl_str_mv 1471-2105
dc.identifier.uri.none.fl_str_mv https://repository.urosario.edu.co/handle/10336/21891
url https://doi.org/10.1186/1471-2105-12-21
https://repository.urosario.edu.co/handle/10336/21891
identifier_str_mv 1471-2105
dc.language.iso.spa.fl_str_mv eng
language eng
dc.relation.citationTitle.none.fl_str_mv BMC Bioinformatics
dc.relation.citationVolume.none.fl_str_mv Vol. 12
dc.relation.ispartof.spa.fl_str_mv BMC Bioinformatics, ISSN: 1471-2105 Vol. 12, (2011)
dc.relation.uri.spa.fl_str_mv https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-21
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.acceso.spa.fl_str_mv Open (Full Text)
dc.format.mimetype.none.fl_str_mv application/pdf
institution Universidad del Rosario
dc.source.instname.none.fl_str_mv instname:Universidad del Rosario
dc.source.reponame.none.fl_str_mv reponame:Repositorio Institucional EdocUR
bitstream.url.fl_str_mv https://repository.urosario.edu.co/bitstreams/ae8edfe3-743a-4bf6-b0ed-631fff895e34/download
https://repository.urosario.edu.co/bitstreams/c6936878-af7e-4edf-b952-b8a52c70bf09/download
https://repository.urosario.edu.co/bitstreams/0b814e8a-2a2c-4fb8-8cf7-27994d607c2b/download
bitstream.checksum.fl_str_mv a8701723a64f12afb336cfb7a2bd2333
93d2cdd7bdb85c428a1cfc62ac2b2fe1
912b061cb0eb0b4db6c9a8b089b8d37d
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio institucional EdocUR
repository.mail.fl_str_mv edocur@urosario.edu.co
_version_ 1808390984286863360