Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences

Motivation: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new nonredund...


Authors:
Resource type:
Publication date:
2015
Institution:
Universidad Tecnológica de Bolívar
Repository:
Repositorio Institucional UTB
Language:
eng
OAI Identifier:
oai:repositorio.utb.edu.co:20.500.12585/8759
Online access:
https://hdl.handle.net/20.500.12585/8759
Keywords:
Antimicrobial cationic peptide
Algorithm
Chemistry
Human
Nucleic acid database
Procedures
Protein database
Sequence analysis
Software
Algorithms
Antimicrobial cationic peptides
Databases, Nucleic Acid
Databases, Protein
Humans
Sequence Analysis, Protein
Software
Rights
openAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
id UTB2_d1933adf50f32d688e59bf621b4ad614
oai_identifier_str oai:repositorio.utb.edu.co:20.500.12585/8759
network_acronym_str UTB2
network_name_str Repositorio Institucional UTB
repository_id_str
dc.title.none.fl_str_mv Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences
title Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences
dc.subject.keywords.none.fl_str_mv Antimicrobial cationic peptide
Algorithm
Chemistry
Human
Nucleic acid database
Procedures
Protein database
Sequence analysis
Software
Algorithms
Antimicrobial cationic peptides
Databases, Nucleic Acid
Databases, Protein
Humans
Sequence Analysis, Protein
Software
description Motivation: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new nonredundant sequence database. For this purpose, a new software tool is introduced. Results: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP-Patent database are included in CAMP-Patent. However, the majority of databases have their own set of unique sequences, as well as some overlap with other databases. The complete set of non-duplicate sequences comprises 16 990 cases, which is almost half of the total number of reported peptides. On the other hand, the diversity analysis identifies the most and least diverse databases and proves that all databases exhibit some level of redundancy. Finally, we present a new parallel-free software, named Dover Analyzer, developed to compute the overlap and diversity between any number of databases and compile a set of non-redundant sequences. These results are useful for selecting or building a suitable representative set of AMPs, according to specific needs. © The Author 2015. Published by Oxford University Press. All rights reserved.
publishDate 2015
dc.date.issued.none.fl_str_mv 2015
dc.date.accessioned.none.fl_str_mv 2019-11-06T19:05:19Z
dc.date.available.none.fl_str_mv 2019-11-06T19:05:19Z
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.coar.fl_str_mv http://purl.org/coar/resource_type/c_2df8fbb1
dc.type.driver.none.fl_str_mv info:eu-repo/semantics/article
dc.type.hasversion.none.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.spa.none.fl_str_mv Artículo
status_str publishedVersion
dc.identifier.citation.none.fl_str_mv Bioinformatics; Vol. 31, Núm. 15; pp. 2553-2559
dc.identifier.issn.none.fl_str_mv 1367-4803
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/20.500.12585/8759
dc.identifier.doi.none.fl_str_mv 10.1093/bioinformatics/btv180
dc.identifier.instname.none.fl_str_mv Universidad Tecnológica de Bolívar
dc.identifier.reponame.none.fl_str_mv Repositorio UTB
identifier_str_mv Bioinformatics; Vol. 31, Núm. 15; pp. 2553-2559
1367-4803
10.1093/bioinformatics/btv180
Universidad Tecnológica de Bolívar
Repositorio UTB
url https://hdl.handle.net/20.500.12585/8759
dc.language.iso.none.fl_str_mv eng
language eng
dc.rights.coar.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.uri.none.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.none.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.cc.none.fl_str_mv Atribución-NoComercial 4.0 Internacional
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
Atribución-NoComercial 4.0 Internacional
http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv openAccess
dc.format.medium.none.fl_str_mv Recurso electrónico
dc.format.mimetype.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Oxford University Press
publisher.none.fl_str_mv Oxford University Press
dc.source.none.fl_str_mv https://www2.scopus.com/inward/record.uri?eid=2-s2.0-84943601205&doi=10.1093%2fbioinformatics%2fbtv180&partnerID=40&md5=141f38519649235606ce49338df9709e
Scopus 56035076700
Scopus 55665599200
Scopus 56897175400
Scopus 56035437800
Scopus 7102811634
Scopus 55363486500
Scopus 56896521800
institution Universidad Tecnológica de Bolívar
bitstream.url.fl_str_mv https://repositorio.utb.edu.co/bitstream/20.500.12585/8759/1/DOI10_1093bioinformaticsbtv180.pdf
https://repositorio.utb.edu.co/bitstream/20.500.12585/8759/4/DOI10_1093bioinformaticsbtv180.pdf.txt
https://repositorio.utb.edu.co/bitstream/20.500.12585/8759/5/DOI10_1093bioinformaticsbtv180.pdf.jpg
bitstream.checksum.fl_str_mv b597c548ffda4d59dc2eb405bbc7ebd0
e675d00a29cec4b65a1014ce0b56c6f8
1c2091101bfcc539d6a9582b37e1780e
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional UTB
repository.mail.fl_str_mv repositorioutb@utb.edu.co
_version_ 1808397574585974784
spelling Marrero-Ponce, Y.
Marrero-Ponce, Y.
Tellez-Ibarra, R.
Llorente-Quesada, M.T.
Salgado, J.
Barigye, S.J.
Liu, J.
Antimicrobial Cationic Peptides
http://purl.org/coar/resource_type/c_6501
ORIGINAL DOI10_1093bioinformaticsbtv180.pdf application/pdf 296423
TEXT DOI10_1093bioinformaticsbtv180.pdf.txt Extracted text text/plain 38711
THUMBNAIL DOI10_1093bioinformaticsbtv180.pdf.jpg Generated Thumbnail image/jpeg 82812
2023-05-26 16:14:27.37