Characterizing and predicting catalytic residues in enzyme active sites based on local properties: a machine learning approach
Developing computational methods for assigning protein function from tertiary structure is a very important problem, predicting a catalytic mechanism based only on structural information being a particularly challenging task. This work focuses on helping to understand the molecular basis of catalysi...
- Authors:
- Bobadilla, Leonardo
Nino, Fernando
Cepeda, Edilberto
Patarroyo, Manuel A.
- Resource type:
- Book part
- Publication date:
- 2007
- Institution:
- Universidad del Rosario
- Repository:
- Repositorio EdocUR - U. Rosario
- Language:
- eng
- OAI Identifier:
- oai:repository.urosario.edu.co:10336/28861
- Online access:
- https://doi.org/10.1109/BIBE.2007.4375671
https://repository.urosario.edu.co/handle/10336/28861
- Keywords:
- Biochemistry
Machine learning
Sequences
Genomics
Bioinformatics
Predictive models
Protein engineering
Nuclear magnetic resonance
Data mining
Crystallography
- Rights
- License
- Restricted (access limited to specific groups)
id |
EDOCUR2_e3d4aeccdbabd52f9ea66049c99094b9 |
---|---|
oai_identifier_str |
oai:repository.urosario.edu.co:10336/28861 |
network_acronym_str |
EDOCUR2 |
network_name_str |
Repositorio EdocUR - U. Rosario |
repository_id_str |
|
authors |
Bobadilla, Leonardo; Nino, Fernando; Cepeda, Edilberto; Patarroyo, Manuel A. |
dc.title.spa.fl_str_mv |
Characterizing and predicting catalytic residues in enzyme active sites based on local properties: a machine learning approach |
dc.title.TranslatedTitle.spa.fl_str_mv |
Caracterización y predicción de residuos catalíticos en sitios activos de enzimas según propiedades locales: un enfoque de aprendizaje automático |
dc.subject.keyword.spa.fl_str_mv |
Biochemistry Machine learning Sequences Genomics Bioinformatics Predictive models Protein engineering Nuclear magnetic resonance Data mining Crystallography |
description |
Developing computational methods for assigning protein function from tertiary structure is an important problem, and predicting a catalytic mechanism from structural information alone is particularly challenging. This work aims to help explain the molecular basis of catalysis by exploring the nature of catalytic residues, their environment, and their characteristic properties in a large data set of enzyme structures, and by using this information to predict the active sites of enzyme structures. A machine learning approach that performs feature extraction, clustering, and classification on a protein structure data set is proposed. The 6,376 residues directly involved in enzyme catalysis, present in more than 800 protein structures in the PDB, were analyzed. Feature extraction provided a description of critical features for each catalytic residue, which were consistent with prior knowledge about them. Results from k-fold cross-validation for classification showed more than 80% accuracy. Complete enzymes were scanned using these classifiers to locate catalytic residues. |
publishDate |
2007 |
dc.date.created.spa.fl_str_mv |
2007-11-05 |
dc.date.accessioned.none.fl_str_mv |
2020-08-28T15:49:56Z |
dc.date.available.none.fl_str_mv |
2020-08-28T15:49:56Z |
dc.type.eng.fl_str_mv |
bookPart |
dc.type.coarversion.fl_str_mv |
http://purl.org/coar/version/c_970fb48d4fbd8a85 |
dc.type.coar.fl_str_mv |
http://purl.org/coar/resource_type/c_3248 |
dc.type.spa.spa.fl_str_mv |
Book part |
dc.identifier.doi.none.fl_str_mv |
https://doi.org/10.1109/BIBE.2007.4375671 |
dc.identifier.issn.none.fl_str_mv |
ISBN: 1-4244-1509-8 EISBN: 978-1-4244-1509-0 |
dc.identifier.uri.none.fl_str_mv |
https://repository.urosario.edu.co/handle/10336/28861 |
dc.language.iso.spa.fl_str_mv |
eng |
dc.relation.citationEndPage.none.fl_str_mv |
945 |
dc.relation.citationStartPage.none.fl_str_mv |
938 |
dc.relation.citationTitle.none.fl_str_mv |
2007 IEEE 7th International Symposium on BioInformatics and BioEngineering |
dc.relation.ispartof.spa.fl_str_mv |
IEEE 7th International Symposium on BioInformatics and BioEngineering, ISBN: 1-4244-1509-8;EISBN: 978-1-4244-1509-0 (2007); pp. 938-945 |
dc.relation.uri.spa.fl_str_mv |
https://ieeexplore.ieee.org/document/4375671 |
dc.rights.coar.fl_str_mv |
http://purl.org/coar/access_right/c_16ec |
dc.rights.acceso.spa.fl_str_mv |
Restricted (access limited to specific groups) |
dc.format.mimetype.none.fl_str_mv |
application/pdf |
dc.publisher.spa.fl_str_mv |
IEEE |
dc.source.spa.fl_str_mv |
2007 IEEE 7th International Symposium on BioInformatics and BioEngineering |
institution |
Universidad del Rosario |
dc.source.instname.none.fl_str_mv |
instname:Universidad del Rosario |
dc.source.reponame.none.fl_str_mv |
reponame:Repositorio Institucional EdocUR |
repository.name.fl_str_mv |
Repositorio institucional EdocUR |
repository.mail.fl_str_mv |
edocur@urosario.edu.co |
_version_ |
1814167431500791808 |
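
The description field above reports classification accuracy estimated by k-fold cross-validation. As a rough illustration of that evaluation procedure only, the following sketch runs 5-fold cross-validation over synthetic two-dimensional feature vectors. Everything in it is an assumption made for illustration: the record does not say which classifier, features, or value of k the authors used, so a simple nearest-centroid classifier stands in, the data are randomly generated, and all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def cross_validate(features, labels, k=5, seed=0):
    """Mean accuracy over k folds; each fold is held out once for testing."""
    folds = k_fold_indices(len(features), k, seed)
    accuracies = []
    for held_out in folds:
        test = set(held_out)
        # Group the training vectors (everything outside the fold) by label.
        by_label = {}
        for i, (x, y) in enumerate(zip(features, labels)):
            if i not in test:
                by_label.setdefault(y, []).append(x)
        centroids = {y: centroid(rows) for y, rows in by_label.items()}
        # Score the stand-in classifier on the held-out fold only.
        correct = sum(
            nearest_centroid_predict(centroids, features[i]) == labels[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def make_toy_data(n_per_class=50, seed=1):
    """Synthetic stand-in data: two Gaussian clusters, NOT real residue features."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, center in (("catalytic", 1.0), ("non_catalytic", -1.0)):
        for _ in range(n_per_class):
            features.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
            labels.append(label)
    return features, labels


if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"mean 5-fold accuracy: {cross_validate(X, y, k=5):.3f}")
```

Because every sample is held out exactly once, the averaged accuracy reflects performance on data the classifier did not see during training, which is the property that makes a cross-validated figure more informative than training accuracy.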