A novel methodology for characterizing and predicting protein functional sites
- Authors:
- Resource type:
- Publication date:
- 2008
- Institution:
- Universidad del Rosario
- Repository:
- Repositorio EdocUR - U. Rosario
- Language:
- eng
- OAI Identifier:
- oai:repository.urosario.edu.co:10336/28870
- Online access:
- https://doi.org/10.1109/BIBM.2007.36
https://repository.urosario.edu.co/handle/10336/28870
- Keywords:
- Functional genomics
Protein functional sites
Feature extraction
Clustering
Classification
Metal-binding sites
- Rights
- License
- Restricted (access limited to specific groups)
Summary: | Since there is a strong need for computational methods to predict and characterize functional sites for initial annotations of protein structures, a new methodology that relies on descriptions of the functional sites based on local properties is proposed in this paper. This new approach is independent of conserved residues and conserved residue geometry and takes advantage of the large number of protein structures available to construct models using a machine learning approach. In particular, the proposed method performed feature extraction, clustering, and classification on a protein structure data set, and it was validated on metal-binding sites (Ca2+, Zn2+, Na+, K+, Mg2+, Mn2+, Cu2+, Fe3+, Hg2+, Cl-) present in a non-redundant PDB (a total of 11,959 metal-binding sites in 3,609 proteins). Feature extraction provided a description of critical features for each metal-binding site, which were consistent with prior knowledge about them. Furthermore, the descriptors thus obtained could provide new insights into metal-binding site microenvironments. Classification results evaluated with k-fold cross-validation showed accuracy above 90%. Complete proteins were scanned using these classifiers to locate metal-binding sites. Keywords: Functional Genomics, Protein functional sites, Feature Extraction, Clustering, Classification, Metal-binding sites. |
---|
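
The summary reports classification accuracy estimated with k-fold cross-validation. As a rough illustration of that evaluation scheme only, the sketch below runs 5-fold cross-validation on synthetic data. Everything in it is an assumption made for illustration: the record does not say which classifier, features, or value of k the authors used, so the nearest-centroid classifier, the two-dimensional made-up feature vectors, the `site` / `non-site` labels, and all function names are hypothetical stand-ins.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_nearest_centroid(X, y):
    """Return one centroid per class label (hypothetical stand-in classifier)."""
    by_label = {}
    for features, label in zip(X, y):
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(model, key=lambda label: dist2(model[label]))

def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        model = fit_nearest_centroid([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

# Synthetic, made-up data: two well-separated classes in a 2-D feature space.
# These are NOT the paper's descriptors or its metal-binding site data set.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)] + \
    [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]
y = ["site"] * 50 + ["non-site"] * 50

accuracy = cross_val_accuracy(X, y, k=5)
print(f"mean 5-fold accuracy: {accuracy:.3f}")
```

Each sample is held out exactly once, so the reported figure is an average over predictions made on data the model did not see during fitting. With real site descriptors the folds would normally also be stratified by class, which this sketch omits for brevity.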