Multi-class Superfamily Prediction Using 3D Models Enriched with Physicochemical Properties.

Authors:
Bedoya Leiva, Oscar Fernando
Tischer, Irene
Resource type:
Journal article
Publication date:
2016
Institution:
Universidad del Valle
Repository:
Repositorio Digital Univalle
Language:
eng
OAI Identifier:
oai:bibliotecadigital.univalle.edu.co:10893/18334
Online access:
https://hdl.handle.net/10893/18334
Keywords:
Superfamily prediction
Physicochemical properties
Binary classifiers
SCOP superfamily
Enriched 3D models
Rights
closedAccess
License
http://purl.org/coar/access_right/c_14cb
Description
Summary: (Eng) This paper presents two new methods for the multi-class superfamily prediction problem, in which each amino acid sequence must be classified into one of the known structural classes (i.e., superfamilies). Most strategies proposed for predicting superfamilies rely on binary classifiers that detect remote homologs; the remote homology detection problem consists of finding a classifier that separates remote homologs from non-remote homologs. Current methods for multi-class superfamily recognition take the outputs (i.e., the scores) of the binary classifier for each SCOP superfamily in the data set and build a classification model (i.e., a multi-class classifier) from them. Unlike current methods, which represent a protein by its amino acid composition, this research represents a protein by the number of times that 3D models enriched with physicochemical properties occur in both its predicted contact map and its interaction matrix. We hypothesize that including both 3D information and physicochemical properties might affect the accuracy of superfamily prediction. The two strategies, the single-MCS and the hierarchical-MCS methods, reach accuracies of 74% and 76%, respectively, on the SCOP 1.53 data set. Tests on the SCOP 1.55 and SCOP 1.61 data sets are also presented.
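
The score-stacking idea described in the summary (one binary classifier score per superfamily, combined by a multi-class model) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the superfamily labels `sf_a` and `sf_b`, the two-element count vectors, the linear toy scorers, and the nearest-centroid second stage are hypothetical stand-ins and are not the single-MCS or hierarchical-MCS methods, whose details the summary does not give.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A binary scorer maps a protein's feature vector to one real-valued score.
ScoreFn = Callable[[Sequence[float]], float]


def score_vector(features: Sequence[float],
                 binary_scorers: Dict[str, ScoreFn]) -> List[float]:
    """Stack one binary score per superfamily, in sorted label order."""
    return [binary_scorers[name](features) for name in sorted(binary_scorers)]


def fit_centroids(vectors: List[List[float]],
                  labels: List[str]) -> Dict[str, List[float]]:
    """Average the score vectors of each class (assumed second-stage model)."""
    counts = Counter(labels)
    sums: Dict[str, List[float]] = {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict(vec: Sequence[float], centroids: Dict[str, List[float]]) -> str:
    """Return the label whose centroid is closest in score space."""
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(sorted(centroids), key=lambda lab: sqdist(vec, centroids[lab]))


# Hypothetical scorers over a toy 2-element count vector. In the paper's
# setting the counts would be occurrences of enriched 3D models; the values
# below are invented purely for illustration.
scorers: Dict[str, ScoreFn] = {
    "sf_a": lambda f: f[0] - f[1],
    "sf_b": lambda f: f[1] - f[0],
}

train_features = [[3, 0], [4, 1], [0, 3], [1, 5]]
train_labels = ["sf_a", "sf_a", "sf_b", "sf_b"]

train_scores = [score_vector(f, scorers) for f in train_features]
centroids = fit_centroids(train_scores, train_labels)


def classify(features: Sequence[float]) -> str:
    """Full pipeline: binary scores first, then the multi-class decision."""
    return predict(score_vector(features, scorers), centroids)
```

Calling `classify([5, 1])` runs both stages on a new count vector. Taking the arg-max of the stacked scores would be a simpler alternative to the second-stage model; the sketch uses a learned second stage only because the summary speaks of building a classification model from the scores.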