Machine learning-based cancer classification using gene expression data

Authors:
Martínez Logreira, Julián Alexander
Resource type:
Publication date:
2020
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/50946
Online access:
http://hdl.handle.net/1992/50946
Keywords:
Cancer
Genomics
Computer simulation
Machine learning (Artificial intelligence)
Engineering
Rights
openAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
Description
Summary: In this Master's thesis, we explore several machine learning and deep learning algorithms for classifying different types of cancer based on the gene expression profile of each sample. We use expression profiles of both cancer tissue and normal tissue to train the predictive models. The abnormal tissue samples were obtained from The Cancer Genome Atlas (TCGA) and paired with control (normal) tissue samples from the Genotype-Tissue Expression (GTEx) project; both are public databases. We implemented ensembles of classic machine learning algorithms, which reached an accuracy of up to approximately 16%. We also implemented a graph convolutional network (GCN), which reached a top accuracy of approximately 52%. These results suggest the potential of graph-based algorithms to find underlying patterns in weakly structured data.