Multi-GPU distribution of single-batch, time-dependent linear products

Modern approaches to distributed deep learning focus on using more GPU nodes to process more data in parallel, updating the model weights with a distributed gradient update rule across all nodes. The main limitation of this paradigm is that it assumes that at least one sample of data fits in a single node. However, that assumption does not hold when dealing with large inputs, or when the GPU infrastructure does not have enough memory. In this paper, we propose a new operator-level distribution approach tailored to these cases, in which we distribute a single input of data across multiple GPU nodes, taking into account the operators involved in a given model. By distributing the original input, we reduce the space complexity of each node, enabling multiple GPUs to process inputs that could not fit in a single node. We validate our approach by distributing dot-product attention, a fundamental operation in modern sequence-to-sequence architectures.

Full description

Authors:
Margffoy Tuay, Edgar Andrés
Resource type:
Publication date:
2020
Institution:
Universidad de los Andes
Repository:
Séneca: repositorio Uniandes
Language:
eng
OAI Identifier:
oai:repositorio.uniandes.edu.co:1992/48619
Online access:
http://hdl.handle.net/1992/48619
Keywords:
Graphics processing units
Machine learning (Artificial intelligence)
Engineering
Rights
openAccess
License
http://creativecommons.org/licenses/by-nc-nd/4.0/
id UNIANDES2_29ad920a3f74d937c0b6d1afdc81b55c
oai_identifier_str oai:repositorio.uniandes.edu.co:1992/48619
network_acronym_str UNIANDES2
network_name_str Séneca: repositorio Uniandes
repository_id_str
dc.title.es_CO.fl_str_mv Multi-GPU distribution of single-batch, time-dependent linear products
title Multi-GPU distribution of single-batch, time-dependent linear products
spellingShingle Multi-GPU distribution of single-batch, time-dependent linear products
Unidades de procesamiento gráfico
Aprendizaje automático (Inteligencia artificial)
Ingeniería
title_short Multi-GPU distribution of single-batch, time-dependent linear products
title_full Multi-GPU distribution of single-batch, time-dependent linear products
title_fullStr Multi-GPU distribution of single-batch, time-dependent linear products
title_full_unstemmed Multi-GPU distribution of single-batch, time-dependent linear products
title_sort Multi-GPU distribution of single-batch, time-dependent linear products
dc.creator.fl_str_mv Margffoy Tuay, Edgar Andrés
dc.contributor.advisor.none.fl_str_mv Cardozo Álvarez, Nicolás
Arbeláez Escalante, Pablo Andrés
dc.contributor.author.none.fl_str_mv Margffoy Tuay, Edgar Andrés
dc.contributor.jury.none.fl_str_mv Castro Barrera, Harold Enrique
dc.subject.armarc.es_CO.fl_str_mv Unidades de procesamiento gráfico
Aprendizaje automático (Inteligencia artificial)
topic Unidades de procesamiento gráfico
Aprendizaje automático (Inteligencia artificial)
Ingeniería
dc.subject.themes.none.fl_str_mv Ingeniería
description Modern approaches to distributed deep learning focus on using more GPU nodes to process more data in parallel, updating the model weights with a distributed gradient update rule across all nodes. The main limitation of this paradigm is that it assumes that at least one sample of data fits in a single node. However, that assumption does not hold when dealing with large inputs, or when the GPU infrastructure does not have enough memory. In this paper, we propose a new operator-level distribution approach tailored to these cases, in which we distribute a single input of data across multiple GPU nodes, taking into account the operators involved in a given model. By distributing the original input, we reduce the space complexity of each node, enabling multiple GPUs to process inputs that could not fit in a single node. We validate our approach by distributing dot-product attention, a fundamental operation in modern sequence-to-sequence architectures.
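The abstract's key idea — sharding a single input across GPU nodes so that each node holds only a slice of the attention computation — can be sketched in NumPy. This is our own illustration under stated assumptions, not the thesis's actual multi-GPU implementation: device placement is simulated by plain array shards, and K and V are assumed to be broadcast to every node while Q is partitioned along the sequence axis.

```python
import numpy as np

def attention(q, k, v):
    # Standard scaled dot-product attention on one "device":
    # softmax(Q K^T / sqrt(d)) V, with a numerically stable softmax.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def sharded_attention(q, k, v, n_devices):
    # Partition the query sequence into per-device shards; each simulated
    # device sees only its slice of Q (plus the broadcast K and V), so no
    # single node ever materializes the full T x T score matrix.
    shards = np.array_split(q, n_devices, axis=0)
    outputs = [attention(shard, k, v) for shard in shards]
    # Gather the per-device outputs back in sequence order.
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
T, d = 32, 8
q = rng.normal(size=(T, d))
k = rng.normal(size=(T, d))
v = rng.normal(size=(T, d))

full = attention(q, k, v)
dist = sharded_attention(q, k, v, n_devices=4)
```

The decomposition is exact because each output row of dot-product attention depends only on its own query row: splitting Q across nodes changes nothing numerically, while the per-node score matrix shrinks from T x T to (T/n) x T, which is the space-complexity reduction the abstract describes.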
publishDate 2020
dc.date.issued.es_CO.fl_str_mv 2020
dc.date.accessioned.none.fl_str_mv 2021-02-18T12:25:02Z
dc.date.available.none.fl_str_mv 2021-02-18T12:25:02Z
dc.type.spa.fl_str_mv Trabajo de grado - Maestría
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/masterThesis
dc.type.content.spa.fl_str_mv Text
dc.type.redcol.spa.fl_str_mv http://purl.org/redcol/resource_type/TM
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/1992/48619
dc.identifier.pdf.none.fl_str_mv u833168.pdf
dc.identifier.instname.spa.fl_str_mv instname:Universidad de los Andes
dc.identifier.reponame.spa.fl_str_mv reponame:Repositorio Institucional Séneca
dc.identifier.repourl.spa.fl_str_mv repourl:https://repositorio.uniandes.edu.co/
url http://hdl.handle.net/1992/48619
identifier_str_mv u833168.pdf
instname:Universidad de los Andes
reponame:Repositorio Institucional Séneca
repourl:https://repositorio.uniandes.edu.co/
dc.language.iso.es_CO.fl_str_mv eng
language eng
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.coar.spa.fl_str_mv http://purl.org/coar/access_right/c_abf2
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/4.0/
http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv openAccess
dc.format.extent.es_CO.fl_str_mv 37 hojas
dc.format.mimetype.es_CO.fl_str_mv application/pdf
dc.publisher.es_CO.fl_str_mv Universidad de los Andes
dc.publisher.program.es_CO.fl_str_mv Maestría en Ingeniería de Sistemas y Computación
dc.publisher.faculty.es_CO.fl_str_mv Facultad de Ingeniería
dc.publisher.department.es_CO.fl_str_mv Departamento de Ingeniería de Sistemas y Computación
dc.source.es_CO.fl_str_mv instname:Universidad de los Andes
reponame:Repositorio Institucional Séneca
instname_str Universidad de los Andes
institution Universidad de los Andes
reponame_str Repositorio Institucional Séneca
collection Repositorio Institucional Séneca
bitstream.url.fl_str_mv https://repositorio.uniandes.edu.co/bitstreams/95d81d46-243e-4677-86ee-a6b26c183b51/download
https://repositorio.uniandes.edu.co/bitstreams/5bd198a7-da98-4804-bcec-52b769a81b26/download
https://repositorio.uniandes.edu.co/bitstreams/fe2716db-e621-42ce-9592-1edb7e03914f/download
bitstream.checksum.fl_str_mv 0239cf97b9480dc135dc8d67b5904141
1f0ea1b7612df11a55559afc830f6936
5f43072fddafb6fd4992ce4eb3feccd6
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio institucional Séneca
repository.mail.fl_str_mv adminrepositorio@uniandes.edu.co
_version_ 1812134064902111232
spelling
Usage notice (translated from Spanish): By consulting and using this resource, you accept the terms of use established by the authors. http://creativecommons.org/licenses/by-nc-nd/4.0/
Resumen (translated from Spanish): Traditional approaches to distributed deep-learning training assume that at least one input instance fits in the memory of a single CPU/GPU node. However, they fail when the input does not fit in memory, whether because of the size of the model or of the input itself. In this work, we propose a new approach for distributing deep-learning models based on operator distribution, which consists of partitioning the input and distributing it across multiple GPUs, taking into account the operators involved. The proposed paradigm enables the training of models that face space constraints. We validate the proposal by distributing the linear products involved in dot-product attention, a fundamental operation in modern sequence-to-sequence architectures.
Degree: Magíster en Ingeniería de Sistemas y Computación (Maestría)
Advisor profiles: Cardozo Álvarez, Nicolás — https://scholar.google.es/citations?user=3iTzjQsAAAAJ, ORCID 0000-0002-1094-9952; Arbeláez Escalante, Pablo Andrés — https://scholar.google.es/citations?user=k0nZO90AAAAJ, ORCID 0000-0001-5244-2407, CvLAC https://scienti.minciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0001579086
Files: ORIGINAL u833168.pdf (application/pdf, 2364134 bytes); TEXT u833168.pdf.txt (extracted text, text/plain, 85797 bytes); THUMBNAIL u833168.pdf.jpg (image/jpeg, 7231 bytes)
Record last updated: 2024-03-13 15:48:03