Implementation of a dna compression algorithm using dataflow computing

ABSTRACT: The volume of DNA sequence databases has grown considerably in recent years, and the space required to store the sequences is growing faster than the space available to store them. This raises the cost of storing DNA sequences and also the read sequences, which are fragments of the...

Authors:
Caro Serna, Rubén David
Resource type:
Undergraduate thesis
Publication date:
2018
Institution:
Universidad de Antioquia
Repository:
Repositorio UdeA
Language:
eng
OAI Identifier:
oai:bibliotecadigital.udea.edu.co:10495/13217
Online access:
http://hdl.handle.net/10495/13217
Keywords:
Compression
Dataflow Engine (DFE)
FPGA
Maxeler
Rights
openAccess
License
Attribution-NonCommercial-NoDerivs 2.5 Colombia
id UDEA2_5a04e5e7867ed284ec6c0678e2acec9a
oai_identifier_str oai:bibliotecadigital.udea.edu.co:10495/13217
network_acronym_str UDEA2
network_name_str Repositorio UdeA
repository_id_str
dc.title.spa.fl_str_mv Implementation of a dna compression algorithm using dataflow computing
title Implementation of a dna compression algorithm using dataflow computing
spellingShingle Implementation of a dna compression algorithm using dataflow computing
Compression
Dataflow Engine (DFE)
FPGA
Maxeler
title_short Implementation of a dna compression algorithm using dataflow computing
title_full Implementation of a dna compression algorithm using dataflow computing
title_fullStr Implementation of a dna compression algorithm using dataflow computing
title_full_unstemmed Implementation of a dna compression algorithm using dataflow computing
title_sort Implementation of a dna compression algorithm using dataflow computing
dc.creator.fl_str_mv Caro Serna, Rubén David
dc.contributor.advisor.none.fl_str_mv Isaza Ramírez, Sebastián
dc.contributor.author.none.fl_str_mv Caro Serna, Rubén David
dc.subject.es_ES.fl_str_mv Compression
Dataflow Engine (DFE)
FPGA
Maxeler
topic Compression
Dataflow Engine (DFE)
FPGA
Maxeler
description ABSTRACT: The volume of DNA sequence databases has grown considerably in recent years, and the space required to store the sequences is growing faster than the space available to store them. This raises the cost of storing DNA sequences and also the read sequences, which are fragments of the whole sequence. This situation has led to the use of compression algorithms for storing DNA files. The main objective of the project is to increase the efficiency of DNA sequence compression, because the process is computationally demanding. An FPGA with a dataflow architecture was used to develop the project, with the aim of exploiting the parallelism available in the chosen algorithm. The compression method was developed to process sequence reads with a fixed number of mutations per read, and it was tested with 4, 8, 12 and 16 mutations per read using an architecture that allows up to 160 reads to be processed in a single tick. Experimental results showed that, even with a small number of processing units, performance increases considerably with the DFE architecture; the only disadvantage is the store/read time. Keywords: Compression, Dataflow Engine (DFE), FPGA, CPU, DNA, Maxeler
publishDate 2018
dc.date.issued.none.fl_str_mv 2018
dc.date.accessioned.none.fl_str_mv 2020-01-14T23:29:00Z
dc.date.available.none.fl_str_mv 2020-01-14T23:29:00Z
dc.type.spa.fl_str_mv info:eu-repo/semantics/bachelorThesis
dc.type.coarversion.fl_str_mv http://purl.org/coar/version/c_b1a7d7d4d402bcce
dc.type.hasversion.spa.fl_str_mv info:eu-repo/semantics/draft
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_7a1f
dc.type.redcol.spa.fl_str_mv https://purl.org/redcol/resource_type/TP
dc.type.local.spa.fl_str_mv Tesis/Trabajo de grado - Monografía - Pregrado
format http://purl.org/coar/resource_type/c_7a1f
status_str draft
dc.identifier.citation.spa.fl_str_mv Caro Serna, R. D. (2018). Implementation of a dna compression algorithm using dataflow computing (Trabajo de grado de pregrado). Universidad de Antioquia. Medellín, Colombia.
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/10495/13217
identifier_str_mv Caro Serna, R. D. (2018). Implementation of a dna compression algorithm using dataflow computing (Trabajo de grado de pregrado). Universidad de Antioquia. Medellín, Colombia.
url http://hdl.handle.net/10495/13217
dc.language.iso.spa.fl_str_mv eng
language eng
dc.rights.*.fl_str_mv Atribución-NoComercial-SinDerivadas 2.5 Colombia
dc.rights.spa.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.rights.accessrights.spa.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.rights.creativecommons.spa.fl_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0/
rights_invalid_str_mv Atribución-NoComercial-SinDerivadas 2.5 Colombia
http://creativecommons.org/licenses/by-nc-nd/2.5/co/
http://purl.org/coar/access_right/c_abf2
https://creativecommons.org/licenses/by-nc-nd/4.0/
eu_rights_str_mv openAccess
dc.format.extent.spa.fl_str_mv 38
dc.format.mimetype.spa.fl_str_mv application/pdf
dc.publisher.place.spa.fl_str_mv Medellín, Colombia
institution Universidad de Antioquia
bitstream.url.fl_str_mv http://bibliotecadigital.udea.edu.co/bitstream/10495/13217/1/Rub%c3%a9nCaro_2018_PIE12626.pdf
http://bibliotecadigital.udea.edu.co/bitstream/10495/13217/2/license_url
http://bibliotecadigital.udea.edu.co/bitstream/10495/13217/3/license_text
http://bibliotecadigital.udea.edu.co/bitstream/10495/13217/4/license_rdf
http://bibliotecadigital.udea.edu.co/bitstream/10495/13217/5/license.txt
bitstream.checksum.fl_str_mv b0e60b6a87069c19df9263125c5e0124
4afdbb8c545fd630ea7db775da747b2f
d41d8cd98f00b204e9800998ecf8427e
d41d8cd98f00b204e9800998ecf8427e
8a4605be74aa9ea9d79846c1fba20a33
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional Universidad de Antioquia
repository.mail.fl_str_mv andres.perez@udea.edu.co
_version_ 1812173119001985024
spelling Isaza Ramírez, Sebastián; Caro Serna, Rubén David; Ingeniero Electrónico; Pregrado; Facultad de Ingeniería. Carrera de Ingeniería Electrónica; Universidad de Antioquia; RubénCaro_2018_PIE12626.pdf; 2021-06-19 19:19:54.273