Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas

Digital

Authors: Castaño Londoño, Luis Fernando

Resource type: Doctoral thesis

Publication date: 2021

Institution: Universidad Nacional de Colombia

Repository: Universidad Nacional de Colombia

Language: spa

id	UNACIONAL2_036234c1e3cdeae12752707caad11ae7
oai_identifier_str	oai:repositorio.unal.edu.co:unal/80466
network_acronym_str	UNACIONAL2
network_name_str	Universidad Nacional de Colombia
repository_id_str
dc.title.spa.fl_str_mv	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
dc.title.translated.eng.fl_str_mv	Algorithm optimization for scientific computing on heterogeneous architectures
title	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
spellingShingle	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas ; 000 - Ciencias de la computación, información y obras generales ; Computación heterogénea ; FPGA ; Computación con esténcil ; Ecuación de calor ; Ecuación de Laplace ; Síntesis de alto nivel ; Programación informática
title_short	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
title_full	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
title_fullStr	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
title_full_unstemmed	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
title_sort	Optimización de algoritmos para computación científica sobre arquitecturas heterogéneas
dc.creator.fl_str_mv	Castaño Londoño, Luis Fernando
dc.contributor.advisor.none.fl_str_mv	Osorio Londoño, Gustavo Adolfo
dc.contributor.author.none.fl_str_mv	Castaño Londoño, Luis Fernando
dc.contributor.researchgroup.spa.fl_str_mv	Percepción y Control Inteligente (PCI)
dc.subject.ddc.spa.fl_str_mv	000 - Ciencias de la computación, información y obras generales
topic	000 - Ciencias de la computación, información y obras generales ; Computación heterogénea ; FPGA ; Computación con esténcil ; Ecuación de calor ; Ecuación de Laplace ; Síntesis de alto nivel ; Programación informática
dc.subject.proposal.spa.fl_str_mv	Computación heterogénea ; FPGA ; Computación con esténcil ; Ecuación de calor ; Ecuación de Laplace ; Síntesis de alto nivel
dc.subject.unesco.none.fl_str_mv	Programación informática
description	Digital
publishDate	2021
dc.date.accessioned.none.fl_str_mv	2021-10-08T22:38:18Z
dc.date.available.none.fl_str_mv	2021-10-08T22:38:18Z
dc.date.issued.none.fl_str_mv	2021-09
dc.type.spa.fl_str_mv	Trabajo de grado - Doctorado
dc.type.driver.spa.fl_str_mv	info:eu-repo/semantics/doctoralThesis
dc.type.version.spa.fl_str_mv	info:eu-repo/semantics/acceptedVersion
dc.type.coar.spa.fl_str_mv	http://purl.org/coar/resource_type/c_db06
dc.type.content.spa.fl_str_mv	Text
format	http://purl.org/coar/resource_type/c_db06
status_str	acceptedVersion
dc.identifier.uri.none.fl_str_mv	https://repositorio.unal.edu.co/handle/unal/80466
dc.identifier.instname.spa.fl_str_mv	Universidad Nacional de Colombia
dc.identifier.reponame.spa.fl_str_mv	Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl.spa.fl_str_mv	https://repositorio.unal.edu.co/
url	https://repositorio.unal.edu.co/handle/unal/80466 https://repositorio.unal.edu.co/
identifier_str_mv	Universidad Nacional de Colombia Repositorio Institucional Universidad Nacional de Colombia
dc.language.iso.spa.fl_str_mv	spa
language	spa
dc.rights.coar.fl_str_mv	http://purl.org/coar/access_right/c_abf2
dc.rights.license.spa.fl_str_mv	Reconocimiento 4.0 Internacional
dc.rights.uri.spa.fl_str_mv	http://creativecommons.org/licenses/by/4.0/
dc.rights.accessrights.spa.fl_str_mv	info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Reconocimiento 4.0 Internacional http://creativecommons.org/licenses/by/4.0/ http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv	openAccess
dc.format.extent.spa.fl_str_mv	ix, 146 páginas
dc.format.mimetype.spa.fl_str_mv	application/pdf
dc.publisher.spa.fl_str_mv	Universidad Nacional de Colombia
dc.publisher.program.spa.fl_str_mv	Manizales - Ingeniería y Arquitectura - Doctorado en Ingeniería - Automática
dc.publisher.department.spa.fl_str_mv	Departamento de Ingeniería Eléctrica y Electrónica
dc.publisher.faculty.spa.fl_str_mv	Facultad de Ingeniería y Arquitectura
dc.publisher.place.spa.fl_str_mv	Manizales, Colombia
dc.publisher.branch.spa.fl_str_mv	Universidad Nacional de Colombia - Sede Manizales
institution	Universidad Nacional de Colombia
bitstream.url.fl_str_mv	https://repositorio.unal.edu.co/bitstream/unal/80466/1/license.txt https://repositorio.unal.edu.co/bitstream/unal/80466/2/75100830.2021.pdf https://repositorio.unal.edu.co/bitstream/unal/80466/3/75100830.2021.pdf.jpg
bitstream.checksum.fl_str_mv	cccfe52f796b7c63423298c2d3365fc6 a91a7e389b8d4dff30fecbebc6f3d79f f9756dd008f296230f05160c0eeb928a
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5
repository.name.fl_str_mv	Repositorio Institucional Universidad Nacional de Colombia
repository.mail.fl_str_mv	repositorio_nal@unal.edu.co
_version_	1814089542691454976
spelling	Stencil computation is a scheme widely used in scientific computing. It is the core kernel of algorithms for linear algebra, partial differential equations (PDEs), and image processing. However, the performance of stencil-based algorithms is limited by the large gap between peak processing throughput and peak memory bandwidth in multicore systems and graphics processing units (GPUs). For this reason, methods for optimizing them have attracted great interest. Some methods focus on optimizing memory use, and various works have developed them for CPU-based systems and heterogeneous architectures. Because performance limitations persist with these optimization methods, some authors have proposed schemes for systems based on field-programmable gate arrays (FPGAs). This doctoral thesis presents two methodologies for optimizing FPGA-based architectures for stencil computation. For some architectures, the design is carried out at the hardware level, based on the Glushkov model, using VHDL. In other cases, hardware/software co-design is carried out using high-level synthesis tools. As case studies, the thesis proposes the implementation and performance evaluation of a stencil-based architecture for approximating the solution of heat propagation problems modeled with the one-dimensional heat equation and the two-dimensional Laplace equation. Transformations of the stencil-based architectures and code are proposed to improve the execution performance of the algorithm relative to a baseline implementation. For the implementation with the high-level synthesis tool, parameters associated with the size of the solution domain and optimization directives are defined in order to determine their effect on performance [...] Doctorado ; Doctor en Ingeniería - Ingeniería Automática ; Diseño Electrónico
En: Proceedings of the Second International Workshop on High-Performance Reconfigurable Computing Technology and Applications IEEE, 2008, p. 1–9Sano, Kentaro: FPGA-based systolic computational-memory array for scalable stencil computations. En: High-Performance Computing Using FPGAs. Springer, 2013, p. 279–303Schmid, Moritz ; Reiche, Oliver ; Schmitt, Christian ; Hannig, Frank ; Teich, J¨urgen: Code generation for high-level synthesis of multiresolution applications on fpgas. En: arXiv preprint arXiv:1408.4721 (2014)Schmitt, Christian ; Schmid, Moritz ; Kuckuk, Sebastian ; K¨ostler, Harald ; Teich, J¨urgen ; Hannig, Frank: Reconfigurable Hardware Generation of Multigrid Solvers with Conjugate Gradient Coarse-Grid Solution. En: Parallel Processing Letters 28 (2018), Nr. 04, p. 1850016Shao, Yakun S. ; Reagen, Brandon ;Wei, Gu-Yeon ; Brooks, David: Aladdin: A pre- RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures. En: ACM SIGARCH Computer Architecture News Vol. 42 IEEE Press, 2014, p. 97–108Shen, Chongfei ; Liu, Hongtao ; Xie, XB ; Luk, Keith D. ; Hu, Yong: Selection of floating-point or fixed-point for adaptive noise canceller in somatosensory evoked potential measurement. En: Engineering in Medicine and Biology Society, 2007. EMBS 2007. 29th Annual International Conference of the IEEE IEEE, 2007, p. 3274–3277del Sozzo, Emanuele ; Baghdadi, Riyadh ; Amarasinghe, Saman ; Santambrogio, Marco D.: A Common Backend for Hardware Acceleration on FPGA. En: Com- puter Design (ICCD), 2017 IEEE International Conference on IEEE, 2017, p. 427–430Strenski, Dave ; Simkins, Jim ; Walke, Richard ; Wittig, Ralph: Evaluating fpgas for floating-point performance. En: High-Performance Reconfigurable Computing Technology and Applications, 2008. HPRCTA 2008. Second International Workshop on IEEE, 2008, p. 1–6Strzodka, R. ; Shaheen, M. ; Pajak, D. ; Seidel, H.: Cache oblivious parallelograms in iterative stencil computations. 
En: Proceedings of the 24th ACM International Conference on Supercomputing ACM, 2010, p. 49–59Strzodka, R. ; Shaheen, M. ; Pajak, D. ; Seidel, H.: Cache Accurate Time Skewing in Iterative Stencil Computations. En: Proceedings of the IEEE International Conference on Parallel Processing IEEE, 2011, p. 571–581Tang, Yuan ; Chowdhury, Rezaul A. ; Kuszmaul, Bradley C. ; Luk, Chi-Keung ; Leiserson, Charles E.: The pochoir stencil compiler. En: Proceedings of the twenty- third annual ACM symposium on Parallelism in algorithms and architectures ACM, 2011, p. 117–128Te Ewe, Chun ; Cheung, Peter Y. ; Constantinides, George A.: Dual fixed-point: An efficient alternative to floating-point computation. En: International Conference on Field Programmable Logic and Applications Springer, 2004, p. 200–208Usui, T. ; Kobayashi, R. ; Kise, K.: A Challenge of Portable and High-Speed FPGA Accelerator. En: Proceedings of the 11th International Symposium on Applied Reconfi- gurable Computing, ARC 2015, 2015, p. 383–392Vera, G A. ; Pattichis, Marios ; Lyke, James: A dynamic dual fixed-point arithmetic architecture for FPGAs. En: International Journal of Reconfigurable Computing 2011 (2011)Waidyasooriya, Hasitha M. ; Endo, Tsukasa ; Hariyama, Masanori ; Ohtera, Yasuo: OpenCL-Based FPGA Accelerator for 3D FDTD with Periodic and Absorbing Boundary Conditions. En: International Journal of Reconfigurable Computing 2017 (2017)Waidyasooriya, Hasitha M. ; Takei, Yasuhiro ; Tatsumi, Shunsuke ; Hariyama, Masanori: OpenCL-based FPGA-platform for stencil computation and its optimization methodology. En: IEEE Transactions on Parallel and Distributed Systems 28 (2017), Nr. 5, p. 1390–1402Wang, Shuo ; Liang, Yun: A comprehensive framework for synthesizing stencil algorithms on FPGAs using OpenCL model. En: Design Automation Conference (DAC), 2017 54th ACM/EDAC/IEEE IEEE, 2017, p. 
1–6Williams, Samuel ;Waterman, Andrew ; Patterson, David: Roofline: an insightful visual performance model for multicore architectures. En: Communications of the ACM 52 (2009), Nr. 4, p. 65–76Yu, Chi W. ; Lamoureux, Julien ; Wilton, Steven J. ; Leong, Philip H. ; Luk, Wayne: The Coarse-Grained/Fine-Grained Logic Interface in FPGAs with Embedded Floating-Point Arithmetic Units. En: International Journal of Reconfigurable Compu- ting 2008 (2008)Zohouri, Hamid R. ; Podobas, Artur ; Matsuoka, Satoshi: Combined spatial and temporal blocking for high-performance stencil computation on FPGAs using OpenCL. En: Proceedings of the 2018 ACM/SIGDA International Symposium on Field- Programmable Gate Arrays ACM, 2018, p. 153–162Beca Estudiante Sobresaliente de Posgrado (2012-2014)Universidad Nacional de ColombiaPúblico generalLICENSElicense.txtlicense.txttext/plain; charset=utf-83964https://repositorio.unal.edu.co/bitstream/unal/80466/1/license.txtcccfe52f796b7c63423298c2d3365fc6MD51ORIGINAL75100830.2021.pdf75100830.2021.pdfTesis de Doctorado en Ingeniería - Linea de Investigación en Automáticaapplication/pdf5833107https://repositorio.unal.edu.co/bitstream/unal/80466/2/75100830.2021.pdfa91a7e389b8d4dff30fecbebc6f3d79fMD52THUMBNAIL75100830.2021.pdf.jpg75100830.2021.pdf.jpgGenerated Thumbnailimage/jpeg4593https://repositorio.unal.edu.co/bitstream/unal/80466/3/75100830.2021.pdf.jpgf9756dd008f296230f05160c0eeb928aMD53unal/80466oai:repositorio.unal.edu.co:unal/804662023-07-29 23:04:05.531Repositorio Institucional Universidad Nacional de 
Colombia
repositorio_nal@unal.edu.co
