Extending Ginkgo to Manage Reconfigurable Hardware-Based Kernels (Expansión de Ginkgo para administrar kernels reconfigurables basados en hardware)

Although heterogeneous systems based on hardware accelerators are a trending topic in the HPC community, the trade-offs of those based on reconfigurable hardware in linear algebra libraries for high-performance systems have not been studied in depth. ...

Authors: Morales Peña, Alejandro
Meneses, Esteban

Resource type: Research article

Publication date: 2024

Institution: Universidad Autónoma de Bucaramanga - UNAB

Repository: Repositorio UNAB

Language: spa

id	UNAB2_b3f1ff13f97d11f009ac8c77344393fd
oai_identifier_str	oai:repository.unab.edu.co:20.500.12749/28300
network_acronym_str	UNAB2
network_name_str	Repositorio UNAB
repository_id_str
dc.title.spa.fl_str_mv	Expansión de Ginkgo para administrar kernels reconfigurables basados en hardware
dc.title.translated.eng.fl_str_mv	Extending Ginkgo to Manage Reconfigurable Hardware-Based Kernels
dc.creator.fl_str_mv	Morales Peña, Alejandro; Meneses, Esteban
dc.contributor.author.none.fl_str_mv	Morales Peña, Alejandro; Meneses, Esteban
dc.contributor.orcid.spa.fl_str_mv	Morales Peña, Alejandro [0009-0004-9508-4266]; Meneses, Esteban [0000-0002-4307-6000]
dc.subject.spa.fl_str_mv	Computación de Alto Rendimiento; Ginkgo; FPGAs; SpMV
topic	Computación de Alto Rendimiento; Ginkgo; FPGAs; SpMV; HPC
dc.subject.keywords.eng.fl_str_mv	HPC; Ginkgo; FPGAs; SpMV
description	Although heterogeneous systems based on hardware accelerators are a trending topic in the HPC community, the trade-offs of those based on reconfigurable hardware in linear algebra libraries for high-performance systems have not been studied in depth. In this research, we therefore aim to take advantage of the reconfigurability, adaptability, and reduced power consumption of FPGAs to generate FPGA-based kernels in Ginkgo, a specialized high-performance linear algebra library for many-core systems. We generated three FPGA-based SpMV kernels, for the CSR, SELLP, and SELL formats, and obtained speedups of at least 10x over the CPU-based kernels. Furthermore, a performance characterization study shows that the FPGAs outperform general-purpose processors in terms of compute time.
publishDate	2024
dc.date.issued.none.fl_str_mv	2024-06-18
dc.date.accessioned.none.fl_str_mv	2025-02-14T13:52:06Z
dc.date.available.none.fl_str_mv	2025-02-14T13:52:06Z
dc.type.coarversion.fl_str_mv	http://purl.org/coar/version/c_970fb48d4fbd8a85
dc.type.driver.none.fl_str_mv	info:eu-repo/semantics/article
dc.type.local.spa.fl_str_mv	Artículo
dc.type.coar.none.fl_str_mv	http://purl.org/coar/resource_type/c_2df8fbb1
dc.type.redcol.none.fl_str_mv	http://purl.org/redcol/resource_type/ART
format	http://purl.org/coar/resource_type/c_2df8fbb1
dc.identifier.issn.spa.fl_str_mv	1657-2831; 2539-2115
dc.identifier.uri.none.fl_str_mv	http://hdl.handle.net/20.500.12749/28300
dc.identifier.instname.spa.fl_str_mv	instname:Universidad Autónoma de Bucaramanga UNAB
dc.identifier.repourl.spa.fl_str_mv	repourl:https://repository.unab.edu.co
dc.identifier.doi.none.fl_str_mv	https://doi.org/10.29375/25392115.5276
identifier_str_mv	1657-2831; 2539-2115; instname:Universidad Autónoma de Bucaramanga UNAB; repourl:https://repository.unab.edu.co
url	http://hdl.handle.net/20.500.12749/28300; https://doi.org/10.29375/25392115.5276
dc.language.iso.spa.fl_str_mv	spa
language	spa
dc.relation.spa.fl_str_mv	https://revistas.unab.edu.co/index.php/rcc/article/view/5276/4086
dc.relation.uri.spa.fl_str_mv	https://revistas.unab.edu.co/index.php/rcc/issue/view/303
dc.rights.coar.fl_str_mv	http://purl.org/coar/access_right/c_abf2
rights_invalid_str_mv	http://purl.org/coar/access_right/c_abf2
dc.format.mimetype.spa.fl_str_mv	application/pdf
dc.publisher.spa.fl_str_mv	Universidad Autónoma de Bucaramanga UNAB
dc.source.spa.fl_str_mv	Vol. 25 No. 2 (2024): Revista Colombiana de Computación (July-December); 43-58
institution	Universidad Autónoma de Bucaramanga - UNAB
bitstream.url.fl_str_mv	https://repository.unab.edu.co/bitstream/20.500.12749/28300/1/Articulo%205.pdf; https://repository.unab.edu.co/bitstream/20.500.12749/28300/2/license.txt; https://repository.unab.edu.co/bitstream/20.500.12749/28300/3/Articulo%205.pdf.jpg
bitstream.checksum.fl_str_mv	672c5de0081db2f9cdca0fcaf82999cd; 855f7d18ea80f5df821f7004dff2f316; 1968381a0286a1160700d26c6e3df0d3
bitstream.checksumAlgorithm.fl_str_mv	MD5; MD5; MD5
repository.name.fl_str_mv	Repositorio Institucional | Universidad Autónoma de Bucaramanga - UNAB
repository.mail.fl_str_mv	repositorio@unab.edu.co
_version_	1851051732930396160
