Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration
Marcin Pietroń ; Paweł Russek ; Kazimierz Wiatr
International Journal of Applied Mathematics and Computer Science, Volume 20 (2010), pp. 581-589 / Harvested from The Polish Digital Mathematics Library

This paper presents research on FPGA-based acceleration of HPC applications. The most important goal is to extract the code that can be sped up. A major drawback is the lack of a tool that could do this. HPC applications usually consist of a huge amount of complex source code, which is one reason why the acceleration process should be as automated as possible. Another reason is to make use of HLLs (High Level Languages) such as Mitrion-C (Mohl, 2006). HLLs were invented to make the development of HPRC applications faster. Loop profiling is one of the steps in checking whether inserting HLL code into existing HPC source code can accelerate these applications. Hence, the most important step towards acceleration is to extract the most time-consuming code and its data dependencies, which makes the code easier to pipeline and parallelize. Data dependency information also shows how to implement algorithms in an FPGA circuit with minimal initialization of the circuit during their execution.

Published: 2010-01-01
EUDML-ID: urn:eudml:doc:208009
@article{bwmeta1.element.bwnjournal-article-amcv20i3p581bwm,
     author = {Marcin Pietro\'n and Pawe\l\ Russek and Kazimierz Wiatr},
     title = {Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration},
     journal = {International Journal of Applied Mathematics and Computer Science},
     volume = {20},
     year = {2010},
     pages = {581-589},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv20i3p581bwm}
}
Marcin Pietroń; Paweł Russek; Kazimierz Wiatr. Loop profiling tool for HPC code inspection as an efficient method of FPGA based acceleration. International Journal of Applied Mathematics and Computer Science, Volume 20 (2010), pp. 581-589. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv20i3p581bwm/

[000] Bennett, D., Dellinger, E., Mason, J. and Sundarajan, P. (2006). An FPGA-oriented target language for HLL compilation, Reconfigurable Systems Summer Institute, RSSI 2006, Urbana, IL, USA.

[001] Deng, L., Kim, J.S., Mangalagiri, P., Irick, K., Sobti, K., Kandemir, M., Narayanan, V., Chakrabarti, Ch., Pitsianis, N. and Sun, X. (2009). An automated framework for accelerating numerical algorithms on reconfigurable platform using algorithmic/architectural optimization, IEEE Transactions on Computers 58(12): 1654-1667.

[002] Gasper, P., Herbst, C., McCough, J., Rickett, C. and Stubbendieck, G. (2003). Automatic parallelization of sequential C code, Midwest Instruction and Computing Symposium, Duluth, MN, USA.

[003] Gong, W., Wang, G. and Kastner, R. (2004). A high performance application representation for reconfigurable systems, Conference on Engineering of Reconfigurable Systems and Algorithms, ERSA, Las Vegas, NV, USA.

[004] Kindratenko, V., Brunner, R. and Myers, A. (2007). Mitrion-C application development on SGI Altix 350/RC100, International Symposium on Field Programmable Custom Computing Machines, FCCM 2007, pp. 239-250.

[005] Kindratenko, V., Myers, A. and Brunner, R. (2006). Exploring coarse- and fine-grain parallelism on a high-performance reconfigurable computer, 2nd Annual Reconfigurable Systems Summer Institute, RSSI 2006, Napa Valley, CA, USA.

[006] Liu, K., Cameron, Ch. and Sarkady, A. (2008). Using MitrionC to implement floating-point arithmetic on a Cray XD1 supercomputer, DoD HPCMP Users Group Conference, HPCMP-UGC, Urbana, IL, USA, pp. 391-395.

[007] Memik, S.O., Bozorgzadeh, G., Kastner, R. and Sarrafzadeh, M. (2005). A scheduling algorithm for optimization and planning in high-level synthesis, ACM Transactions on Design Automation of Electronic Systems 10(1).

[008] Messmer, P. and Bodenner, R. (2006). Accelerating scientific applications using FPGAs, XCell Journal 10(1): 33-57.

[009] Mohl, S. (2006). The Mitrion-C programming language, Mitrionics Inc., Second Quarter, pp. 70-73, http://www.mitrion.com.

[010] Moseley, T., Grunwald, D., Connors, A., Ramanujam, R., Tovinkere, V. and Peri, R. (2006). LoopProf: Dynamic techniques for loop detection and profiling, Proceedings of the 2006 Workshop on Binary Instrumentation and Applications, WBIA, Lund, Sweden.

[011] Pietroń, M., Wiatr, K. and Russek, P. (2007(a)). Methodology of computing acceleration using reconfigurable logic technology in high performance computing, University of Science and Technology in Cracow Automatica, 2007, pp. 149-156.

[012] Pietroń, M., Russek, P., Wiatr, K., Jamro, E. and Wielgosz, M. (2007(b)). Two electron integrals calculation accelerated with double precision exp() hardware module, Reconfigurable Systems Summer Institute, RSSI, Urbana, IL, USA.

[013] Russek, P. and Wiatr, K. (2006). The prospect of computing acceleration using reconfigurable logic technology in huge computational power systems, Proceedings of the IFAC Workshop on Programmable Devices and Embedded Systems, PDeS 2006, Brno, Czech Republic, pp. 44-49. | Zbl 1173.68523