Data-reuse Optimizations for Pipelined Tiling with Parametric Tile Sizes





Affiliation: COMPSYS - Compilation and embedded computing systems, Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme

Abstract: Loop tiling is a loop transformation widely used to improve spatial and temporal data locality, to increase computation granularity, and to enable blocking algorithms, which are particularly useful when offloading kernels to platforms with small memories. When hardware caches are not available, data transfers and local storage must be software-managed, and some useless external communications can be avoided by exploiting data reuse between tiles. The tile sizes are an important parameter of loop tiling, since they determine the size of the required local memory. However, for most analyses involving several tiles, which is the case for inter-tile data reuse, the tile sizes induce non-linear constraints unless they are numerical constants. This complicates or prevents a parametric analysis with polyhedral optimization techniques. This extended abstract shows that, when tiles are executed in sequence along tile axes, an analysis of inter-tile data reuse that is parametric in the tile sizes is nevertheless possible, i.e., one can determine, at compile time and in a parametric fashion, the copy-in and copy-out data sets for all tiles, with inter-tile reuse, as well as the sizes of the induced local memories. Combined with hierarchical tiling, this result opens new perspectives for the automatic generation, guided by parametric cost models, of blocking algorithms in which blocks can be pipelined and/or can contain parallelism. Previous work on FPGAs and GPUs already demonstrated the value and feasibility of such automation with tiling, but in a non-parametric fashion.
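The notion of copy-in sets with inter-tile reuse can be illustrated on a toy case. The sketch below is not the method of the paper: it assumes a one-dimensional three-point stencil, tiles executed in sequence, and it enumerates the sets explicitly for concrete sizes instead of deriving them parametrically with polyhedral techniques. All function names (`tile_footprint`, `copy_in_sets`, `tiled_stencil`) are hypothetical.

```python
def tile_footprint(lo, hi):
    """Indices of `a` read by iterations lo..hi-1 of b[i] = a[i-1] + a[i] + a[i+1]."""
    return set(range(lo - 1, hi + 1))


def copy_in_sets(n, tile_size):
    """Yield ((lo, hi), copy_in) for tiles executed in sequence along the loop axis."""
    resident = set()  # data left in local memory by the previous tile
    for lo in range(1, n - 1, tile_size):
        hi = min(lo + tile_size, n - 1)
        needed = tile_footprint(lo, hi)
        yield (lo, hi), needed - resident  # load only what is not already local
        resident = needed                  # keep just the current footprint


def tiled_stencil(a, tile_size):
    """Run the stencil tile by tile with a software-managed local memory.

    Returns (b, loads, peak_local): the result, the number of elements
    transferred from external memory, and the largest local-memory occupancy.
    """
    n = len(a)
    b = [0] * n
    local = {}
    loads = 0
    peak_local = 0
    for (lo, hi), copy_in in copy_in_sets(n, tile_size):
        needed = tile_footprint(lo, hi)
        # Evict data the current tile no longer needs.
        local = {k: v for k, v in local.items() if k in needed}
        # Copy-in: transfer only the data not reused from the previous tile.
        for k in copy_in:
            local[k] = a[k]
            loads += 1
        peak_local = max(peak_local, len(local))
        # Compute the tile using local memory only.
        for i in range(lo, hi):
            b[i] = local[i - 1] + local[i] + local[i + 1]
    return b, loads, peak_local
```

In this toy setting, each element of `a` is transferred at most once, whereas reloading the full footprint of every tile would transfer the overlapping boundary elements twice. The local memory holds at most `tile_size + 2` elements, which shows informally how the tile size drives the local memory requirement.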

Keywords: parametric tiling, parallel programming, optimization, GPU, cost models, data-reuse, FPGA, compilers, code transformation, accelerators, code generation, polyhedral analysis, pipelining





Author: Alexandre Isoard

Source: https://hal.archives-ouvertes.fr/






