QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment

1 ICL - Innovative Computing Laboratory, Knoxville
2 HiePACS - High-End Parallel Algorithms for Challenging Numerical Simulations LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
3 LaBRI - Laboratoire Bordelais de Recherche en Informatique
4 LRI - Laboratoire de Recherche en Informatique
5 GRAND-LARGE - Global parallel and distributed computing LRI - Laboratoire de Recherche en Informatique, LIFL - Laboratoire d'Informatique Fondamentale de Lille, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
6 Department of Mathematical and Statistical Sciences

Abstract: To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist of scheduling a Directed Acyclic Graph (DAG) of fine-grained tasks, where nodes represent tasks (either the factorization of a panel or the update of a block-column) and edges represent the dependencies among them. Although past approaches already achieve high performance on moderate and large square matrices, they process each panel sequentially, which limits performance when factorizing tall and skinny matrices or small square matrices. We present a new, fully asynchronous method for computing a QR factorization on shared-memory multicore architectures that overcomes this bottleneck. Our contribution is to adapt an existing algorithm that performs the panel factorization in parallel, named Communication-Avoiding QR (CAQR) and initially designed for distributed-memory machines, to the context of tile algorithms using asynchronous computations. An experimental study shows a significant improvement, running up to almost 10 times faster than state-of-the-art approaches. We aim to eventually incorporate this work into the Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) library.
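The parallel panel factorization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and involves no tiles, DAG scheduling, or asynchrony; it only shows, sequentially and with NumPy, the reduction idea commonly associated with Communication-Avoiding QR on a tall and skinny matrix: factor independent row blocks, stack their R factors, and factor the stack. The function names `tsqr_r` and `normalize_signs` and the parameter `num_blocks` are hypothetical, chosen for this illustration.

```python
import numpy as np


def tsqr_r(a, num_blocks=4):
    """Return the R factor of a tall and skinny matrix via a one-level reduction.

    Illustrative sketch only: the per-block factorizations below are
    independent of each other, so they could run in parallel, but here
    they are executed one after another.
    """
    # Split the rows of the matrix into independent blocks.
    blocks = np.array_split(a, num_blocks, axis=0)
    # Factor each block on its own and keep only its R factor.
    local_rs = [np.linalg.qr(block, mode="r") for block in blocks]
    # Stack the small R factors and factor the stack once more.
    stacked = np.vstack(local_rs)
    return np.linalg.qr(stacked, mode="r")


def normalize_signs(r):
    """Flip row signs so the diagonal is non-negative, making R comparable."""
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return signs[:, None] * r
```

For a full-rank matrix, the R factor is unique up to the signs of its rows, so after `normalize_signs` the result of `tsqr_r(a)` should agree with that of `np.linalg.qr(a, mode="r")` to within rounding error.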

Authors: Emmanuel Agullo, Camille Coti, Jack Dongarra, Thomas Herault, Julien Langou

Source: https://hal.archives-ouvertes.fr/

