60 result(s) for "Grigori, Laura"
Communication-optimal Parallel and Sequential QR and LU Factorizations
We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform and just as stable as Householder QR. We prove optimality by deriving new lower bounds for the number of multiplications done by "non-Strassen-like" QR, and using these in known communication lower bounds that are proportional to the number of multiplications. We not only show that our QR algorithms attain these lower bounds (up to polylogarithmic factors), but that existing LAPACK and ScaLAPACK algorithms perform asymptotically more communication. We derive analogous communication lower bounds for LU factorization and point out recent LU algorithms in the literature that attain at least some of these lower bounds. The sequential and parallel QR algorithms for tall and skinny matrices lead to significant speedups in practice over some of the existing algorithms, including LAPACK and ScaLAPACK, for example, up to 6.7 times over ScaLAPACK. A performance model for the parallel algorithm for general rectangular matrices predicts significant speedups over ScaLAPACK.
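The tall-and-skinny case mentioned above (TSQR) can be sketched in a few lines: factor row blocks independently, then reduce only the small R factors. This is a minimal one-level (flat) reduction, not the binary-tree reduction of the paper; the function name and block count are illustrative choices.

```python
import numpy as np

def tsqr(A, n_blocks=4):
    """Sketch of communication-avoiding QR for a tall-skinny matrix.

    Each row block is factored independently (in parallel on a real
    machine); the small R factors are stacked and reduced with one
    more QR, so only O(n^2) words would move between processors.
    """
    blocks = np.array_split(A, n_blocks, axis=0)
    Qs, Rs = zip(*(np.linalg.qr(B) for B in blocks))
    # Reduce the stacked R factors with a second, small QR.
    Q2, R = np.linalg.qr(np.vstack(Rs))
    # Recover the full Q by applying each slice of Q2 to its local Q.
    Q2_blocks = np.split(Q2, np.cumsum([Ri.shape[0] for Ri in Rs])[:-1], axis=0)
    Q = np.vstack([Qi @ Q2i for Qi, Q2i in zip(Qs, Q2_blocks)])
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 8))
Q, R = tsqr(A)
```

On a distributed machine each `np.linalg.qr` on a block runs on its own processor, and only the n-by-n R factors are communicated.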
Adaptive linear solution process for single-phase Darcy flow
This article presents an adaptive approach for solving linear systems arising from self-adjoint Partial Differential Equations (PDE) problems discretized by cell-centered finite volume method and stemming from single-phase flow simulations. This approach aims at reducing the algebraic error in targeted parts of the domain using a posteriori error estimates. Numerical results of a reservoir simulation example for heterogeneous porous media in two dimensions are discussed. Using the adaptive solve procedure, we obtain a significant gain in terms of the number of time steps and iterations compared to a standard solve.
S-Step BiCGStab Algorithms for Geoscience Dynamic Simulations
In basin and reservoir simulations, the most expensive and time-consuming phase is solving systems of linear equations using Krylov subspace methods such as BiCGStab; this step can account for up to 80% of total simulation time, so the performance of these simulators depends strongly on the efficiency of their linear solvers. Since modern parallel machines provide large numbers of processors and massively parallel compute units, we explore communication-avoiding Krylov subspace methods (s-step BiCGStab), which restructure the algorithm to reduce communication and thereby speed up convergence time on modern architectures. We also introduce variants of s-step BiCGStab with better numerical stability for the targeted systems.
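The kernel that makes s-step methods communication-avoiding is computing a block of Krylov basis vectors in one sweep; the block of inner products needed later then costs a single global reduction per s iterations instead of several per iteration. A minimal sketch of the (monomial) basis construction, with illustrative names:

```python
import numpy as np

def monomial_basis(A, r, s):
    """Matrix-powers kernel used by s-step Krylov methods.

    Builds the s+1 vectors [r, A r, ..., A^s r] in a single sweep
    over A. The monomial basis can become ill-conditioned as s
    grows, which is the kind of instability the stabler s-step
    BiCGStab variants in the paper are designed to mitigate.
    """
    V = np.empty((r.size, s + 1))
    V[:, 0] = r
    for j in range(s):
        V[:, j + 1] = A @ V[:, j]
    return V

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
r = rng.standard_normal(50)
V = monomial_basis(A, r, 4)
```

In a distributed setting the sweep can be arranged so each processor computes its rows of all s+1 vectors with one exchange of ghost values, rather than one exchange per matrix-vector product.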
A Class of Efficient Locally Constructed Preconditioners Based on Coarse Spaces
In this paper we present a class of robust and fully algebraic two-level preconditioners for symmetric positive definite (SPD) matrices. We introduce the notion of an algebraic local symmetric positive semidefinite splitting of an SPD matrix and give a characterization of this splitting. The splitting allows us to construct, algebraically and locally, a class of efficient coarse spaces that bound the spectral condition number of the preconditioned system by a number chosen a priori. We also introduce the tau-filtering subspace, a concept that helps compare the dimension minimality of coarse spaces; some PDE-dependent preconditioners correspond to a special case. The exact algebraic coarse spaces presented in this paper are too expensive to construct in practice, so we propose an inexpensive heuristic approximation. Numerical experiments illustrate the efficiency of the proposed method.
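The general shape of such a two-level preconditioner is local subdomain solves plus a coarse correction. The sketch below uses piecewise-constant coarse vectors as a stand-in for the paper's algebraically constructed coarse space (a loud assumption, for illustration only), on a 1D Laplacian test problem:

```python
import numpy as np

def two_level_precond(A, subdomains, Z):
    """Two-level additive preconditioner: local solves + coarse correction.

    M^{-1} r = sum_i R_i^T (R_i A R_i^T)^{-1} R_i r + Z (Z^T A Z)^{-1} Z^T r
    Columns of Z span the coarse space. Here Z is piecewise constant
    per subdomain, NOT the paper's construction from local SPSD splittings.
    """
    locals_ = [(idx, np.linalg.inv(A[np.ix_(idx, idx)])) for idx in subdomains]
    Ac_inv = np.linalg.inv(Z.T @ A @ Z)          # small coarse operator
    def apply(r):
        z = Z @ (Ac_inv @ (Z.T @ r))             # coarse correction
        for idx, Ai_inv in locals_:
            z[idx] += Ai_inv @ r[idx]            # local subdomain solves
        return z
    return apply

# 1D Dirichlet Laplacian, four non-overlapping subdomains.
n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
subdomains = [np.arange(i, i + 16) for i in range(0, n, 16)]
Z = np.zeros((n, len(subdomains)))
for j, idx in enumerate(subdomains):
    Z[idx, j] = 1.0
M = two_level_precond(A, subdomains, Z)
z = M(np.ones(n))
```

The resulting operator is symmetric, so `M` can be used inside preconditioned CG; the coarse term is what gives the condition-number bound independent of the number of subdomains.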
Numerical algorithms for high-performance computational science
A number of features of today’s high-performance computers make it challenging to exploit these machines fully for computational science. These include increasing core counts but stagnant clock frequencies; the high cost of data movement; use of accelerators (GPUs, FPGAs, coprocessors), making architectures increasingly heterogeneous; and multiple precisions of floating-point arithmetic, including half-precision. Moreover, as well as maximizing speed and accuracy, minimizing energy consumption is an important criterion. New generations of algorithms are needed to tackle these challenges. We discuss some approaches that we can take to develop numerical algorithms for high-performance computational science, with a view to exploiting the next generation of supercomputers. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.
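A classic way to exploit the multiple floating-point precisions mentioned above is mixed-precision iterative refinement: do the expensive O(n³) solve in low precision, then recover full accuracy with cheap O(n²) residual corrections in high precision. A minimal sketch (a real implementation would factor the matrix once and reuse the factors; here we call `solve` repeatedly for brevity):

```python
import numpy as np

def mixed_precision_solve(A, b, iters=3):
    """Iterative refinement: solve in float32, correct in float64.

    The low-precision solve exploits fast reduced-precision units;
    the residual and update in double precision restore accuracy,
    provided A is reasonably well conditioned.
    """
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                              # residual in double
        d = np.linalg.solve(A32, r.astype(np.float32))
        x += d.astype(np.float64)                  # cheap correction
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((200, 200)) + 200 * np.eye(200)  # well conditioned
b = rng.standard_normal(200)
x = mixed_precision_solve(A, b)
```

On hardware with fast half- or single-precision units, the factorization runs several times faster than a full double-precision solve while the refined solution reaches double-precision accuracy.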
Enlarged Krylov Subspace Conjugate Gradient Methods for Reducing Communication
In this paper we introduce a new approach for reducing communication in Krylov subspace methods that consists of enlarging the Krylov subspace by a maximum of $t$ vectors per iteration, based on a domain decomposition of the graph of $A$. The obtained enlarged Krylov subspace $\mathscr{K}_{k,t}(A,r_0)$ is a superset of the Krylov subspace $\mathcal{K}_k(A,r_0)$: $\mathcal{K}_k(A,r_0) \subset \mathscr{K}_{k,t}(A,r_0)$. Thus, we search for the solution of the system $Ax=b$ in $\mathscr{K}_{k,t}(A,r_0)$ instead of $\mathcal{K}_k(A,r_0)$. Moreover, we show that the enlarged Krylov projection subspace methods lead to faster convergence in terms of iterations and to parallelizable algorithms with less communication than classical Krylov methods.
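The enlargement starts by splitting the initial residual into $t$ vectors supported on the $t$ subdomains of the graph partition; these replace the single starting vector of CG. A minimal sketch of that splitting step (names illustrative):

```python
import numpy as np

def split_residual(r0, domains):
    """Split r0 into t vectors supported on the t subdomains.

    The columns sum back to r0, so span{T, A T, ...} contains the
    ordinary Krylov subspace generated by r0; searching in the
    enlarged space converges in fewer iterations, and each iteration
    operates on a block of t vectors, amortizing communication.
    """
    T = np.zeros((r0.size, len(domains)))
    for i, idx in enumerate(domains):
        T[idx, i] = r0[idx]
    return T

r0 = np.arange(8.0)
domains = [np.arange(0, 4), np.arange(4, 8)]
T = split_residual(r0, domains)
```

The subsequent enlarged-CG iteration (block updates, block orthogonalization) builds on this block of vectors in place of a single residual.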
Interpretation of parareal as a two-level additive Schwarz in time preconditioner and its acceleration with GMRES
We describe an interpretation of parareal as a two-level additive Schwarz preconditioner in the time domain. We show that this two-level preconditioner in time is equivalent to parareal and to multigrid reduction in time (MGRIT) with F-relaxation. We also discuss the case when additional fine or coarse propagation steps are applied in the preconditioner. This leads to procedures equivalent to MGRIT with FCF-relaxation and to MGRIT with F(CF)²-relaxation or overlapping parareal. Numerical results show that these variants have faster convergence in some cases. In addition, we also apply a Krylov subspace method, namely GMRES (generalized minimal residual), to accelerate the parareal algorithm. Better convergence is obtained, especially for the advection-reaction-diffusion equation in the case when advection and reaction coefficients are large.
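For reference, plain parareal alternates a cheap sequential coarse propagator G with fine propagators F that can run on all time slices in parallel, via the correction U(k+1, j+1) = F(U(k, j)) + G(U(k+1, j)) - G(U(k, j)). A minimal sketch on the scalar test problem u' = -u (propagator definitions are illustrative; this is the baseline algorithm, not the preconditioned/GMRES-accelerated variants of the paper):

```python
import numpy as np

def parareal(G, F, u0, n_slices, n_iters):
    """Plain parareal iteration over n_slices time slices.

    G: cheap coarse propagator over one slice (applied sequentially).
    F: expensive fine propagator over one slice (parallel-in-time
       in a real implementation; sequential here for simplicity).
    """
    U = [u0]
    for _ in range(n_slices):                 # coarse initial guess
        U.append(G(U[-1]))
    for _ in range(n_iters):
        Fvals = [F(U[j]) for j in range(n_slices)]   # parallelizable
        Unew = [u0]
        for j in range(n_slices):             # sequential correction
            Unew.append(Fvals[j] + G(Unew[-1]) - G(U[j]))
        U = Unew
    return U

# Test problem u' = -u on [0, 1]: G = one Euler step per slice,
# F = 100 Euler substeps per slice.
lam, dt = -1.0, 1.0 / 10
G = lambda u: u * (1 + lam * dt)
def F(u):
    for _ in range(100):
        u = u * (1 + lam * dt / 100)
    return u
U = parareal(G, F, 1.0, n_slices=10, n_iters=5)
```

After k iterations the first k slices are exact with respect to the fine propagator; the paper's contribution is recognizing this iteration as a two-level additive Schwarz preconditioner in time, which opens the door to the MGRIT-style variants and GMRES acceleration described above.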