Search Results
8 results for "Rouet, François-Henry"
On Computing Inverse Entries of a Sparse Matrix in an Out-of-Core Environment
The inverse of an irreducible sparse matrix is structurally full, so it is impractical to compute or store it in its entirety. However, there are several applications where a subset of the entries of the inverse is required. Given a factorization of the sparse matrix held in out-of-core storage, we show how to compute such a subset efficiently, by accessing only parts of the factors. When there are many inverse entries to compute, we need to guarantee that the overall computation scheme has reasonable memory requirements, while minimizing the volume of communication (data transferred) between disk and main memory. This leads to a partitioning problem that we prove is NP-complete. We also show that we cannot get a close approximation to the optimal solution in polynomial time. We thus need to develop heuristic algorithms, and we propose (i) a lower bound on the cost of an optimum solution; (ii) an exact algorithm for a particular case; (iii) two other heuristics for a more general case; and (iv) hypergraph partitioning models for the most general setting. We compare the proposed algorithms and illustrate their performance in practice using the MUMPS software package on a set of real-life problems.
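As a rough illustration of the selected-inversion idea (not the paper's out-of-core algorithm), the sketch below computes a requested subset of inverse entries from a sparse LU factorization with SciPy. The helper `selected_inverse_entries` is hypothetical; requests are simply grouped by column so each needed column of the inverse is solved for once.

```python
# A sketch under simplifying assumptions: dense right-hand sides, in-core
# factors. The paper's contribution -- loading only the needed parts of
# out-of-core factors -- is not reproduced here.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
from collections import defaultdict

def selected_inverse_entries(A, entries):
    """Return {(i, j): (A^-1)[i, j]} for the requested (i, j) pairs."""
    lu = spla.splu(A.tocsc())              # sparse LU factorization of A
    by_col = defaultdict(list)
    for i, j in entries:                   # group requests by column j
        by_col[j].append(i)
    n = A.shape[0]
    out = {}
    for j, rows in by_col.items():
        e_j = np.zeros(n)
        e_j[j] = 1.0
        x = lu.solve(e_j)                  # x = A^-1 e_j: column j of the inverse
        for i in rows:
            out[(i, j)] = x[i]
    return out

# Example: some diagonal entries of the inverse of a 1D Laplacian
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(100, 100), format="csc")
print(selected_inverse_entries(A, [(k, k) for k in (0, 50, 99)]))
```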
Implementation of fixed-nuclei polyatomic MCTDHF capability and the future with nuclear motion
We discuss the implementation (https://commons.lbl.gov/display/csd/LBNL-AMO-MCTDHF) of multiconfiguration time-dependent Hartree-Fock (MCTDHF) for polyatomic molecules using a Cartesian product grid of sinc basis functions, and present absorption cross sections and other results calculated with it.
An efficient basis set representation for calculating electrons in molecules
The method of McCurdy, Baertschy, and Rescigno [J. Phys. B 37, R137 (2004)] is generalized to obtain a straightforward, surprisingly accurate, and scalable numerical representation for calculating the electronic wave functions of molecules. It uses a basis set of product sinc functions arrayed on a Cartesian grid, and yields 1 kcal/mol precision for valence transition energies with a grid resolution of approximately 0.1 bohr. The Coulomb matrix elements are replaced with matrix elements obtained from the kinetic energy operator. A resolution-of-the-identity approximation renders the primitive one- and two-electron matrix elements diagonal; in other words, the Coulomb operator is local with respect to the grid indices. The calculation of contracted two-electron matrix elements among orbitals requires only O(N log N) multiplication operations, not O(N^4), where N is the number of basis functions; N = n^3 on cubic grids. The representation not only is numerically expedient, but also produces energies and properties superior to those calculated variationally. Absolute energies, absorption cross sections, transition energies, and ionization potentials are reported for one- (He^+, H_2^+), two- (H_2, He), ten- (CH_4) and 56-electron (C_8H_8) systems.
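For intuition, here is a minimal one-dimensional sinc-basis (DVR) sketch, not the paper's three-dimensional code: on a uniform grid the kinetic-energy matrix is analytic, and the potential is diagonal at the grid points. The grid parameters below are illustrative assumptions.

```python
# A 1D sinc-DVR sketch (assumed grid parameters): solve the harmonic
# oscillator H = T + x^2/2 in atomic units; the lowest eigenvalues
# should approach 0.5, 1.5, 2.5, ...
import numpy as np

n, h = 101, 0.14                         # number of grid points and spacing
x = h * (np.arange(n) - n // 2)          # uniform grid centered at 0

i = np.arange(n)
d = i[:, None] - i[None, :]              # index differences i - j
# Analytic sinc-basis kinetic energy: pi^2/3 on the diagonal,
# 2(-1)^(i-j)/(i-j)^2 off it, all divided by 2h^2.
T = np.where(d == 0, np.pi**2 / 3.0,
             2.0 * (-1.0) ** d / np.where(d == 0, 1, d) ** 2) / (2.0 * h**2)

H = T + np.diag(0.5 * x**2)              # potential is diagonal in this basis
print(np.linalg.eigvalsh(H)[:4])         # ~ [0.5, 1.5, 2.5, 3.5]
```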
Matrix-free construction of HSS representation using adaptive randomized sampling
We present new algorithms for the randomized construction of hierarchically semi-separable (HSS) matrices, addressing several practical issues. The HSS construction algorithms use a partially matrix-free, adaptive randomized projection scheme to determine the maximum off-diagonal block rank. We develop both relative and absolute stopping criteria to determine the minimum dimension of the random projection matrix that is sufficient for the desired accuracy. Two strategies are discussed for adaptively enlarging the random sample matrix: repeated doubling of the number of random vectors, and iteratively incrementing the number of random vectors by a fixed number. The relative and absolute stopping criteria are based on probabilistic bounds for the Frobenius norm of the random projection of the Hankel blocks of the input matrix. We discuss the parallel implementation and the computation and communication costs of both variants. Parallel numerical results for a range of applications, including boundary element method matrices and quantum chemistry Toeplitz matrices, show the effectiveness, scalability, and numerical robustness of the proposed algorithms.
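The single-level sketch below illustrates the adaptive-sampling idea under simplifying assumptions (no hierarchy, no Hankel blocks): the random sample is grown by repeated doubling until a stochastic Frobenius-norm estimate of the residual meets a relative tolerance. The function `adaptive_range` and its parameters are hypothetical.

```python
# A single-level sketch: grow the random sample by doubling until a
# stochastic estimate of ||A - Q Q^T A||_F drops below rtol * ||A||_F.
# Uses E||B w||^2 = ||B||_F^2 for a standard Gaussian vector w.
import numpy as np

def adaptive_range(matvec, n, rtol=1e-6, k0=8, probes=8, seed=0):
    """Orthonormal Q capturing the range of A, given only X -> A @ X."""
    rng = np.random.default_rng(seed)
    Y = matvec(rng.standard_normal((n, k0)))          # initial sample A @ Omega
    while True:
        Q, _ = np.linalg.qr(Y)
        Z = matvec(rng.standard_normal((n, probes)))  # fresh probe vectors
        est_norm = np.linalg.norm(Z) / np.sqrt(probes)            # ~ ||A||_F
        est_res = np.linalg.norm(Z - Q @ (Q.T @ Z)) / np.sqrt(probes)
        if est_res <= rtol * est_norm or Q.shape[1] >= n:
            return Q
        # Relative criterion not met: double the number of random vectors.
        Y = np.hstack([Y, matvec(rng.standard_normal((n, Y.shape[1])))])

# Example on an exactly rank-20 symmetric matrix
n = 400
G = np.random.default_rng(1).standard_normal((n, 20))
A = G @ G.T
Q = adaptive_range(lambda X: A @ X, n)
print(Q.shape[1], np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A))
```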
Low-Rank Kernel Matrix Approximation Using Skeletonized Interpolation With Endo- or Exo-Vertices
The efficient compression of kernel matrices, for instance the off-diagonal blocks of discretized integral equations, is a crucial step in many algorithms. In this paper, we study the application of Skeletonized Interpolation to construct such factorizations. In particular, we study four different strategies for selecting the initial candidate pivots of the algorithm: Chebyshev grids, points on a sphere, maximally-dispersed vertices, and random vertices. The first two introduce new interpolation points (exo-vertices), while the last two are subsets of the given clusters (endo-vertices). We perform experiments using three real-world problems coming from the multiphysics code LS-DYNA. The pivot selection strategies are compared in terms of quality (final rank) and efficiency (size of the initial grid). These benchmarks demonstrate that, overall, maximally-dispersed vertices provide accurate and efficient sets of pivots for most applications, reaching near-optimal ranks while starting from relatively small sets of vertices compared to the other strategies.
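A minimal sketch of the maximally-dispersed strategy, implemented here as greedy farthest-point sampling (an assumption; the paper does not prescribe this exact procedure), without the Skeletonized Interpolation step itself:

```python
# Greedy farthest-point sampling: each new pivot maximizes its distance
# to the set already chosen, giving a well-dispersed subset of a cluster.
import numpy as np

def maximally_dispersed(points, k, seed=0):
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))       # farthest point from the current set
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

pts = np.random.default_rng(2).random((1000, 3))   # a cluster of 3D vertices
print(maximally_dispersed(pts, 8))                 # indices of 8 candidate pivots
```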
A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization
We present a distributed-memory library for computations with dense structured matrices. A matrix is considered structured if its off-diagonal blocks can be approximated by matrices of low numerical rank. Here, we use Hierarchically Semi-Separable (HSS) representations. Such matrices appear in many applications, e.g., finite element methods and boundary element methods. Exploiting this structure allows for fast solution of linear systems and/or fast computation of matrix-vector products, which are the two main building blocks of matrix computations. The compression algorithm we use, which computes the HSS form of an input dense matrix, relies on randomized sampling with a novel adaptive sampling mechanism. We discuss the parallelization of this algorithm, and also present the parallelization of the structured matrix-vector product, structured factorization, and solution routines. The efficiency of the approach is demonstrated on large problems from different academic and industrial applications, on up to 8,000 cores. This work is part of a broader effort, the STRUMPACK (STRUctured Matrices PACKage) software package for computations with sparse and dense structured matrices. Hence, although useful in their own right, the routines also represent a step toward a distributed-memory sparse solver.
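To show the structure being exploited, here is a flat, one-level sketch (a true HSS form is recursive, with nested bases): the off-diagonal blocks are replaced by truncated SVDs, after which a matrix-vector product touches only the diagonal blocks and the low-rank factors.

```python
# Flat 2x2 illustration: compress the off-diagonal blocks by truncated SVD,
# then apply the matrix using only diagonal blocks and low-rank factors.
import numpy as np

def compress(B, tol=1e-10):
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    k = int(np.sum(s > tol * s[0]))      # numerical rank at tolerance tol
    return U[:, :k] * s[:k], Vt[:k]      # B ~ (U s) @ Vt

rng = np.random.default_rng(3)
n, k = 512, 6
m = n // 2
A11, A22 = rng.standard_normal((m, m)), rng.standard_normal((m, m))
A12 = rng.standard_normal((m, k)) @ rng.standard_normal((k, m))  # low rank
A21 = rng.standard_normal((m, k)) @ rng.standard_normal((k, m))
A = np.block([[A11, A12], [A21, A22]])

U12, V12 = compress(A12)
U21, V21 = compress(A21)

def structured_matvec(x):
    x1, x2 = x[:m], x[m:]
    return np.concatenate([A11 @ x1 + U12 @ (V12 @ x2),
                           U21 @ (V21 @ x1) + A22 @ x2])

x = rng.standard_normal(n)
print(np.linalg.norm(A @ x - structured_matvec(x)))  # ~ machine precision
```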
An efficient hybrid tridiagonal divide-and-conquer algorithm on distributed memory architectures
In this paper, an efficient divide-and-conquer (DC) algorithm is proposed for symmetric tridiagonal matrices, based on ScaLAPACK and hierarchically semiseparable (HSS) matrices, an important class of rank-structured matrices. Most of the DC algorithm's time is spent computing the eigenvectors via matrix-matrix multiplications (MMM). In our parallel hybrid DC (PHDC) algorithm, the MMM is accelerated with HSS matrix techniques when the intermediate matrix is large. All the HSS operations are performed via the STRUMPACK package. PHDC has been tested on many different matrices. Compared with the DC implementation in MKL, PHDC can be faster for some matrices with few deflations when using hundreds of processes; however, the gains decrease as the number of processes increases. Comparisons of PHDC with ELPA (the Eigenvalue soLvers for Petascale Applications library) are similar. PHDC is usually slower than MKL and ELPA when using 300 or more processes on the Tianhe-2 supercomputer.
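As a cost illustration only (PHDC uses HSS matrices; the "diagonal plus low-rank" form below is a simpler stand-in): multiplying an n x n matrix by a structured factor costs O(n^2 k) operations rather than the O(n^3) of a dense MMM.

```python
# Cost illustration with a stand-in structured factor B = D + U V^T:
# Q @ B can be formed as Q * D + (Q @ U) @ V^T in O(n^2 k) operations,
# versus O(n^3) for the dense product.
import numpy as np

n, k = 1500, 20
rng = np.random.default_rng(4)
Q = rng.standard_normal((n, n))
D = rng.standard_normal(n)
U, V = rng.standard_normal((n, k)), rng.standard_normal((n, k))

dense = Q @ (np.diag(D) + U @ V.T)       # O(n^3) dense MMM
fast = Q * D + (Q @ U) @ V.T             # O(n^2 k): scaling plus tall products
print(np.linalg.norm(dense - fast) / np.linalg.norm(dense))
```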
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling
We present a sparse linear system solver based on a multifrontal variant of Gaussian elimination that exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to sevenfold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared-memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed-memory component for dense rank-structured matrices.
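The two ingredients named here, randomized sampling and interpolative decomposition (ID), can be tried in isolation with SciPy's scipy.linalg.interpolative module; the sketch below compresses a single low-rank block and is far from the full HSS-structured multifrontal machinery.

```python
# Compress one low-rank block with a randomized interpolative decomposition:
# B is approximated by k of its own columns times an interpolation matrix.
import numpy as np
import scipy.linalg.interpolative as sli

rng = np.random.default_rng(5)
B = rng.standard_normal((300, 8)) @ rng.standard_normal((8, 300))  # rank 8

k, idx, proj = sli.interp_decomp(B, 1e-10)     # rank detected at tolerance
B_approx = sli.reconstruct_matrix_from_id(B[:, idx[:k]], idx, proj)
print(k, np.linalg.norm(B - B_approx) / np.linalg.norm(B))  # 8, ~1e-15
```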