Inverse Preconditioners for Sparse Matrices using CUDA and GPGPU

S. Xu, W. Xue, H.X. Lin

    Research output: Contribution to journal › Article › Scientific › peer-review

    Abstract

    The numerical solution of sparse linear systems arises in many scientific applications, and iterative solvers with suitable preconditioning techniques are usually adopted. Due to the irregularity of their memory access patterns and their nondeterministic branching behavior, sparse matrix algorithms pose unique problems for emerging GPU-based platforms.

    This paper discusses the techniques involved in applying GPGPU computing with CUDA to the generation of right-looking approximate inverse (AINV) preconditioners and to GMRES preconditioned with them. Parallel algorithms are designed for generating AINV preconditioners with either a predefined sparsity pattern or dynamic fill-in control schemes, and a detailed complexity analysis is provided for the dynamic fill-in control. Shared-memory-based optimizations yield a 200% speedup in sparse vector inner products and sparse vector updates, while the overall speedups over parallel CPU-based implementations are 3-fold and 6-fold, respectively. The AINV-preconditioned GMRES iteration achieves a 7- to 9-fold speedup, owing to the high efficiency of matrix-vector-product-based preconditioning. The developed techniques also apply to other Krylov solvers and preconditioners.
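    A key point of the abstract is that applying an approximate inverse preconditioner M ≈ A⁻¹ reduces to a sparse matrix-vector product, and that sparse vector inner products are a core kernel of AINV generation. The following is a minimal serial sketch of both operations (plain Python, not the paper's CUDA code; the CSR layout and function names are illustrative assumptions, and the GPU versions would parallelize the loops across threads):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = M @ x for a sparse matrix M in CSR format.

    Applying an approximate inverse preconditioner is exactly this
    operation; each row i is independent, which is why it maps well
    onto one GPU thread (or warp) per row.
    """
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def sparse_dot(a, b):
    """Inner product of two sparse vectors given as {index: value} dicts.

    This is the kind of kernel the paper accelerates with shared memory;
    here it is shown serially. Iterating over the shorter vector keeps
    the cost proportional to the smaller number of nonzeros.
    """
    if len(b) < len(a):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)
```

    For example, with M = diag(2, 3) stored as values=[2.0, 3.0], col_idx=[0, 1], row_ptr=[0, 1, 2], the call spmv_csr(..., [1.0, 1.0]) returns [2.0, 3.0].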
    Original language: English
    Pages (from-to): 475-500
    Number of pages: 26
    Journal: Journal of Algorithms and Computational Technology
    Volume: 5
    Issue number: 3
    Publication status: Published - 2011

    Fingerprint

    GPGPU
    Sparse matrix
    Preconditioner
    Approximate Inverse
    Data storage equipment
    Speedup
    GMRES
    Cross product
    Parallel algorithms
    Program processors
    Linear systems
    Preconditioning Techniques
    Iterative Solvers
    Complexity Analysis
    Matrix Product
    Irregularity
    Preconditioning
    Shared Memory
    Sparsity
    Scalar, inner or dot product

    Cite this

    @article{5d3adb148a6a432db6e99197a53bf741,
    title = "Inverse Preconditioners for Sparse Matrices using CUDA and GPGPU",
    abstract = "The numerical solution of sparse linear systems arises in many scientific applications, and iterative solvers with suitable preconditioning techniques are usually adopted. Due to the irregularity of their memory access patterns and their nondeterministic branching behavior, sparse matrix algorithms pose unique problems for emerging GPU-based platforms. This paper discusses the techniques involved in applying GPGPU computing with CUDA to the generation of right-looking approximate inverse (AINV) preconditioners and to GMRES preconditioned with them. Parallel algorithms are designed for generating AINV preconditioners with either a predefined sparsity pattern or dynamic fill-in control schemes, and a detailed complexity analysis is provided for the dynamic fill-in control. Shared-memory-based optimizations yield a 200{\%} speedup in sparse vector inner products and sparse vector updates, while the overall speedups over parallel CPU-based implementations are 3-fold and 6-fold, respectively. The AINV-preconditioned GMRES iteration achieves a 7- to 9-fold speedup, owing to the high efficiency of matrix-vector-product-based preconditioning. The developed techniques also apply to other Krylov solvers and preconditioners.",
    author = "S. Xu and W. Xue and H.X. Lin",
    note = "Pagination: 26",
    year = "2011",
    language = "English",
    volume = "5",
    pages = "475--500",
    journal = "Journal of Algorithms and Computational Technology",
    issn = "1748-3018",
    publisher = "Multi-Science Publishing Co. Ltd",
    number = "3",

    }

    Inverse Preconditioners for Sparse Matrices using CUDA and GPGPU. / Xu, S.; Xue, W.; Lin, H.X.

    In: Journal of Algorithms and Computational Technology, Vol. 5, No. 3, 2011, p. 475-500.

    Research output: Contribution to journal › Article › Scientific › peer-review

    TY - JOUR

    T1 - Inverse Preconditioners for Sparse Matrices using CUDA and GPGPU

    AU - Xu, S.

    AU - Xue, W.

    AU - Lin, H.X.

    N1 - Pagination: 26

    PY - 2011

    Y1 - 2011

    N2 - The numerical solution of sparse linear systems arises in many scientific applications, and iterative solvers with suitable preconditioning techniques are usually adopted. Due to the irregularity of their memory access patterns and their nondeterministic branching behavior, sparse matrix algorithms pose unique problems for emerging GPU-based platforms. This paper discusses the techniques involved in applying GPGPU computing with CUDA to the generation of right-looking approximate inverse (AINV) preconditioners and to GMRES preconditioned with them. Parallel algorithms are designed for generating AINV preconditioners with either a predefined sparsity pattern or dynamic fill-in control schemes, and a detailed complexity analysis is provided for the dynamic fill-in control. Shared-memory-based optimizations yield a 200% speedup in sparse vector inner products and sparse vector updates, while the overall speedups over parallel CPU-based implementations are 3-fold and 6-fold, respectively. The AINV-preconditioned GMRES iteration achieves a 7- to 9-fold speedup, owing to the high efficiency of matrix-vector-product-based preconditioning. The developed techniques also apply to other Krylov solvers and preconditioners.

    AB - The numerical solution of sparse linear systems arises in many scientific applications, and iterative solvers with suitable preconditioning techniques are usually adopted. Due to the irregularity of their memory access patterns and their nondeterministic branching behavior, sparse matrix algorithms pose unique problems for emerging GPU-based platforms. This paper discusses the techniques involved in applying GPGPU computing with CUDA to the generation of right-looking approximate inverse (AINV) preconditioners and to GMRES preconditioned with them. Parallel algorithms are designed for generating AINV preconditioners with either a predefined sparsity pattern or dynamic fill-in control schemes, and a detailed complexity analysis is provided for the dynamic fill-in control. Shared-memory-based optimizations yield a 200% speedup in sparse vector inner products and sparse vector updates, while the overall speedups over parallel CPU-based implementations are 3-fold and 6-fold, respectively. The AINV-preconditioned GMRES iteration achieves a 7- to 9-fold speedup, owing to the high efficiency of matrix-vector-product-based preconditioning. The developed techniques also apply to other Krylov solvers and preconditioners.

    M3 - Article

    VL - 5

    SP - 475

    EP - 500

    JO - Journal of Algorithms and Computational Technology

    JF - Journal of Algorithms and Computational Technology

    SN - 1748-3018

    IS - 3

    ER -