MAGMA 2.9.0
Matrix Algebra for GPU and Multicore Architectures
Functions | |
magma_int_t | magma_corderstatistics (magmaFloatComplex *val, magma_int_t length, magma_int_t k, magma_int_t r, magmaFloatComplex *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array. | |
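As an illustration of the order-statistics idea, the following is a minimal sketch of k-th smallest selection by in-place partitioning (quickselect). It is not the MAGMA implementation: the function name `kth_smallest` is hypothetical, k is 0-based, and real `float` values stand in for `magmaFloatComplex`, which would presumably be compared by magnitude.

```c
#include <assert.h>

/* Swap two floats in place. */
static void swap_f(float *a, float *b) { float t = *a; *a = *b; *b = t; }

/* Hypothetical helper (not part of MAGMA): returns the k-th smallest value
 * (k is 0-based) of val[0..length-1]. The array is partially reordered. */
float kth_smallest(float *val, int length, int k)
{
    assert(length > 0 && k >= 0 && k < length);
    int lo = 0, hi = length - 1;
    while (lo < hi) {
        float pivot = val[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        /* Hoare-style partition around the pivot value. */
        while (i <= j) {
            while (val[i] < pivot) i++;
            while (val[j] > pivot) j--;
            if (i <= j) { swap_f(&val[i], &val[j]); i++; j--; }
        }
        /* Continue only in the side that contains index k. */
        if (k <= j)      hi = j;
        else if (k >= i) lo = i;
        else             return val[k];   /* k sits between the two sides */
    }
    return val[k];
}
```

Selecting the k-th largest element reduces to the same call with index `length - 1 - k`.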
magma_int_t | magma_corderstatistics_inc (magmaFloatComplex *val, magma_int_t length, magma_int_t k, magma_int_t inc, magma_int_t r, magmaFloatComplex *element, magma_queue_t queue) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc. | |
magma_int_t | magma_cmorderstatistics (magmaFloatComplex *val, magma_index_t *col, magma_index_t *row, magma_int_t length, magma_int_t k, magma_int_t r, magmaFloatComplex *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front. | |
magma_int_t | magma_cpartition (magmaFloatComplex *a, magma_int_t size, magma_int_t pivot, magma_queue_t queue) |
magma_int_t | magma_cmedian5 (magmaFloatComplex *a, magma_queue_t queue) |
magma_int_t | magma_cselect (magmaFloatComplex *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_cselectrandom (magmaFloatComplex *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_cdomainoverlap (magma_index_t num_rows, magma_int_t *num_indices, magma_index_t *rowptr, magma_index_t *colidx, magma_index_t *x, magma_queue_t queue) |
Generates the update list. | |
magma_int_t | magma_cvspread (magma_c_matrix *x, const char *filename, magma_queue_t queue) |
Reads in a sparse vector-block stored in COO format. | |
magma_int_t | magma_cdiameter (magma_c_matrix *A, magma_queue_t queue) |
Computes the diameter of a sparse matrix and stores the value in diameter. | |
magma_int_t | magma_cparilusetup (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Prepares the ILU preconditioner via the iterative ILU iteration. | |
magma_int_t | magma_cparilu_gpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_cparilu_cpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_cparic_gpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_cparic_cpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_cparicsetup (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Prepares the IC preconditioner via the iterative IC iteration. | |
magma_int_t | magma_cparicupdate (magma_c_matrix A, magma_c_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_capplyiteric_l (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
Performs the left triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_capplyiteric_r (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
Performs the right triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_cparilu_csr (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_cpariluupdate (magma_c_matrix A, magma_c_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_cparic_csr (magma_c_matrix A, magma_c_matrix A_CSR, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_cnonlinres (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *LU, real_Double_t *res, magma_queue_t queue) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
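A minimal dense sketch of the residual computation, for orientation only. The helper name `residual_lu` is hypothetical, the matrices are dense row-major `float` arrays rather than `magma_c_matrix`, and no sparsity pattern is taken into account. The function returns the squared Frobenius norm so that the sketch does not depend on the math library; taking the square root gives the norm itself.

```c
#include <assert.h>

/* Hypothetical dense helper (not part of MAGMA): R = A - L*U for n-by-n
 * row-major matrices; returns the squared Frobenius norm of R. */
float residual_lu(int n, const float *A, const float *L, const float *U,
                  float *R)
{
    assert(n > 0);
    float sq = 0.0f;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            float lu = 0.0f;
            for (int k = 0; k < n; k++)
                lu += L[i * n + k] * U[k * n + j];   /* (L*U)(i,j) */
            R[i * n + j] = A[i * n + j] - lu;
            sq += R[i * n + j] * R[i * n + j];
        }
    }
    return sq;
}
```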
magma_int_t | magma_cilures (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_cicres (magma_c_matrix A, magma_c_matrix C, magma_c_matrix CT, magma_c_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_cinitguess (magma_c_matrix A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
Computes an initial guess for the ParILU/ParIC. | |
magma_int_t | magma_cinitrecursiveLU (magma_c_matrix A, magma_c_matrix *B, magma_queue_t queue) |
Following the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A containing the approximate factors (L and U as well), computes a matrix with one higher level of fill-in, inserts the original approximation as initial guess, and provides the factors L and U, also filled with the scaled initial guess. | |
magma_int_t | magma_cmLdiagadd (magma_c_matrix *L, magma_queue_t queue) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal. | |
magma_int_t | magma_cmatrix_cup (magma_c_matrix A, magma_c_matrix B, magma_c_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_cmatrix_cup_gpu (magma_c_matrix A, magma_c_matrix B, magma_c_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_cmatrix_cap (magma_c_matrix A, magma_c_matrix B, magma_c_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\). | |
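The pattern intersection can be pictured row by row. The sketch below intersects the column indices of one CSR row of A with the matching row of B, assuming both index lists are sorted in ascending order. The name `row_pattern_cap` is hypothetical, and how MAGMA handles the values of matching entries is not covered here.

```c
#include <assert.h>

/* Hypothetical helper (not part of MAGMA): intersects two ascending
 * column-index lists (one CSR row each). Writes the common indices to
 * out, which must hold at least min(na, nb) entries; returns the count. */
int row_pattern_cap(const int *a, int na, const int *b, int nb, int *out)
{
    assert(na >= 0 && nb >= 0);
    int i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j])      i++;          /* index only in A */
        else if (a[i] > b[j]) j++;          /* index only in B */
        else { out[n++] = a[i]; i++; j++; } /* index in both   */
    }
    return n;
}
```

The union \(A \cup B\) follows the same merge loop, emitting an index in all three branches and appending the leftover tail of either list.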
magma_int_t | magma_cmatrix_negcap (magma_c_matrix A, magma_c_matrix B, magma_c_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of A but not of B. | |
magma_int_t | magma_cmatrix_tril_negcap (magma_c_matrix A, magma_c_matrix B, magma_c_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of tril(A) but not of B. | |
magma_int_t | magma_cmatrix_triu_negcap (magma_c_matrix A, magma_c_matrix B, magma_c_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being part of triu(A) but not of B. | |
magma_int_t | magma_cmatrix_abssum (magma_c_matrix A, float *sum, magma_queue_t queue) |
Computes the sum of the absolute values in a matrix. | |
magma_int_t | magma_cparilut_thrsrm (magma_int_t order, magma_c_matrix *A, float *thrs, magma_queue_t queue) |
Removes every element whose absolute value is less than or equal to (or greater than or equal to) thrs from the matrix, then compacts the storage. | |
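A minimal sketch of threshold removal with compaction on a CSR matrix, under stated assumptions: the helper name `csr_drop_small` is hypothetical, values are real `float` instead of `magmaFloatComplex`, and only the branch that drops small entries (magnitude less than or equal to `thrs`) is shown.

```c
#include <assert.h>

/* Absolute value without relying on the math library. */
static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical helper (not part of MAGMA): drops all entries with
 * |val| <= thrs from a CSR matrix in place, compacts col/val, rewrites
 * rowptr, and returns the new number of nonzeros. */
int csr_drop_small(int num_rows, int *rowptr, int *col, float *val,
                   float thrs)
{
    assert(num_rows >= 0);
    int out = 0;                 /* next free slot in the compacted arrays */
    int begin = rowptr[0];       /* old start of the current row           */
    for (int i = 0; i < num_rows; i++) {
        int end = rowptr[i + 1]; /* old end, read before it is overwritten */
        rowptr[i] = out;
        for (int p = begin; p < end; p++) {
            if (abs_f(val[p]) > thrs) {   /* keep entries above thrs */
                col[out] = col[p];
                val[out] = val[p];
                out++;
            }
        }
        begin = end;
    }
    rowptr[num_rows] = out;
    return out;
}
```

Because the write index never overtakes the read index, the compaction is safe without a scratch buffer.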
magma_int_t | magma_cparilut_thrsrm_semilinked (magma_c_matrix *U, magma_c_matrix *UT, float *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than thrs from the matrix. | |
magma_int_t | magma_cparilut_rmselected (magma_c_matrix R, magma_c_matrix *A, magma_queue_t queue) |
Removes a selected list of elements from the matrix. | |
magma_int_t | magma_cparilut_selectoneperrow (magma_int_t order, magma_c_matrix *A, magma_c_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_cparilut_selecttwoperrow (magma_int_t order, magma_c_matrix *A, magma_c_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_cparilut_selectoneperrowthrs_lower (magma_c_matrix L, magma_c_matrix U, magma_c_matrix *A, float rtol, magma_c_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_cparilut_selectoneperrowthrs_upper (magma_c_matrix L, magma_c_matrix U, magma_c_matrix *A, float rtol, magma_c_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_cparilut_selectonepercol (magma_int_t order, magma_c_matrix *A, magma_c_matrix *oneA, magma_queue_t queue) |
magma_int_t | magma_cparilut_transpose_select_one (magma_c_matrix A, magma_c_matrix *B, magma_queue_t queue) |
This is a special routine with very limited scope. | |
magma_int_t | magma_cparilut_insert_LU (magma_int_t num_rm, magma_index_t *rm_loc, magma_index_t *rm_loc2, magma_c_matrix *LU_new, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
magma_int_t | magma_cparilut_set_thrs (magma_int_t num_rm, magma_c_matrix *LU, magma_int_t order, magmaFloatComplex *thrs, magma_queue_t queue) |
magma_int_t | magma_cparilut_set_approx_thrs (magma_int_t num_rm, magma_c_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_cparilut_set_thrs_randomselect (magma_int_t num_rm, magma_c_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_cparilut_set_thrs_randomselect_approx (magma_int_t num_rm, magma_c_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_cparilut_set_thrs_randomselect_factors (magma_int_t num_rm, magma_c_matrix *L, magma_c_matrix *U, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_cparilut_set_exact_thrs (magma_int_t num_rm, magma_c_matrix *LU, magma_int_t order, magmaFloatComplex *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_cparilut_set_approx_thrs_inc (magma_int_t num_rm, magma_c_matrix *LU, magma_int_t order, magmaFloatComplex *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_cparilut_LU_approx_thrs (magma_int_t num_rm, magma_c_matrix *L, magma_c_matrix *U, magma_int_t order, magmaFloatComplex *thrs, magma_queue_t queue) |
magma_int_t | magma_cparilut_reorder (magma_c_matrix *LU, magma_queue_t queue) |
This routine reorders the matrix (inplace) for easier access. | |
magma_int_t | magma_cparict_sweep (magma_c_matrix *A, magma_c_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_cparilut_zero (magma_c_matrix *A, magma_queue_t queue) |
magma_int_t | magma_cparilu_sweep (magma_c_matrix A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
This function does one asynchronous ParILU sweep. | |
magma_int_t | magma_cparilu_sweep_sync (magma_c_matrix A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
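To make the notion of a sweep concrete, here is a dense sketch of one synchronized fixed-point ILU(0) sweep. It assumes the element-wise update commonly associated with fine-grained parallel ILU: for (i,j) in the pattern, l_ij = (a_ij - sum_{k<j} l_ik u_kj) / u_jj if i > j, and u_ij = a_ij - sum_{k<i} l_ik u_kj otherwise. The name `ilu0_sweep_sync`, the dense row-major storage, the `MAXN` limit, and the choice of pattern (nonzeros of A) are all assumptions of this sketch, not the MAGMA implementation.

```c
#include <assert.h>
#include <string.h>

#define MAXN 8   /* illustrative size limit for the scratch copies */

/* Hypothetical dense sketch (not part of MAGMA) of one synchronized
 * fixed-point ILU(0) sweep. A, L, U are n-by-n row-major; the pattern is
 * the set of nonzeros of A; L keeps a unit diagonal. Every update reads
 * the previous iterate only, which is what makes the sweep synchronized. */
void ilu0_sweep_sync(int n, const float *A, float *L, float *U)
{
    assert(n > 0 && n <= MAXN);
    float Lold[MAXN * MAXN], Uold[MAXN * MAXN];
    memcpy(Lold, L, sizeof(float) * n * n);
    memcpy(Uold, U, sizeof(float) * n * n);

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (A[i * n + j] == 0.0f) continue;  /* outside the pattern */
            int m = (i < j) ? i : j;             /* sum over k < min(i,j) */
            float s = A[i * n + j];
            for (int k = 0; k < m; k++)
                s -= Lold[i * n + k] * Uold[k * n + j];
            if (i > j) L[i * n + j] = s / Uold[j * n + j];
            else       U[i * n + j] = s;
        }
    }
}
```

An asynchronous sweep would read and write L and U directly, without the `Lold`/`Uold` copies, so that updates may observe values already refreshed in the same sweep.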
magma_int_t | magma_cparic_sweep (magma_c_matrix A, magma_c_matrix *L, magma_queue_t queue) |
This function does one asynchronous ParILU sweep (symmetric case). | |
magma_int_t | magma_cparic_sweep_sync (magma_c_matrix A, magma_c_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep (symmetric case). | |
magma_int_t | magma_cparict_sweep_sync (magma_c_matrix *A, magma_c_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_cparilut_sweep_sync (magma_c_matrix *A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_cparilut_sweep_gpu (magma_c_matrix *A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_cparilut_residuals_gpu (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *R, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_cthrsholdrm_gpu (magma_int_t order, magma_c_matrix *A, float *thrs, magma_queue_t queue) |
magma_int_t | magma_cget_row_ptr (const magma_int_t num_rows, magma_int_t *nnz, const magma_index_t *rowidx, magma_index_t *rowptr, magma_queue_t queue) |
magma_int_t | magma_cparilut_align_residuals (magma_c_matrix L, magma_c_matrix U, magma_c_matrix *Lnew, magma_c_matrix *Unew, magma_queue_t queue) |
This function scales the residuals of a lower triangular factor L with the diagonal of U. | |
magma_int_t | magma_cparilut_preselect_scale (magma_c_matrix *L, magma_c_matrix *oneL, magma_c_matrix *U, magma_c_matrix *oneU, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_cparilut_thrsrm_U (magma_int_t order, magma_c_matrix L, magma_c_matrix *A, float *thrs, magma_queue_t queue) |
Removes every element whose absolute value is less than or equal to (or greater than or equal to) thrs from the matrix, then compacts the storage. | |
magma_int_t | magma_cparilut_residuals (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *L_new, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_cparilut_residuals_transpose (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_cparilut_residuals_semilinked (magma_c_matrix A, magma_c_matrix L, magma_c_matrix US, magma_c_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_cparilut_sweep_semilinked (magma_c_matrix *A, magma_c_matrix *L, magma_c_matrix *US, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_cparilut_sweep_list (magma_c_matrix *A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_cparilut_residuals_list (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_cparilut_sweep_linkedlist (magma_c_matrix *A, magma_c_matrix *L, magma_c_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_cparilut_residuals_linkedlist (magma_c_matrix A, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_cparilut_colmajor (magma_c_matrix A, magma_c_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_cparilut_colmajorup (magma_c_matrix A, magma_c_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_cparict (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete Cholesky preconditioner. | |
magma_int_t | magma_cparict_cpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_cparilut (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete LU preconditioner. | |
magma_int_t | magma_cparilut_cpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_cparilut_gpu (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_cparilut_gpu_nodp (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_cparilut_insert (magma_int_t *num_rmL, magma_int_t *num_rmU, magma_index_t *rm_locL, magma_index_t *rm_locU, magma_c_matrix *L_new, magma_c_matrix *U_new, magma_c_matrix *L, magma_c_matrix *U, magma_c_matrix *UR, magma_queue_t queue) |
Inserts a new element into the (empty) place for the iterative dynamic ILU. | |
magma_int_t | magma_cparilut_create_collinkedlist (magma_c_matrix A, magma_c_matrix *B, magma_queue_t queue) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements. | |
magma_int_t | magma_cparilut_candidates (magma_c_matrix L0, magma_c_matrix U0, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *L_new, magma_c_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_cparilut_candidates_gpu (magma_c_matrix L0, magma_c_matrix U0, magma_c_matrix L, magma_c_matrix U, magma_c_matrix *L_new, magma_c_matrix *U_new, magma_queue_t queue) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors. | |
magma_int_t | magma_cparict_candidates (magma_c_matrix L0, magma_c_matrix L, magma_c_matrix LT, magma_c_matrix *L_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_cparilut_candidates_semilinked (magma_c_matrix L0, magma_c_matrix U0, magma_c_matrix L, magma_c_matrix U, magma_c_matrix UT, magma_c_matrix *L_new, magma_c_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_cparilut_candidates_linkedlist (magma_c_matrix L0, magma_c_matrix U0, magma_c_matrix L, magma_c_matrix U, magma_c_matrix UR, magma_c_matrix *L_new, magma_c_matrix *U_new, magma_queue_t queue) |
magma_int_t | magma_cparilut_rm_thrs (float *thrs, magma_int_t *num_rm, magma_c_matrix *LU, magma_c_matrix *LU_new, magma_index_t *rm_loc, magma_queue_t queue) |
This routine removes matrix entries from the structure that are smaller than the threshold. | |
magma_int_t | magma_cparilut_count (magma_c_matrix L, magma_int_t *num, magma_queue_t queue) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format. | |
magma_int_t | magma_cparilut_randlist (magma_c_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_cparilut_select_candidates_L (magma_int_t *num_rm, magma_index_t *rm_loc, magma_c_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_cparilut_select_candidates_U (magma_int_t *num_rm, magma_index_t *rm_loc, magma_c_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_cparilut_preselect (magma_int_t order, magma_c_matrix *A, magma_c_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_cpreselect_gpu (magma_int_t order, magma_c_matrix *A, magma_c_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_csampleselect (magma_int_t total_size, magma_int_t subset_size, magmaFloatComplex *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_csampleselect_approx (magma_int_t total_size, magma_int_t subset_size, magmaFloatComplex *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_csampleselect_nodp (magma_int_t total_size, magma_int_t subset_size, magmaFloatComplex *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_csampleselect_approx_nodp (magma_int_t total_size, magma_int_t subset_size, magmaFloatComplex *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_cmprepare_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix L, magma_c_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaFloatComplex *trisystems, magmaFloatComplex *rhs, magma_queue_t queue) |
Takes a sparse matrix and generates three arrays: one containing the sizes of the different systems, one containing the indices of the locations in the sparse matrix where the data comes from and goes back to, and one containing all the sparse triangular systems. | |
magma_int_t | magma_cmtrisolve_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix L, magma_c_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaFloatComplex *trisystems, magmaFloatComplex *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_cmbackinsert_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix *M, magma_index_t *sizes, magma_index_t *locations, magmaFloatComplex *trisystems, magmaFloatComplex *rhs, magma_queue_t queue) |
Inserts the values into the preconditioner matrix. | |
magma_int_t | magma_cmprepare_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix L, magma_c_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaFloatComplex *trisystems, magmaFloatComplex *rhs, magma_queue_t queue) |
magma_int_t | magma_cmtrisolve_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix L, magma_c_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaFloatComplex *trisystems, magmaFloatComplex *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_cmbackinsert_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix *M, magma_index_t *sizes, magma_index_t *locations, magmaFloatComplex *trisystems, magmaFloatComplex *rhs, magma_queue_t queue) |
magma_int_t | magma_ciluisaisetup_lower (magma_c_matrix L, magma_c_matrix S, magma_c_matrix *ISAIL, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_ciluisaisetup_upper (magma_c_matrix U, magma_c_matrix S, magma_c_matrix *ISAIU, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_cicisaisetup (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_cisai_l (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_cisai_r (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_cisai_l_t (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_cisai_r_t (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_cmiluisai_sizecheck (magma_c_matrix A, magma_index_t batchsize, magma_index_t *maxsize, magma_queue_t queue) |
magma_int_t | magma_cgeisai_maxblock (magma_c_matrix L, magma_c_matrix *MT, magma_queue_t queue) |
This routine maximizes the pattern for the ISAI preconditioner. | |
magma_int_t | magma_cisai_generator_regs (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_c_matrix L, magma_c_matrix *M, magma_queue_t queue) |
This routine is designed to combine all kernels into one. | |
magma_int_t | magma_cmsupernodal (magma_int_t *max_bs, magma_c_matrix A, magma_c_matrix *S, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with block-size bs. | |
magma_int_t | magma_cmvarsizeblockstruct (magma_int_t n, magma_int_t *bs, magma_int_t bsl, magma_uplo_t uplotype, magma_c_matrix *A, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with variable block-size. | |
magma_int_t | magma_ctfqmr_unrolled (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a complex matrix. | |
magma_int_t | magma_cbicgstab_merge2 (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_cbicgstab_merge3 (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_cjacobidomainoverlap (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix. | |
magma_int_t | magma_cbaiter (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_c_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_cbaiter_overlap (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_c_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_cftjacobicontractions (magma_c_matrix xkm2, magma_c_matrix xkm1, magma_c_matrix xk, magma_c_matrix *z, magma_c_matrix *c, magma_queue_t queue) |
Computes the contraction coefficients c_i: | |
magma_int_t | magma_cftjacobiupdatecheck (float delta, magma_c_matrix *xold, magma_c_matrix *xnew, magma_c_matrix *zprev, magma_c_matrix c, magma_int_t *flag_t, magma_int_t *flag_fp, magma_queue_t queue) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper. | |
magma_int_t | magma_citerref (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_c_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix. | |
magma_int_t | magma_cjacobiiter_sys (magma_c_matrix A, magma_c_matrix b, magma_c_matrix d, magma_c_matrix t, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k. | |
magma_int_t | magma_cftjacobi (magma_c_matrix A, magma_c_matrix b, magma_c_matrix *x, magma_c_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k. | |
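The Jacobi update stated for the two routines above can be sketched for a CSR matrix as follows. This is an illustration under assumptions, not the MAGMA code: the helper name `jacobi_step` is hypothetical, values are real `float`, and the diagonal D is read from the matrix itself instead of being passed in precomputed form.

```c
#include <assert.h>

/* Hypothetical helper (not part of MAGMA): one Jacobi step on an n-by-n
 * CSR matrix, xnew = D^(-1) * (b - (L+U) * xold), where D is the diagonal
 * of A and L+U collects all off-diagonal entries. */
void jacobi_step(int n, const int *rowptr, const int *col, const float *val,
                 const float *b, const float *xold, float *xnew)
{
    for (int i = 0; i < n; i++) {
        float diag = 0.0f;
        float s = b[i];
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++) {
            if (col[p] == i) diag = val[p];             /* entry of D   */
            else             s -= val[p] * xold[col[p]]; /* entry of L+U */
        }
        assert(diag != 0.0f);   /* Jacobi needs a nonzero diagonal */
        xnew[i] = s / diag;
    }
}
```

Calling the helper repeatedly, swapping `xold` and `xnew` between calls, gives the plain iteration. With c = D^(-1) * b and M = D^(-1) * (L+U) this is the second form of the formula.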
magma_int_t | magma_cilut_saad (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_cilut_saad_apply (magma_c_matrix b, magma_c_matrix *x, magma_c_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_ccustomilusetup (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete LU preconditioner. | |
magma_int_t | magma_ccustomicsetup (magma_c_matrix A, magma_c_matrix b, magma_c_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete Cholesky preconditioner. | |
magma_int_t | magma_cbajac_csr (magma_int_t localiters, magma_c_matrix D, magma_c_matrix R, magma_c_matrix b, magma_c_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. | |
magma_int_t | magma_cbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_c_matrix *D, magma_c_matrix *R, magma_c_matrix b, magma_c_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. | |
magma_int_t | magma_cmlumerge (magma_c_matrix L, magma_c_matrix U, magma_c_matrix *A, magma_queue_t queue) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts. | |
magma_int_t | magma_cgeaxpy (magmaFloatComplex alpha, magma_c_matrix X, magmaFloatComplex beta, magma_c_matrix *Y, magma_queue_t queue) |
This routine computes Y = alpha * X + beta * Y on the GPU. | |
magma_int_t | magma_cgecsrreimsplit (magma_c_matrix A, magma_c_matrix *ReA, magma_c_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in CSR format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A. | |
magma_int_t | magma_cgedensereimsplit (magma_c_matrix A, magma_c_matrix *ReA, magma_c_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in DENSE format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A. | |
magma_int_t | magma_cgecsr5mv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t p, magmaFloatComplex alpha, magma_int_t sigma, magma_int_t bit_y_offset, magma_int_t bit_scansum_offset, magma_int_t num_packet, magmaUIndex_ptr dtile_ptr, magmaUIndex_ptr dtile_desc, magmaIndex_ptr dtile_desc_offset_ptr, magmaIndex_ptr dtile_desc_offset, magmaFloatComplex_ptr dcalibrator, magma_int_t tail_tile_start, magmaFloatComplex_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaFloatComplex_ptr dx, magmaFloatComplex beta, magmaFloatComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
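For reference, the operation y = alpha * A * x + beta * y looks as follows on the host for a matrix in plain CSR storage. The sketch does not reproduce the tiled CSR5 layout that the routine's parameters refer to. The helper name `csr_spmv` is hypothetical and real `float` values are used in place of `magmaFloatComplex`.

```c
#include <assert.h>

/* Hypothetical host-side helper (not part of MAGMA): computes
 * y = alpha * A * x + beta * y for an m-row matrix in plain CSR format. */
void csr_spmv(int m, float alpha, const int *rowptr, const int *col,
              const float *val, const float *x, float beta, float *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; i++) {
        float s = 0.0f;
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++)
            s += val[p] * x[col[p]];      /* row i of A times x */
        y[i] = alpha * s + beta * y[i];
    }
}
```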
magma_int_t | magma_ccopyscale (magma_int_t n, magma_int_t k, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dv, magmaFloatComplex_ptr dskp, magma_queue_t queue) |
Computes the correction term of the pipelined GMRES according to P. | |
magma_int_t | magma_scnrm2scale (magma_int_t m, magmaFloatComplex_ptr dr, magma_int_t lddr, magmaFloatComplex *drnorm, magma_queue_t queue) |
magma_int_t | magma_cjacobispmvupdate_bw (magma_int_t maxiter, magma_c_matrix A, magma_c_matrix t, magma_c_matrix b, magma_c_matrix d, magma_c_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_cjacobispmvupdateselect (magma_int_t maxiter, magma_int_t num_updates, magma_index_t *indices, magma_c_matrix A, magma_c_matrix t, magma_c_matrix b, magma_c_matrix d, magma_c_matrix tmp, magma_c_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_cmergeblockkrylov (magma_int_t num_rows, magma_int_t num_cols, magmaFloatComplex_ptr alpha, magmaFloatComplex_ptr p, magmaFloatComplex_ptr x, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_cbicgmerge1 (magma_int_t n, magmaFloatComplex_ptr dskp, magmaFloatComplex_ptr dv, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dp, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_cbicgmerge2 (magma_int_t n, magmaFloatComplex_ptr dskp, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dv, magmaFloatComplex_ptr ds, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_cbicgmerge3 (magma_int_t n, magmaFloatComplex_ptr dskp, magmaFloatComplex_ptr dp, magmaFloatComplex_ptr ds, magmaFloatComplex_ptr dt, magmaFloatComplex_ptr dx, magmaFloatComplex_ptr dr, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_cbicgmerge4 (magma_int_t type, magmaFloatComplex_ptr dskp, magma_queue_t queue) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU. | |
magma_int_t | magma_cbicgmerge_spmv1 (magma_c_matrix A, magmaFloatComplex_ptr dd1, magmaFloatComplex_ptr dd2, magmaFloatComplex_ptr dp, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dv, magmaFloatComplex_ptr dskp, magma_queue_t queue) |
Merges the first SpmV using CSR with the dot product and the computation of alpha. | |
magma_int_t | magma_cbicgmerge_spmv2 (magma_c_matrix A, magmaFloatComplex_ptr dd1, magmaFloatComplex_ptr dd2, magmaFloatComplex_ptr ds, magmaFloatComplex_ptr dt, magmaFloatComplex_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_cbicgmerge_xrbeta (magma_int_t n, magmaFloatComplex_ptr dd1, magmaFloatComplex_ptr dd2, magmaFloatComplex_ptr drr, magmaFloatComplex_ptr dr, magmaFloatComplex_ptr dp, magmaFloatComplex_ptr ds, magmaFloatComplex_ptr dt, magmaFloatComplex_ptr dx, magmaFloatComplex_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_cbcsrswp (magma_int_t n, magma_int_t size_b, magma_int_t *ipiv, magmaFloatComplex_ptr dx, magma_queue_t queue) |
magma_int_t | magma_cbcsrtrsv (magma_uplo_t uplo, magma_int_t r_blocks, magma_int_t c_blocks, magma_int_t size_b, magmaFloatComplex_ptr dA, magma_index_t *blockinfo, magmaFloatComplex_ptr dx, magma_queue_t queue) |
magma_int_t | magma_cbcsrvalcpy (magma_int_t size_b, magma_int_t num_blocks, magma_int_t num_zero_blocks, magmaFloatComplex_ptr *dAval, magmaFloatComplex_ptr *dBval, magmaFloatComplex_ptr *dBval2, magma_queue_t queue) |
magma_int_t | magma_cbcsrluegemm (magma_int_t size_b, magma_int_t num_block_rows, magma_int_t kblocks, magmaFloatComplex_ptr *dA, magmaFloatComplex_ptr *dB, magmaFloatComplex_ptr *dC, magma_queue_t queue) |
magma_int_t | magma_cbcsrlupivloc (magma_int_t size_b, magma_int_t kblocks, magmaFloatComplex_ptr *dA, magma_int_t *ipiv, magma_queue_t queue) |
magma_int_t | magma_cbcsrblockinfo5 (magma_int_t lustep, magma_int_t num_blocks, magma_int_t c_blocks, magma_int_t size_b, magma_index_t *blockinfo, magmaFloatComplex_ptr dval, magmaFloatComplex_ptr *AII, magma_queue_t queue) |
magma_int_t | magma_cthrsholdselect (magma_int_t sampling, magma_int_t total_size, magma_int_t subset_size, magmaFloatComplex *val, float *thrs, magma_queue_t queue) |
magma_int_t | magma_dorderstatistics (double *val, magma_int_t length, magma_int_t k, magma_int_t r, double *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array. | |
magma_int_t | magma_dorderstatistics_inc (double *val, magma_int_t length, magma_int_t k, magma_int_t inc, magma_int_t r, double *element, magma_queue_t queue) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc. | |
magma_int_t | magma_dmorderstatistics (double *val, magma_index_t *col, magma_index_t *row, magma_int_t length, magma_int_t k, magma_int_t r, double *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front. | |
magma_int_t | magma_dpartition (double *a, magma_int_t size, magma_int_t pivot, magma_queue_t queue) |
magma_int_t | magma_dmedian5 (double *a, magma_queue_t queue) |
magma_int_t | magma_dselect (double *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_dselectrandom (double *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_ddomainoverlap (magma_index_t num_rows, magma_int_t *num_indices, magma_index_t *rowptr, magma_index_t *colidx, magma_index_t *x, magma_queue_t queue) |
Generates the update list. | |
magma_int_t | magma_dvspread (magma_d_matrix *x, const char *filename, magma_queue_t queue) |
Reads in a sparse vector-block stored in COO format. | |
magma_int_t | magma_ddiameter (magma_d_matrix *A, magma_queue_t queue) |
Computes the diameter of a sparse matrix and stores the value in diameter. | |
magma_int_t | magma_dparilusetup (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Prepares the ILU preconditioner via the iterative ILU iteration. | |
magma_int_t | magma_dparilu_gpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_dparilu_cpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_dparic_gpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_dparic_cpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_dparicsetup (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Prepares the IC preconditioner via the iterative IC iteration. | |
magma_int_t | magma_dparicupdate (magma_d_matrix A, magma_d_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_dapplyiteric_l (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
Performs the left triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_dapplyiteric_r (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
Performs the right triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_dparilu_csr (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_dpariluupdate (magma_d_matrix A, magma_d_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_dparic_csr (magma_d_matrix A, magma_d_matrix A_CSR, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_dnonlinres (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *LU, real_Double_t *res, magma_queue_t queue) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_dilures (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_dicres (magma_d_matrix A, magma_d_matrix C, magma_d_matrix CT, magma_d_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_dinitguess (magma_d_matrix A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
Computes an initial guess for the ParILU/ParIC. | |
magma_int_t | magma_dinitrecursiveLU (magma_d_matrix A, magma_d_matrix *B, magma_queue_t queue) |
Using the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A containing the approximate factors (L and U as well), computes a matrix with one higher level of fill-in, inserts the original approximation as initial guess, and provides the factors L and U, also filled with the scaled initial guess. | |
magma_int_t | magma_dmLdiagadd (magma_d_matrix *L, magma_queue_t queue) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal. | |
magma_int_t | magma_dmatrix_cup (magma_d_matrix A, magma_d_matrix B, magma_d_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_dmatrix_cup_gpu (magma_d_matrix A, magma_d_matrix B, magma_d_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_dmatrix_cap (magma_d_matrix A, magma_d_matrix B, magma_d_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\). | |
magma_int_t | magma_dmatrix_negcap (magma_d_matrix A, magma_d_matrix B, magma_d_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of A but not of B. | |
magma_int_t | magma_dmatrix_tril_negcap (magma_d_matrix A, magma_d_matrix B, magma_d_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of tril(A) but not of B. | |
magma_int_t | magma_dmatrix_triu_negcap (magma_d_matrix A, magma_d_matrix B, magma_d_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being part of triu(A) but not of B. | |
magma_int_t | magma_dmatrix_abssum (magma_d_matrix A, double *sum, magma_queue_t queue) |
Computes the sum of the absolute values in a matrix. | |
magma_int_t | magma_dparilut_thrsrm (magma_int_t order, magma_d_matrix *A, double *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than or equal to, or larger than or equal to, thrs from the matrix and compacts the result. | |
magma_int_t | magma_dparilut_thrsrm_semilinked (magma_d_matrix *U, magma_d_matrix *UT, double *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than thrs from the matrix. | |
magma_int_t | magma_dparilut_rmselected (magma_d_matrix R, magma_d_matrix *A, magma_queue_t queue) |
Removes a selected list of elements from the matrix. | |
magma_int_t | magma_dparilut_selectoneperrow (magma_int_t order, magma_d_matrix *A, magma_d_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dparilut_selecttwoperrow (magma_int_t order, magma_d_matrix *A, magma_d_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dparilut_selectoneperrowthrs_lower (magma_d_matrix L, magma_d_matrix U, magma_d_matrix *A, double rtol, magma_d_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dparilut_selectoneperrowthrs_upper (magma_d_matrix L, magma_d_matrix U, magma_d_matrix *A, double rtol, magma_d_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dparilut_selectonepercol (magma_int_t order, magma_d_matrix *A, magma_d_matrix *oneA, magma_queue_t queue) |
magma_int_t | magma_dparilut_transpose_select_one (magma_d_matrix A, magma_d_matrix *B, magma_queue_t queue) |
This is a special routine with very limited scope. | |
magma_int_t | magma_dparilut_insert_LU (magma_int_t num_rm, magma_index_t *rm_loc, magma_index_t *rm_loc2, magma_d_matrix *LU_new, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
magma_int_t | magma_dparilut_set_thrs (magma_int_t num_rm, magma_d_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
magma_int_t | magma_dparilut_set_approx_thrs (magma_int_t num_rm, magma_d_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_dparilut_set_thrs_randomselect (magma_int_t num_rm, magma_d_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_dparilut_set_thrs_randomselect_approx (magma_int_t num_rm, magma_d_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_dparilut_set_thrs_randomselect_factors (magma_int_t num_rm, magma_d_matrix *L, magma_d_matrix *U, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_dparilut_set_exact_thrs (magma_int_t num_rm, magma_d_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_dparilut_set_approx_thrs_inc (magma_int_t num_rm, magma_d_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_dparilut_LU_approx_thrs (magma_int_t num_rm, magma_d_matrix *L, magma_d_matrix *U, magma_int_t order, double *thrs, magma_queue_t queue) |
magma_int_t | magma_dparilut_reorder (magma_d_matrix *LU, magma_queue_t queue) |
This routine reorders the matrix (in place) for easier access. | |
magma_int_t | magma_dparict_sweep (magma_d_matrix *A, magma_d_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_dparilut_zero (magma_d_matrix *A, magma_queue_t queue) |
magma_int_t | magma_dparilu_sweep (magma_d_matrix A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
This function does one asynchronous ParILU sweep. | |
magma_int_t | magma_dparilu_sweep_sync (magma_d_matrix A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_dparic_sweep (magma_d_matrix A, magma_d_matrix *L, magma_queue_t queue) |
This function does one asynchronous ParILU sweep (symmetric case). | |
magma_int_t | magma_dparic_sweep_sync (magma_d_matrix A, magma_d_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep (symmetric case). | |
magma_int_t | magma_dparict_sweep_sync (magma_d_matrix *A, magma_d_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_dparilut_sweep_sync (magma_d_matrix *A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_dparilut_sweep_gpu (magma_d_matrix *A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_dparilut_residuals_gpu (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *R, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_dthrsholdrm_gpu (magma_int_t order, magma_d_matrix *A, double *thrs, magma_queue_t queue) |
magma_int_t | magma_dget_row_ptr (const magma_int_t num_rows, magma_int_t *nnz, const magma_index_t *rowidx, magma_index_t *rowptr, magma_queue_t queue) |
magma_int_t | magma_dparilut_align_residuals (magma_d_matrix L, magma_d_matrix U, magma_d_matrix *Lnew, magma_d_matrix *Unew, magma_queue_t queue) |
This function scales the residuals of a lower triangular factor L with the diagonal of U. | |
magma_int_t | magma_dparilut_preselect_scale (magma_d_matrix *L, magma_d_matrix *oneL, magma_d_matrix *U, magma_d_matrix *oneU, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dparilut_thrsrm_U (magma_int_t order, magma_d_matrix L, magma_d_matrix *A, double *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than or equal to, or larger than or equal to, thrs from the matrix and compacts the result. | |
magma_int_t | magma_dparilut_residuals (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *L_new, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_dparilut_residuals_transpose (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_dparilut_residuals_semilinked (magma_d_matrix A, magma_d_matrix L, magma_d_matrix US, magma_d_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_dparilut_sweep_semilinked (magma_d_matrix *A, magma_d_matrix *L, magma_d_matrix *US, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_dparilut_sweep_list (magma_d_matrix *A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_dparilut_residuals_list (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_dparilut_sweep_linkedlist (magma_d_matrix *A, magma_d_matrix *L, magma_d_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_dparilut_residuals_linkedlist (magma_d_matrix A, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_dparilut_colmajor (magma_d_matrix A, magma_d_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_dparilut_colmajorup (magma_d_matrix A, magma_d_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_dparict (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete Cholesky preconditioner. | |
magma_int_t | magma_dparict_cpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_dparilut (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete LU preconditioner. | |
magma_int_t | magma_dparilut_cpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_dparilut_gpu (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_dparilut_gpu_nodp (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_dparilut_insert (magma_int_t *num_rmL, magma_int_t *num_rmU, magma_index_t *rm_locL, magma_index_t *rm_locU, magma_d_matrix *L_new, magma_d_matrix *U_new, magma_d_matrix *L, magma_d_matrix *U, magma_d_matrix *UR, magma_queue_t queue) |
Inserts a new element into the (empty) place for the iterative dynamic ILU. | |
magma_int_t | magma_dparilut_create_collinkedlist (magma_d_matrix A, magma_d_matrix *B, magma_queue_t queue) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements. | |
magma_int_t | magma_dparilut_candidates (magma_d_matrix L0, magma_d_matrix U0, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *L_new, magma_d_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU(1) fill-in. | |
magma_int_t | magma_dparilut_candidates_gpu (magma_d_matrix L0, magma_d_matrix U0, magma_d_matrix L, magma_d_matrix U, magma_d_matrix *L_new, magma_d_matrix *U_new, magma_queue_t queue) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors. | |
magma_int_t | magma_dparict_candidates (magma_d_matrix L0, magma_d_matrix L, magma_d_matrix LT, magma_d_matrix *L_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU(1) fill-in. | |
magma_int_t | magma_dparilut_candidates_semilinked (magma_d_matrix L0, magma_d_matrix U0, magma_d_matrix L, magma_d_matrix U, magma_d_matrix UT, magma_d_matrix *L_new, magma_d_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU(1) fill-in. | |
magma_int_t | magma_dparilut_candidates_linkedlist (magma_d_matrix L0, magma_d_matrix U0, magma_d_matrix L, magma_d_matrix U, magma_d_matrix UR, magma_d_matrix *L_new, magma_d_matrix *U_new, magma_queue_t queue) |
magma_int_t | magma_dparilut_rm_thrs (double *thrs, magma_int_t *num_rm, magma_d_matrix *LU, magma_d_matrix *LU_new, magma_index_t *rm_loc, magma_queue_t queue) |
This routine removes matrix entries from the structure that are smaller than the threshold. | |
magma_int_t | magma_dparilut_count (magma_d_matrix L, magma_int_t *num, magma_queue_t queue) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format. | |
magma_int_t | magma_dparilut_randlist (magma_d_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_dparilut_select_candidates_L (magma_int_t *num_rm, magma_index_t *rm_loc, magma_d_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_dparilut_select_candidates_U (magma_int_t *num_rm, magma_index_t *rm_loc, magma_d_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_dparilut_preselect (magma_int_t order, magma_d_matrix *A, magma_d_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dpreselect_gpu (magma_int_t order, magma_d_matrix *A, magma_d_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_dsampleselect (magma_int_t total_size, magma_int_t subset_size, double *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_dsampleselect_approx (magma_int_t total_size, magma_int_t subset_size, double *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_dsampleselect_nodp (magma_int_t total_size, magma_int_t subset_size, double *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_dsampleselect_approx_nodp (magma_int_t total_size, magma_int_t subset_size, double *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_dmprepare_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix LC, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue) |
Takes a sparse matrix and generates: an array containing the sizes of the different systems; an array containing the indices with the locations in the sparse matrix where the data comes from and goes back to; and an array containing all the sparse triangular systems. | |
magma_int_t | magma_dmtrisolve_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix LC, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_dmbackinsert_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix *M, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue) |
Inserts the values into the preconditioner matrix. | |
magma_int_t | magma_dmprepare_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix LC, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue) |
magma_int_t | magma_dmtrisolve_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix LC, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_dmbackinsert_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix *M, magma_index_t *sizes, magma_index_t *locations, double *trisystems, double *rhs, magma_queue_t queue) |
magma_int_t | magma_diluisaisetup_lower (magma_d_matrix L, magma_d_matrix S, magma_d_matrix *ISAIL, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_diluisaisetup_upper (magma_d_matrix U, magma_d_matrix S, magma_d_matrix *ISAIU, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_dicisaisetup (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_disai_l (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_disai_r (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_disai_l_t (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_disai_r_t (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_dmiluisai_sizecheck (magma_d_matrix A, magma_index_t batchsize, magma_index_t *maxsize, magma_queue_t queue) |
magma_int_t | magma_dgeisai_maxblock (magma_d_matrix L, magma_d_matrix *MT, magma_queue_t queue) |
This routine maximizes the pattern for the ISAI preconditioner. | |
magma_int_t | magma_disai_generator_regs (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_d_matrix L, magma_d_matrix *M, magma_queue_t queue) |
This routine is designed to combine all kernels into one. | |
magma_int_t | magma_dmsupernodal (magma_int_t *max_bs, magma_d_matrix A, magma_d_matrix *S, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with block-size bs. | |
magma_int_t | magma_dmvarsizeblockstruct (magma_int_t n, magma_int_t *bs, magma_int_t bsl, magma_uplo_t uplotype, magma_d_matrix *A, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with variable block-size. | |
magma_int_t | magma_dtfqmr_unrolled (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a real matrix. | |
magma_int_t | magma_dbicgstab_merge2 (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_dbicgstab_merge3 (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_djacobidomainoverlap (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a real symmetric positive definite N-by-N matrix. | |
magma_int_t | magma_dbaiter (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_d_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_dbaiter_overlap (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_d_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_dftjacobicontractions (magma_d_matrix xkm2, magma_d_matrix xkm1, magma_d_matrix xk, magma_d_matrix *z, magma_d_matrix *c, magma_queue_t queue) |
Computes the contraction coefficients c_i: | |
magma_int_t | magma_dftjacobiupdatecheck (double delta, magma_d_matrix *xold, magma_d_matrix *xnew, magma_d_matrix *zprev, magma_d_matrix c, magma_int_t *flag_t, magma_int_t *flag_fp, magma_queue_t queue) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper. | |
magma_int_t | magma_diterref (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_d_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a real symmetric positive definite N-by-N matrix. | |
magma_int_t | magma_djacobiiter_sys (magma_d_matrix A, magma_d_matrix b, magma_d_matrix d, magma_d_matrix t, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k. | |
magma_int_t | magma_dftjacobi (magma_d_matrix A, magma_d_matrix b, magma_d_matrix *x, magma_d_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k. | |
magma_int_t | magma_dilut_saad (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_dilut_saad_apply (magma_d_matrix b, magma_d_matrix *x, magma_d_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_dcustomilusetup (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete LU preconditioner. | |
magma_int_t | magma_dcustomicsetup (magma_d_matrix A, magma_d_matrix b, magma_d_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete Cholesky preconditioner. | |
magma_int_t | magma_dbajac_csr (magma_int_t localiters, magma_d_matrix D, magma_d_matrix R, magma_d_matrix b, magma_d_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. | |
magma_int_t | magma_dbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_d_matrix *D, magma_d_matrix *R, magma_d_matrix b, magma_d_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. | |
magma_int_t | magma_dmlumerge (magma_d_matrix L, magma_d_matrix U, magma_d_matrix *A, magma_queue_t queue) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts. | |
magma_int_t | magma_dgeaxpy (double alpha, magma_d_matrix X, double beta, magma_d_matrix *Y, magma_queue_t queue) |
This routine computes Y = alpha * X + beta * Y on the GPU. | |
magma_int_t | magma_dgecsrreimsplit (magma_d_matrix A, magma_d_matrix *ReA, magma_d_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in CSR format located on the GPU and splits it into two matrices, ReA and ImA, containing the real and imaginary contributions of A. | |
magma_int_t | magma_dgedensereimsplit (magma_d_matrix A, magma_d_matrix *ReA, magma_d_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in DENSE format located on the GPU and splits it into two matrices, ReA and ImA, containing the real and imaginary contributions of A. | |
magma_int_t | magma_dgecsr5mv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t p, double alpha, magma_int_t sigma, magma_int_t bit_y_offset, magma_int_t bit_scansum_offset, magma_int_t num_packet, magmaUIndex_ptr dtile_ptr, magmaUIndex_ptr dtile_desc, magmaIndex_ptr dtile_desc_offset_ptr, magmaIndex_ptr dtile_desc_offset, magmaDouble_ptr dcalibrator, magma_int_t tail_tile_start, magmaDouble_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_dcopyscale (magma_int_t n, magma_int_t k, magmaDouble_ptr dr, magmaDouble_ptr dv, magmaDouble_ptr dskp, magma_queue_t queue) |
Computes the correction term of the pipelined GMRES according to P. | |
magma_int_t | magma_dnrm2scale (magma_int_t m, magmaDouble_ptr dr, magma_int_t lddr, double *drnorm, magma_queue_t queue) |
magma_int_t | magma_djacobispmvupdate_bw (magma_int_t maxiter, magma_d_matrix A, magma_d_matrix t, magma_d_matrix b, magma_d_matrix d, magma_d_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_djacobispmvupdateselect (magma_int_t maxiter, magma_int_t num_updates, magma_index_t *indices, magma_d_matrix A, magma_d_matrix t, magma_d_matrix b, magma_d_matrix d, magma_d_matrix tmp, magma_d_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_dmergeblockkrylov (magma_int_t num_rows, magma_int_t num_cols, magmaDouble_ptr alpha, magmaDouble_ptr p, magmaDouble_ptr x, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_dbicgmerge1 (magma_int_t n, magmaDouble_ptr dskp, magmaDouble_ptr dv, magmaDouble_ptr dr, magmaDouble_ptr dp, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_dbicgmerge2 (magma_int_t n, magmaDouble_ptr dskp, magmaDouble_ptr dr, magmaDouble_ptr dv, magmaDouble_ptr ds, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_dbicgmerge3 (magma_int_t n, magmaDouble_ptr dskp, magmaDouble_ptr dp, magmaDouble_ptr ds, magmaDouble_ptr dt, magmaDouble_ptr dx, magmaDouble_ptr dr, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_dbicgmerge4 (magma_int_t type, magmaDouble_ptr dskp, magma_queue_t queue) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU. | |
magma_int_t | magma_dbicgmerge_spmv1 (magma_d_matrix A, magmaDouble_ptr dd1, magmaDouble_ptr dd2, magmaDouble_ptr dp, magmaDouble_ptr dr, magmaDouble_ptr dv, magmaDouble_ptr dskp, magma_queue_t queue) |
Merges the first SpmV using CSR with the dot product and the computation of alpha. | |
magma_int_t | magma_dbicgmerge_spmv2 (magma_d_matrix A, magmaDouble_ptr dd1, magmaDouble_ptr dd2, magmaDouble_ptr ds, magmaDouble_ptr dt, magmaDouble_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_dbicgmerge_xrbeta (magma_int_t n, magmaDouble_ptr dd1, magmaDouble_ptr dd2, magmaDouble_ptr drr, magmaDouble_ptr dr, magmaDouble_ptr dp, magmaDouble_ptr ds, magmaDouble_ptr dt, magmaDouble_ptr dx, magmaDouble_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_dbcsrswp (magma_int_t n, magma_int_t size_b, magma_int_t *ipiv, magmaDouble_ptr dx, magma_queue_t queue) |
magma_int_t | magma_dbcsrtrsv (magma_uplo_t uplo, magma_int_t r_blocks, magma_int_t c_blocks, magma_int_t size_b, magmaDouble_ptr dA, magma_index_t *blockinfo, magmaDouble_ptr dx, magma_queue_t queue) |
magma_int_t | magma_dbcsrvalcpy (magma_int_t size_b, magma_int_t num_blocks, magma_int_t num_zero_blocks, magmaDouble_ptr *dAval, magmaDouble_ptr *dBval, magmaDouble_ptr *dBval2, magma_queue_t queue) |
magma_int_t | magma_dbcsrluegemm (magma_int_t size_b, magma_int_t num_block_rows, magma_int_t kblocks, magmaDouble_ptr *dA, magmaDouble_ptr *dB, magmaDouble_ptr *dC, magma_queue_t queue) |
magma_int_t | magma_dbcsrlupivloc (magma_int_t size_b, magma_int_t kblocks, magmaDouble_ptr *dA, magma_int_t *ipiv, magma_queue_t queue) |
magma_int_t | magma_dbcsrblockinfo5 (magma_int_t lustep, magma_int_t num_blocks, magma_int_t c_blocks, magma_int_t size_b, magma_index_t *blockinfo, magmaDouble_ptr dval, magmaDouble_ptr *AII, magma_queue_t queue) |
magma_int_t | magma_dthrsholdselect (magma_int_t sampling, magma_int_t total_size, magma_int_t subset_size, double *val, double *thrs, magma_queue_t queue) |
magma_int_t | magma_vector_dlag2s (magma_d_matrix x, magma_s_matrix *y, magma_queue_t queue) |
magma_int_t | magma_sparse_matrix_dlag2s (magma_d_matrix A, magma_s_matrix *B, magma_queue_t queue) |
magma_int_t | magma_vector_slag2d (magma_s_matrix x, magma_d_matrix *y, magma_queue_t queue) |
magma_int_t | magma_sparse_matrix_slag2d (magma_s_matrix A, magma_d_matrix *B, magma_queue_t queue) |
void | magmablas_dlag2s_sparse (magma_int_t M, magma_int_t N, magmaDouble_const_ptr dA, magma_int_t lda, magmaFloat_ptr dSA, magma_int_t ldsa, magma_queue_t queue, magma_int_t *info) |
void | magmablas_slag2d_sparse (magma_int_t M, magma_int_t N, magmaFloat_const_ptr dSA, magma_int_t ldsa, magmaDouble_ptr dA, magma_int_t lda, magma_queue_t queue, magma_int_t *info) |
void | magma_dlag2s_CSR_DENSE (magma_d_matrix A, magma_s_matrix *B, magma_queue_t queue) |
void | magma_dlag2s_CSR_DENSE_alloc (magma_d_matrix A, magma_s_matrix *B, magma_queue_t queue) |
void | magma_dlag2s_CSR_DENSE_convert (magma_d_matrix A, magma_s_matrix *B, magma_queue_t queue) |
magma_int_t | magma_dsgecsrmv_mixed_prec (magma_trans_t transA, magma_int_t m, magma_int_t n, double alpha, magmaDouble_ptr ddiagval, magmaFloat_ptr doffdiagval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDouble_ptr dx, double beta, magmaDouble_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
int | mm_write_mtx_crd (char fname[], magma_index_t M, magma_index_t N, magma_index_t nz, magma_index_t I[], magma_index_t J[], double val[], MM_typecode matcode) |
int | mm_read_mtx_crd_data (FILE *f, magma_index_t M, magma_index_t N, magma_index_t nz, magma_index_t I[], magma_index_t J[], double val[], MM_typecode matcode) |
int | mm_read_mtx_crd_entry (FILE *f, magma_index_t *I, magma_index_t *J, double *real, double *img, MM_typecode matcode) |
int | mm_read_unsymmetric_sparse (const char *fname, magma_index_t *M_, magma_index_t *N_, magma_index_t *nz_, double **val_, magma_index_t **I_, magma_index_t **J_) |
magma_int_t | magma_sorderstatistics (float *val, magma_int_t length, magma_int_t k, magma_int_t r, float *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array. | |
magma_int_t | magma_sorderstatistics_inc (float *val, magma_int_t length, magma_int_t k, magma_int_t inc, magma_int_t r, float *element, magma_queue_t queue) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc. | |
magma_int_t | magma_smorderstatistics (float *val, magma_index_t *col, magma_index_t *row, magma_int_t length, magma_int_t k, magma_int_t r, float *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front. | |
magma_int_t | magma_spartition (float *a, magma_int_t size, magma_int_t pivot, magma_queue_t queue) |
magma_int_t | magma_smedian5 (float *a, magma_queue_t queue) |
magma_int_t | magma_sselect (float *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_sselectrandom (float *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_sdomainoverlap (magma_index_t num_rows, magma_int_t *num_indices, magma_index_t *rowptr, magma_index_t *colidx, magma_index_t *x, magma_queue_t queue) |
Generates the update list. | |
magma_int_t | magma_svspread (magma_s_matrix *x, const char *filename, magma_queue_t queue) |
Reads in a sparse vector-block stored in COO format. | |
magma_int_t | magma_sdiameter (magma_s_matrix *A, magma_queue_t queue) |
Computes the diameter of a sparse matrix and stores the value in diameter. | |
magma_int_t | magma_sparilusetup (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Prepares the ILU preconditioner via the iterative ILU iteration. | |
magma_int_t | magma_sparilu_gpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_sparilu_cpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_sparic_gpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_sparic_cpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_sparicsetup (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Prepares the IC preconditioner via the iterative IC iteration. | |
magma_int_t | magma_sparicupdate (magma_s_matrix A, magma_s_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_sapplyiteric_l (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
Performs the left triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_sapplyiteric_r (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
Performs the right triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_sparilu_csr (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_spariluupdate (magma_s_matrix A, magma_s_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_sparic_csr (magma_s_matrix A, magma_s_matrix A_CSR, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_snonlinres (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *LU, real_Double_t *res, magma_queue_t queue) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_silures (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_sicres (magma_s_matrix A, magma_s_matrix C, magma_s_matrix CT, magma_s_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_sinitguess (magma_s_matrix A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
Computes an initial guess for the ParILU/ParIC. | |
magma_int_t | magma_sinitrecursiveLU (magma_s_matrix A, magma_s_matrix *B, magma_queue_t queue) |
Using the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A containing the approximate factors (L and U as well), computes a matrix with one higher level of fill-in, inserts the original approximation as the initial guess, and provides the factors L and U, also filled with the scaled initial guess. | |
magma_int_t | magma_smLdiagadd (magma_s_matrix *L, magma_queue_t queue) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal. | |
magma_int_t | magma_smatrix_cup (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_smatrix_cup_gpu (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_smatrix_cap (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\). | |
magma_int_t | magma_smatrix_negcap (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of A but not of B. | |
magma_int_t | magma_smatrix_tril_negcap (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of tril(A) but not of B. | |
magma_int_t | magma_smatrix_triu_negcap (magma_s_matrix A, magma_s_matrix B, magma_s_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being part of triu(A) but not of B. | |
magma_int_t | magma_smatrix_abssum (magma_s_matrix A, float *sum, magma_queue_t queue) |
Computes the sum of the absolute values in a matrix. | |
magma_int_t | magma_sparilut_thrsrm (magma_int_t order, magma_s_matrix *A, float *thrs, magma_queue_t queue) |
Removes any element whose absolute value is less than or equal to, or greater than or equal to, thrs from the matrix and compacts the result. | |
magma_int_t | magma_sparilut_thrsrm_semilinked (magma_s_matrix *U, magma_s_matrix *UT, float *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than thrs from the matrix. | |
magma_int_t | magma_sparilut_rmselected (magma_s_matrix R, magma_s_matrix *A, magma_queue_t queue) |
Removes a selected list of elements from the matrix. | |
magma_int_t | magma_sparilut_selectoneperrow (magma_int_t order, magma_s_matrix *A, magma_s_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_sparilut_selecttwoperrow (magma_int_t order, magma_s_matrix *A, magma_s_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_sparilut_selectoneperrowthrs_lower (magma_s_matrix L, magma_s_matrix U, magma_s_matrix *A, float rtol, magma_s_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_sparilut_selectoneperrowthrs_upper (magma_s_matrix L, magma_s_matrix U, magma_s_matrix *A, float rtol, magma_s_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_sparilut_selectonepercol (magma_int_t order, magma_s_matrix *A, magma_s_matrix *oneA, magma_queue_t queue) |
magma_int_t | magma_sparilut_transpose_select_one (magma_s_matrix A, magma_s_matrix *B, magma_queue_t queue) |
This is a special routine with very limited scope. | |
magma_int_t | magma_sparilut_insert_LU (magma_int_t num_rm, magma_index_t *rm_loc, magma_index_t *rm_loc2, magma_s_matrix *LU_new, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
magma_int_t | magma_sparilut_set_thrs (magma_int_t num_rm, magma_s_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
magma_int_t | magma_sparilut_set_approx_thrs (magma_int_t num_rm, magma_s_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_sparilut_set_thrs_randomselect (magma_int_t num_rm, magma_s_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_sparilut_set_thrs_randomselect_approx (magma_int_t num_rm, magma_s_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_sparilut_set_thrs_randomselect_factors (magma_int_t num_rm, magma_s_matrix *L, magma_s_matrix *U, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_sparilut_set_exact_thrs (magma_int_t num_rm, magma_s_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_sparilut_set_approx_thrs_inc (magma_int_t num_rm, magma_s_matrix *LU, magma_int_t order, float *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_sparilut_LU_approx_thrs (magma_int_t num_rm, magma_s_matrix *L, magma_s_matrix *U, magma_int_t order, float *thrs, magma_queue_t queue) |
magma_int_t | magma_sparilut_reorder (magma_s_matrix *LU, magma_queue_t queue) |
This routine reorders the matrix (inplace) for easier access. | |
magma_int_t | magma_sparict_sweep (magma_s_matrix *A, magma_s_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_sparilut_zero (magma_s_matrix *A, magma_queue_t queue) |
magma_int_t | magma_sparilu_sweep (magma_s_matrix A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
This function does one asynchronous ParILU sweep. | |
magma_int_t | magma_sparilu_sweep_sync (magma_s_matrix A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_sparic_sweep (magma_s_matrix A, magma_s_matrix *L, magma_queue_t queue) |
This function does one asynchronous ParILU sweep (symmetric case). | |
magma_int_t | magma_sparic_sweep_sync (magma_s_matrix A, magma_s_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep (symmetric case). | |
magma_int_t | magma_sparict_sweep_sync (magma_s_matrix *A, magma_s_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_sparilut_sweep_sync (magma_s_matrix *A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_sparilut_sweep_gpu (magma_s_matrix *A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_sparilut_residuals_gpu (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *R, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_sthrsholdrm_gpu (magma_int_t order, magma_s_matrix *A, float *thrs, magma_queue_t queue) |
magma_int_t | magma_sget_row_ptr (const magma_int_t num_rows, magma_int_t *nnz, const magma_index_t *rowidx, magma_index_t *rowptr, magma_queue_t queue) |
magma_int_t | magma_sparilut_align_residuals (magma_s_matrix L, magma_s_matrix U, magma_s_matrix *Lnew, magma_s_matrix *Unew, magma_queue_t queue) |
This function scales the residuals of a lower triangular factor L with the diagonal of U. | |
magma_int_t | magma_sparilut_preselect_scale (magma_s_matrix *L, magma_s_matrix *oneL, magma_s_matrix *U, magma_s_matrix *oneU, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_sparilut_thrsrm_U (magma_int_t order, magma_s_matrix L, magma_s_matrix *A, float *thrs, magma_queue_t queue) |
Removes any element whose absolute value is less than or equal to, or greater than or equal to, thrs from the matrix and compacts the result. | |
magma_int_t | magma_sparilut_residuals (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *L_new, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_sparilut_residuals_transpose (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_sparilut_residuals_semilinked (magma_s_matrix A, magma_s_matrix L, magma_s_matrix US, magma_s_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_sparilut_sweep_semilinked (magma_s_matrix *A, magma_s_matrix *L, magma_s_matrix *US, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_sparilut_sweep_list (magma_s_matrix *A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_sparilut_residuals_list (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_sparilut_sweep_linkedlist (magma_s_matrix *A, magma_s_matrix *L, magma_s_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_sparilut_residuals_linkedlist (magma_s_matrix A, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_sparilut_colmajor (magma_s_matrix A, magma_s_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_sparilut_colmajorup (magma_s_matrix A, magma_s_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_sparict (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete Cholesky preconditioner. | |
magma_int_t | magma_sparict_cpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_sparilut (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete LU preconditioner. | |
magma_int_t | magma_sparilut_cpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_sparilut_gpu (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_sparilut_gpu_nodp (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_sparilut_insert (magma_int_t *num_rmL, magma_int_t *num_rmU, magma_index_t *rm_locL, magma_index_t *rm_locU, magma_s_matrix *L_new, magma_s_matrix *U_new, magma_s_matrix *L, magma_s_matrix *U, magma_s_matrix *UR, magma_queue_t queue) |
Inserts a new element in the (empty) place for the iterative dynamic ILU. | |
magma_int_t | magma_sparilut_create_collinkedlist (magma_s_matrix A, magma_s_matrix *B, magma_queue_t queue) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements. | |
magma_int_t | magma_sparilut_candidates (magma_s_matrix L0, magma_s_matrix U0, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *L_new, magma_s_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_sparilut_candidates_gpu (magma_s_matrix L0, magma_s_matrix U0, magma_s_matrix L, magma_s_matrix U, magma_s_matrix *L_new, magma_s_matrix *U_new, magma_queue_t queue) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors. | |
magma_int_t | magma_sparict_candidates (magma_s_matrix L0, magma_s_matrix L, magma_s_matrix LT, magma_s_matrix *L_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_sparilut_candidates_semilinked (magma_s_matrix L0, magma_s_matrix U0, magma_s_matrix L, magma_s_matrix U, magma_s_matrix UT, magma_s_matrix *L_new, magma_s_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_sparilut_candidates_linkedlist (magma_s_matrix L0, magma_s_matrix U0, magma_s_matrix L, magma_s_matrix U, magma_s_matrix UR, magma_s_matrix *L_new, magma_s_matrix *U_new, magma_queue_t queue) |
magma_int_t | magma_sparilut_rm_thrs (float *thrs, magma_int_t *num_rm, magma_s_matrix *LU, magma_s_matrix *LU_new, magma_index_t *rm_loc, magma_queue_t queue) |
This routine removes matrix entries from the structure that are smaller than the threshold. | |
magma_int_t | magma_sparilut_count (magma_s_matrix L, magma_int_t *num, magma_queue_t queue) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format. | |
magma_int_t | magma_sparilut_randlist (magma_s_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_sparilut_select_candidates_L (magma_int_t *num_rm, magma_index_t *rm_loc, magma_s_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_sparilut_select_candidates_U (magma_int_t *num_rm, magma_index_t *rm_loc, magma_s_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_sparilut_preselect (magma_int_t order, magma_s_matrix *A, magma_s_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_spreselect_gpu (magma_int_t order, magma_s_matrix *A, magma_s_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_ssampleselect (magma_int_t total_size, magma_int_t subset_size, float *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_ssampleselect_approx (magma_int_t total_size, magma_int_t subset_size, float *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_ssampleselect_nodp (magma_int_t total_size, magma_int_t subset_size, float *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_ssampleselect_approx_nodp (magma_int_t total_size, magma_int_t subset_size, float *val, float *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_smprepare_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix L, magma_s_matrix LC, magma_index_t *sizes, magma_index_t *locations, float *trisystems, float *rhs, magma_queue_t queue) |
Takes a sparse matrix and generates: an array containing the sizes of the different systems; an array containing the indices with the locations in the sparse matrix where the data comes from and goes back to; and an array containing all the sparse triangular systems. | |
magma_int_t | magma_smtrisolve_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix L, magma_s_matrix LC, magma_index_t *sizes, magma_index_t *locations, float *trisystems, float *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_smbackinsert_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix *M, magma_index_t *sizes, magma_index_t *locations, float *trisystems, float *rhs, magma_queue_t queue) |
Inserts the values into the preconditioner matrix. | |
magma_int_t | magma_smprepare_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix L, magma_s_matrix LC, magma_index_t *sizes, magma_index_t *locations, float *trisystems, float *rhs, magma_queue_t queue) |
magma_int_t | magma_smtrisolve_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix L, magma_s_matrix LC, magma_index_t *sizes, magma_index_t *locations, float *trisystems, float *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_smbackinsert_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix *M, magma_index_t *sizes, magma_index_t *locations, float *trisystems, float *rhs, magma_queue_t queue) |
magma_int_t | magma_siluisaisetup_lower (magma_s_matrix L, magma_s_matrix S, magma_s_matrix *ISAIL, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_siluisaisetup_upper (magma_s_matrix U, magma_s_matrix S, magma_s_matrix *ISAIU, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_sicisaisetup (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_sisai_l (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_sisai_r (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_sisai_l_t (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_sisai_r_t (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_smiluisai_sizecheck (magma_s_matrix A, magma_index_t batchsize, magma_index_t *maxsize, magma_queue_t queue) |
magma_int_t | magma_sgeisai_maxblock (magma_s_matrix L, magma_s_matrix *MT, magma_queue_t queue) |
This routine maximizes the pattern for the ISAI preconditioner. | |
magma_int_t | magma_sisai_generator_regs (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_s_matrix L, magma_s_matrix *M, magma_queue_t queue) |
This routine is designed to combine all kernels into one. | |
magma_int_t | magma_smsupernodal (magma_int_t *max_bs, magma_s_matrix A, magma_s_matrix *S, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with block-size bs. | |
magma_int_t | magma_smvarsizeblockstruct (magma_int_t n, magma_int_t *bs, magma_int_t bsl, magma_uplo_t uplotype, magma_s_matrix *A, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with variable block-size. | |
magma_int_t | magma_stfqmr_unrolled (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a real matrix. | |
magma_int_t | magma_sbicgstab_merge2 (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_sbicgstab_merge3 (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_sjacobidomainoverlap (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a real symmetric N-by-N positive definite matrix. | |
magma_int_t | magma_sbaiter (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_s_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_sbaiter_overlap (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_s_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_sftjacobicontractions (magma_s_matrix xkm2, magma_s_matrix xkm1, magma_s_matrix xk, magma_s_matrix *z, magma_s_matrix *c, magma_queue_t queue) |
Computes the contraction coefficients c_i: | |
magma_int_t | magma_sftjacobiupdatecheck (float delta, magma_s_matrix *xold, magma_s_matrix *xnew, magma_s_matrix *zprev, magma_s_matrix c, magma_int_t *flag_t, magma_int_t *flag_fp, magma_queue_t queue) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper. | |
magma_int_t | magma_siterref (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_s_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a real symmetric N-by-N positive definite matrix. | |
magma_int_t | magma_sjacobiiter_sys (magma_s_matrix A, magma_s_matrix b, magma_s_matrix d, magma_s_matrix t, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k. | |
magma_int_t | magma_sftjacobi (magma_s_matrix A, magma_s_matrix b, magma_s_matrix *x, magma_s_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k. | |
magma_int_t | magma_silut_saad (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_silut_saad_apply (magma_s_matrix b, magma_s_matrix *x, magma_s_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_scustomilusetup (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete LU preconditioner. | |
magma_int_t | magma_scustomicsetup (magma_s_matrix A, magma_s_matrix b, magma_s_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete Cholesky preconditioner. | |
magma_int_t | magma_sbajac_csr (magma_int_t localiters, magma_s_matrix D, magma_s_matrix R, magma_s_matrix b, magma_s_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. | |
magma_int_t | magma_sbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_s_matrix *D, magma_s_matrix *R, magma_s_matrix b, magma_s_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. | |
magma_int_t | magma_smlumerge (magma_s_matrix L, magma_s_matrix U, magma_s_matrix *A, magma_queue_t queue) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts. | |
magma_int_t | magma_sgeaxpy (float alpha, magma_s_matrix X, float beta, magma_s_matrix *Y, magma_queue_t queue) |
This routine computes Y = alpha * X + beta * Y on the GPU. | |
magma_int_t | magma_sgecsrreimsplit (magma_s_matrix A, magma_s_matrix *ReA, magma_s_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in CSR format located on the GPU and splits it into two matrices, ReA and ImA, containing the real and the imaginary contributions of A. | |
magma_int_t | magma_sgedensereimsplit (magma_s_matrix A, magma_s_matrix *ReA, magma_s_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in DENSE format located on the GPU and splits it into two matrices, ReA and ImA, containing the real and the imaginary contributions of A. | |
magma_int_t | magma_sgecsr5mv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t p, float alpha, magma_int_t sigma, magma_int_t bit_y_offset, magma_int_t bit_scansum_offset, magma_int_t num_packet, magmaUIndex_ptr dtile_ptr, magmaUIndex_ptr dtile_desc, magmaIndex_ptr dtile_desc_offset_ptr, magmaIndex_ptr dtile_desc_offset, magmaFloat_ptr dcalibrator, magma_int_t tail_tile_start, magmaFloat_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaFloat_ptr dx, float beta, magmaFloat_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_scopyscale (magma_int_t n, magma_int_t k, magmaFloat_ptr dr, magmaFloat_ptr dv, magmaFloat_ptr dskp, magma_queue_t queue) |
Computes the correction term of the pipelined GMRES according to P. | |
magma_int_t | magma_snrm2scale (magma_int_t m, magmaFloat_ptr dr, magma_int_t lddr, float *drnorm, magma_queue_t queue) |
magma_int_t | magma_sjacobispmvupdate_bw (magma_int_t maxiter, magma_s_matrix A, magma_s_matrix t, magma_s_matrix b, magma_s_matrix d, magma_s_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_sjacobispmvupdateselect (magma_int_t maxiter, magma_int_t num_updates, magma_index_t *indices, magma_s_matrix A, magma_s_matrix t, magma_s_matrix b, magma_s_matrix d, magma_s_matrix tmp, magma_s_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_smergeblockkrylov (magma_int_t num_rows, magma_int_t num_cols, magmaFloat_ptr alpha, magmaFloat_ptr p, magmaFloat_ptr x, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_sbicgmerge1 (magma_int_t n, magmaFloat_ptr dskp, magmaFloat_ptr dv, magmaFloat_ptr dr, magmaFloat_ptr dp, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_sbicgmerge2 (magma_int_t n, magmaFloat_ptr dskp, magmaFloat_ptr dr, magmaFloat_ptr dv, magmaFloat_ptr ds, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_sbicgmerge3 (magma_int_t n, magmaFloat_ptr dskp, magmaFloat_ptr dp, magmaFloat_ptr ds, magmaFloat_ptr dt, magmaFloat_ptr dx, magmaFloat_ptr dr, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_sbicgmerge4 (magma_int_t type, magmaFloat_ptr dskp, magma_queue_t queue) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU. | |
magma_int_t | magma_sbicgmerge_spmv1 (magma_s_matrix A, magmaFloat_ptr dd1, magmaFloat_ptr dd2, magmaFloat_ptr dp, magmaFloat_ptr dr, magmaFloat_ptr dv, magmaFloat_ptr dskp, magma_queue_t queue) |
Merges the first SpmV using CSR with the dot product and the computation of alpha. | |
magma_int_t | magma_sbicgmerge_spmv2 (magma_s_matrix A, magmaFloat_ptr dd1, magmaFloat_ptr dd2, magmaFloat_ptr ds, magmaFloat_ptr dt, magmaFloat_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_sbicgmerge_xrbeta (magma_int_t n, magmaFloat_ptr dd1, magmaFloat_ptr dd2, magmaFloat_ptr drr, magmaFloat_ptr dr, magmaFloat_ptr dp, magmaFloat_ptr ds, magmaFloat_ptr dt, magmaFloat_ptr dx, magmaFloat_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_sbcsrswp (magma_int_t n, magma_int_t size_b, magma_int_t *ipiv, magmaFloat_ptr dx, magma_queue_t queue) |
magma_int_t | magma_sbcsrtrsv (magma_uplo_t uplo, magma_int_t r_blocks, magma_int_t c_blocks, magma_int_t size_b, magmaFloat_ptr dA, magma_index_t *blockinfo, magmaFloat_ptr dx, magma_queue_t queue) |
magma_int_t | magma_sbcsrvalcpy (magma_int_t size_b, magma_int_t num_blocks, magma_int_t num_zero_blocks, magmaFloat_ptr *dAval, magmaFloat_ptr *dBval, magmaFloat_ptr *dBval2, magma_queue_t queue) |
magma_int_t | magma_sbcsrluegemm (magma_int_t size_b, magma_int_t num_block_rows, magma_int_t kblocks, magmaFloat_ptr *dA, magmaFloat_ptr *dB, magmaFloat_ptr *dC, magma_queue_t queue) |
magma_int_t | magma_sbcsrlupivloc (magma_int_t size_b, magma_int_t kblocks, magmaFloat_ptr *dA, magma_int_t *ipiv, magma_queue_t queue) |
magma_int_t | magma_sbcsrblockinfo5 (magma_int_t lustep, magma_int_t num_blocks, magma_int_t c_blocks, magma_int_t size_b, magma_index_t *blockinfo, magmaFloat_ptr dval, magmaFloat_ptr *AII, magma_queue_t queue) |
magma_int_t | magma_sthrsholdselect (magma_int_t sampling, magma_int_t total_size, magma_int_t subset_size, float *val, float *thrs, magma_queue_t queue) |
magma_int_t | magma_zorderstatistics (magmaDoubleComplex *val, magma_int_t length, magma_int_t k, magma_int_t r, magmaDoubleComplex *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array. | |
magma_int_t | magma_zorderstatistics_inc (magmaDoubleComplex *val, magma_int_t length, magma_int_t k, magma_int_t inc, magma_int_t r, magmaDoubleComplex *element, magma_queue_t queue) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc. | |
magma_int_t | magma_zmorderstatistics (magmaDoubleComplex *val, magma_index_t *col, magma_index_t *row, magma_int_t length, magma_int_t k, magma_int_t r, magmaDoubleComplex *element, magma_queue_t queue) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front. | |
magma_int_t | magma_zpartition (magmaDoubleComplex *a, magma_int_t size, magma_int_t pivot, magma_queue_t queue) |
magma_int_t | magma_zmedian5 (magmaDoubleComplex *a, magma_queue_t queue) |
magma_int_t | magma_zselect (magmaDoubleComplex *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_zselectrandom (magmaDoubleComplex *a, magma_int_t size, magma_int_t k, magma_queue_t queue) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm. | |
magma_int_t | magma_zdomainoverlap (magma_index_t num_rows, magma_int_t *num_indices, magma_index_t *rowptr, magma_index_t *colidx, magma_index_t *x, magma_queue_t queue) |
Generates the update list. | |
magma_int_t | magma_zvspread (magma_z_matrix *x, const char *filename, magma_queue_t queue) |
Reads in a sparse vector-block stored in COO format. | |
magma_int_t | magma_zdiameter (magma_z_matrix *A, magma_queue_t queue) |
Computes the diameter of a sparse matrix and stores the value in diameter. | |
magma_int_t | magma_zparilusetup (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Prepares the ILU preconditioner via the iterative ILU iteration. | |
magma_int_t | magma_zparilu_gpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_zparilu_cpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an ILU(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_zparic_gpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_zparic_cpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an IC(0) preconditioner via fixed-point iterations. | |
magma_int_t | magma_zparicsetup (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Prepares the IC preconditioner via the iterative IC iteration. | |
magma_int_t | magma_zparicupdate (magma_z_matrix A, magma_z_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_zapplyiteric_l (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
Performs the left triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_zapplyiteric_r (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
Performs the right triangular solves using the IC preconditioner via Jacobi. | |
magma_int_t | magma_zparilu_csr (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_zpariluupdate (magma_z_matrix A, magma_z_preconditioner *precond, magma_int_t updates, magma_queue_t queue) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG). | |
magma_int_t | magma_zparic_csr (magma_z_matrix A, magma_z_matrix A_CSR, magma_queue_t queue) |
This routine iteratively computes an incomplete LU factorization. | |
magma_int_t | magma_znonlinres (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *LU, real_Double_t *res, magma_queue_t queue) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_zilures (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_zicres (magma_z_matrix A, magma_z_matrix C, magma_z_matrix CT, magma_z_matrix *LU, real_Double_t *res, real_Double_t *nonlinres, magma_queue_t queue) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference. | |
magma_int_t | magma_zinitguess (magma_z_matrix A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
Computes an initial guess for the ParILU/ParIC. | |
magma_int_t | magma_zinitrecursiveLU (magma_z_matrix A, magma_z_matrix *B, magma_queue_t queue) |
Using the iterative approach of computing ILU factorizations with increasing fill-in, it takes the input matrix A containing the approximate factors (L and U as well), computes a matrix with one higher level of fill-in, inserts the original approximation as initial guess, and provides the factors L and U also filled with the scaled initial guess. | |
magma_int_t | magma_zmLdiagadd (magma_z_matrix *L, magma_queue_t queue) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal. | |
magma_int_t | magma_zmatrix_cup (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_zmatrix_cup_gpu (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *U, magma_queue_t queue) |
Generates a matrix \(U = A \cup B\). | |
magma_int_t | magma_zmatrix_cap (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\). | |
magma_int_t | magma_zmatrix_negcap (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of A but not of B. | |
magma_int_t | magma_zmatrix_tril_negcap (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *U, magma_queue_t queue) |
Generates a list of matrix entries being part of tril(A) but not of B. | |
magma_int_t | magma_zmatrix_triu_negcap (magma_z_matrix A, magma_z_matrix B, magma_z_matrix *U, magma_queue_t queue) |
Generates a matrix with entries being part of triu(A) but not of B. | |
magma_int_t | magma_zmatrix_abssum (magma_z_matrix A, double *sum, magma_queue_t queue) |
Computes the sum of the absolute values in a matrix. | |
magma_int_t | magma_zparilut_thrsrm (magma_int_t order, magma_z_matrix *A, double *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than or equal to, or larger than or equal to, thrs from the matrix and compacts the result. | |
magma_int_t | magma_zparilut_thrsrm_semilinked (magma_z_matrix *U, magma_z_matrix *UT, double *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than thrs from the matrix. | |
magma_int_t | magma_zparilut_rmselected (magma_z_matrix R, magma_z_matrix *A, magma_queue_t queue) |
Removes a selected list of elements from the matrix. | |
magma_int_t | magma_zparilut_selectoneperrow (magma_int_t order, magma_z_matrix *A, magma_z_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zparilut_selecttwoperrow (magma_int_t order, magma_z_matrix *A, magma_z_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zparilut_selectoneperrowthrs_lower (magma_z_matrix L, magma_z_matrix U, magma_z_matrix *A, double rtol, magma_z_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zparilut_selectoneperrowthrs_upper (magma_z_matrix L, magma_z_matrix U, magma_z_matrix *A, double rtol, magma_z_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zparilut_selectonepercol (magma_int_t order, magma_z_matrix *A, magma_z_matrix *oneA, magma_queue_t queue) |
magma_int_t | magma_zparilut_transpose_select_one (magma_z_matrix A, magma_z_matrix *B, magma_queue_t queue) |
This is a special routine with very limited scope. | |
magma_int_t | magma_zparilut_insert_LU (magma_int_t num_rm, magma_index_t *rm_loc, magma_index_t *rm_loc2, magma_z_matrix *LU_new, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
magma_int_t | magma_zparilut_set_thrs (magma_int_t num_rm, magma_z_matrix *LU, magma_int_t order, magmaDoubleComplex *thrs, magma_queue_t queue) |
magma_int_t | magma_zparilut_set_approx_thrs (magma_int_t num_rm, magma_z_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_zparilut_set_thrs_randomselect (magma_int_t num_rm, magma_z_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_zparilut_set_thrs_randomselect_approx (magma_int_t num_rm, magma_z_matrix *LU, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_zparilut_set_thrs_randomselect_factors (magma_int_t num_rm, magma_z_matrix *L, magma_z_matrix *U, magma_int_t order, double *thrs, magma_queue_t queue) |
This routine approximates the threshold for removing num_rm elements. | |
magma_int_t | magma_zparilut_set_exact_thrs (magma_int_t num_rm, magma_z_matrix *LU, magma_int_t order, magmaDoubleComplex *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_zparilut_set_approx_thrs_inc (magma_int_t num_rm, magma_z_matrix *LU, magma_int_t order, magmaDoubleComplex *thrs, magma_queue_t queue) |
This routine provides the exact threshold for removing num_rm elements. | |
magma_int_t | magma_zparilut_LU_approx_thrs (magma_int_t num_rm, magma_z_matrix *L, magma_z_matrix *U, magma_int_t order, magmaDoubleComplex *thrs, magma_queue_t queue) |
magma_int_t | magma_zparilut_reorder (magma_z_matrix *LU, magma_queue_t queue) |
This routine reorders the matrix (inplace) for easier access. | |
magma_int_t | magma_zparict_sweep (magma_z_matrix *A, magma_z_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_zparilut_zero (magma_z_matrix *A, magma_queue_t queue) |
magma_int_t | magma_zparilu_sweep (magma_z_matrix A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
This function does one asynchronous ParILU sweep. | |
magma_int_t | magma_zparilu_sweep_sync (magma_z_matrix A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_zparic_sweep (magma_z_matrix A, magma_z_matrix *L, magma_queue_t queue) |
This function does one asynchronous ParILU sweep (symmetric case). | |
magma_int_t | magma_zparic_sweep_sync (magma_z_matrix A, magma_z_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep (symmetric case). | |
magma_int_t | magma_zparict_sweep_sync (magma_z_matrix *A, magma_z_matrix *L, magma_queue_t queue) |
This function does one synchronized ParILU sweep. | |
magma_int_t | magma_zparilut_sweep_sync (magma_z_matrix *A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_zparilut_sweep_gpu (magma_z_matrix *A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
This function does a ParILUT sweep. | |
magma_int_t | magma_zparilut_residuals_gpu (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *R, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_zthrsholdrm_gpu (magma_int_t order, magma_z_matrix *A, double *thrs, magma_queue_t queue) |
magma_int_t | magma_zget_row_ptr (const magma_int_t num_rows, magma_int_t *nnz, const magma_index_t *rowidx, magma_index_t *rowptr, magma_queue_t queue) |
magma_int_t | magma_zparilut_align_residuals (magma_z_matrix L, magma_z_matrix U, magma_z_matrix *Lnew, magma_z_matrix *Unew, magma_queue_t queue) |
This function scales the residuals of a lower triangular factor L with the diagonal of U. | |
magma_int_t | magma_zparilut_preselect_scale (magma_z_matrix *L, magma_z_matrix *oneL, magma_z_matrix *U, magma_z_matrix *oneU, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zparilut_thrsrm_U (magma_int_t order, magma_z_matrix L, magma_z_matrix *A, double *thrs, magma_queue_t queue) |
Removes any element with absolute value smaller than or equal to, or larger than or equal to, thrs from the matrix and compacts the result. | |
magma_int_t | magma_zparilut_residuals (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *L_new, magma_queue_t queue) |
This function computes the ILU residual in the locations included in the sparsity pattern of R. | |
magma_int_t | magma_zparilut_residuals_transpose (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_zparilut_residuals_semilinked (magma_z_matrix A, magma_z_matrix L, magma_z_matrix US, magma_z_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_zparilut_sweep_semilinked (magma_z_matrix *A, magma_z_matrix *L, magma_z_matrix *US, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_zparilut_sweep_list (magma_z_matrix *A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_zparilut_residuals_list (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_zparilut_sweep_linkedlist (magma_z_matrix *A, magma_z_matrix *L, magma_z_matrix *U, magma_queue_t queue) |
This function does a ParILU sweep. | |
magma_int_t | magma_zparilut_residuals_linkedlist (magma_z_matrix A, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *L_new, magma_queue_t queue) |
This function computes the residuals. | |
magma_int_t | magma_zparilut_colmajor (magma_z_matrix A, magma_z_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_zparilut_colmajorup (magma_z_matrix A, magma_z_matrix *AC, magma_queue_t queue) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix. | |
magma_int_t | magma_zparict (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete Cholesky preconditioner. | |
magma_int_t | magma_zparict_cpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_zparilut (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Prepares the iterative threshold Incomplete LU preconditioner. | |
magma_int_t | magma_zparilut_cpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_zparilut_gpu (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_zparilut_gpu_nodp (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm. | |
magma_int_t | magma_zparilut_insert (magma_int_t *num_rmL, magma_int_t *num_rmU, magma_index_t *rm_locL, magma_index_t *rm_locU, magma_z_matrix *L_new, magma_z_matrix *U_new, magma_z_matrix *L, magma_z_matrix *U, magma_z_matrix *UR, magma_queue_t queue) |
Inserts a new element in the (empty) place for the iterative dynamic ILU. | |
magma_int_t | magma_zparilut_create_collinkedlist (magma_z_matrix A, magma_z_matrix *B, magma_queue_t queue) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements. | |
magma_int_t | magma_zparilut_candidates (magma_z_matrix L0, magma_z_matrix U0, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *L_new, magma_z_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_zparilut_candidates_gpu (magma_z_matrix L0, magma_z_matrix U0, magma_z_matrix L, magma_z_matrix U, magma_z_matrix *L_new, magma_z_matrix *U_new, magma_queue_t queue) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors. | |
magma_int_t | magma_zparict_candidates (magma_z_matrix L0, magma_z_matrix L, magma_z_matrix LT, magma_z_matrix *L_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_zparilut_candidates_semilinked (magma_z_matrix L0, magma_z_matrix U0, magma_z_matrix L, magma_z_matrix U, magma_z_matrix UT, magma_z_matrix *L_new, magma_z_matrix *U_new, magma_queue_t queue) |
This function identifies the candidates as they appear as ILU1 fill-in. | |
magma_int_t | magma_zparilut_candidates_linkedlist (magma_z_matrix L0, magma_z_matrix U0, magma_z_matrix L, magma_z_matrix U, magma_z_matrix UR, magma_z_matrix *L_new, magma_z_matrix *U_new, magma_queue_t queue) |
magma_int_t | magma_zparilut_rm_thrs (double *thrs, magma_int_t *num_rm, magma_z_matrix *LU, magma_z_matrix *LU_new, magma_index_t *rm_loc, magma_queue_t queue) |
This routine removes matrix entries from the structure that are smaller than the threshold. | |
magma_int_t | magma_zparilut_count (magma_z_matrix L, magma_int_t *num, magma_queue_t queue) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format. | |
magma_int_t | magma_zparilut_randlist (magma_z_matrix *LU, magma_queue_t queue) |
magma_int_t | magma_zparilut_select_candidates_L (magma_int_t *num_rm, magma_index_t *rm_loc, magma_z_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_zparilut_select_candidates_U (magma_int_t *num_rm, magma_index_t *rm_loc, magma_z_matrix *L_new, magma_queue_t queue) |
Screens the new candidates for multiple elements in the same row. | |
magma_int_t | magma_zparilut_preselect (magma_int_t order, magma_z_matrix *A, magma_z_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zpreselect_gpu (magma_int_t order, magma_z_matrix *A, magma_z_matrix *oneA, magma_queue_t queue) |
This function takes a list of candidates with residuals, and selects the largest in every row. | |
magma_int_t | magma_zsampleselect (magma_int_t total_size, magma_int_t subset_size, magmaDoubleComplex *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_zsampleselect_approx (magma_int_t total_size, magma_int_t subset_size, magmaDoubleComplex *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_zsampleselect_nodp (magma_int_t total_size, magma_int_t subset_size, magmaDoubleComplex *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_zsampleselect_approx_nodp (magma_int_t total_size, magma_int_t subset_size, magmaDoubleComplex *val, double *thrs, magma_ptr *tmp_ptr, magma_int_t *tmp_size, magma_queue_t queue) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_zmprepare_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix L, magma_z_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaDoubleComplex *trisystems, magmaDoubleComplex *rhs, magma_queue_t queue) |
Takes a sparse matrix and generates an array containing the sizes of the different systems, an array containing the indices with the locations in the sparse matrix where the data comes from and goes back to, and an array containing all the sparse triangular systems. | |
magma_int_t | magma_zmtrisolve_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix L, magma_z_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaDoubleComplex *trisystems, magmaDoubleComplex *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_zmbackinsert_batched (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix *M, magma_index_t *sizes, magma_index_t *locations, magmaDoubleComplex *trisystems, magmaDoubleComplex *rhs, magma_queue_t queue) |
Inserts the values into the preconditioner matrix. | |
magma_int_t | magma_zmprepare_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix L, magma_z_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaDoubleComplex *trisystems, magmaDoubleComplex *rhs, magma_queue_t queue) |
This routine prepares the batch of small triangular systems that need to be solved for computing the ISAI preconditioner. | |
magma_int_t | magma_zmtrisolve_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix L, magma_z_matrix LC, magma_index_t *sizes, magma_index_t *locations, magmaDoubleComplex *trisystems, magmaDoubleComplex *rhs, magma_queue_t queue) |
Does all triangular solves. | |
magma_int_t | magma_zmbackinsert_batched_gpu (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix *M, magma_index_t *sizes, magma_index_t *locations, magmaDoubleComplex *trisystems, magmaDoubleComplex *rhs, magma_queue_t queue) |
Inserts the values into the preconditioner matrix. | |
magma_int_t | magma_ziluisaisetup_lower (magma_z_matrix L, magma_z_matrix S, magma_z_matrix *ISAIL, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_ziluisaisetup_upper (magma_z_matrix U, magma_z_matrix S, magma_z_matrix *ISAIU, magma_queue_t queue) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_zicisaisetup (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Prepares Incomplete Cholesky preconditioner using a sparse approximate inverse instead of sparse triangular solves. | |
magma_int_t | magma_zisai_l (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_zisai_r (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_zisai_l_t (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
Left-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_zisai_r_t (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
Right-hand-side application of ISAI preconditioner. | |
magma_int_t | magma_zmiluisai_sizecheck (magma_z_matrix A, magma_index_t batchsize, magma_index_t *maxsize, magma_queue_t queue) |
magma_int_t | magma_zgeisai_maxblock (magma_z_matrix L, magma_z_matrix *MT, magma_queue_t queue) |
This routine maximizes the pattern for the ISAI preconditioner. | |
magma_int_t | magma_zisai_generator_regs (magma_uplo_t uplotype, magma_trans_t transtype, magma_diag_t diagtype, magma_z_matrix L, magma_z_matrix *M, magma_queue_t queue) |
This routine is designed to combine all kernels into one. | |
magma_int_t | magma_zmsupernodal (magma_int_t *max_bs, magma_z_matrix A, magma_z_matrix *S, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with block-size bs. | |
magma_int_t | magma_zmvarsizeblockstruct (magma_int_t n, magma_int_t *bs, magma_int_t bsl, magma_uplo_t uplotype, magma_z_matrix *A, magma_queue_t queue) |
Generates a block-diagonal sparsity pattern with variable block-size. | |
magma_int_t | magma_ztfqmr_unrolled (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a complex matrix. | |
magma_int_t | magma_zbicgstab_merge2 (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_zbicgstab_merge3 (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a general matrix. | |
magma_int_t | magma_zjacobidomainoverlap (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix. | |
magma_int_t | magma_zbaiter (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_z_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_zbaiter_overlap (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_z_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU. | |
magma_int_t | magma_zftjacobicontractions (magma_z_matrix xkm2, magma_z_matrix xkm1, magma_z_matrix xk, magma_z_matrix *z, magma_z_matrix *c, magma_queue_t queue) |
Computes the contraction coefficients c_i: | |
magma_int_t | magma_zftjacobiupdatecheck (double delta, magma_z_matrix *xold, magma_z_matrix *xnew, magma_z_matrix *zprev, magma_z_matrix c, magma_int_t *flag_t, magma_int_t *flag_fp, magma_queue_t queue) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper. | |
magma_int_t | magma_ziterref (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_z_preconditioner *precond_par, magma_queue_t queue) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix. | |
magma_int_t | magma_zjacobiiter_sys (magma_z_matrix A, magma_z_matrix b, magma_z_matrix d, magma_z_matrix t, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U). | |
magma_int_t | magma_zftjacobi (magma_z_matrix A, magma_z_matrix b, magma_z_matrix *x, magma_z_solver_par *solver_par, magma_queue_t queue) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U). | |
magma_int_t | magma_zilut_saad (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_zilut_saad_apply (magma_z_matrix b, magma_z_matrix *x, magma_z_preconditioner *precond, magma_queue_t queue) |
magma_int_t | magma_zcustomilusetup (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete LU preconditioner. | |
magma_int_t | magma_zcustomicsetup (magma_z_matrix A, magma_z_matrix b, magma_z_preconditioner *precond, magma_queue_t queue) |
Reads in an Incomplete Cholesky preconditioner. | |
magma_int_t | magma_zbajac_csr (magma_int_t localiters, magma_z_matrix D, magma_z_matrix R, magma_z_matrix b, magma_z_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block. | |
magma_int_t | magma_zbajac_csr_overlap (magma_int_t localiters, magma_int_t matrices, magma_int_t overlap, magma_z_matrix *D, magma_z_matrix *R, magma_z_matrix b, magma_z_matrix *x, magma_queue_t queue) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block. | |
magma_int_t | magma_zmlumerge (magma_z_matrix L, magma_z_matrix U, magma_z_matrix *A, magma_queue_t queue) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts. | |
magma_int_t | magma_zgeaxpy (magmaDoubleComplex alpha, magma_z_matrix X, magmaDoubleComplex beta, magma_z_matrix *Y, magma_queue_t queue) |
This routine computes Y = alpha * X + beta * Y on the GPU. | |
magma_int_t | magma_zgecsrreimsplit (magma_z_matrix A, magma_z_matrix *ReA, magma_z_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in CSR format, located on the GPU, and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A. | |
magma_int_t | magma_zgedensereimsplit (magma_z_matrix A, magma_z_matrix *ReA, magma_z_matrix *ImA, magma_queue_t queue) |
This routine takes an input matrix A in DENSE format, located on the GPU, and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A. | |
magma_int_t | magma_zgecsr5mv (magma_trans_t transA, magma_int_t m, magma_int_t n, magma_int_t p, magmaDoubleComplex alpha, magma_int_t sigma, magma_int_t bit_y_offset, magma_int_t bit_scansum_offset, magma_int_t num_packet, magmaUIndex_ptr dtile_ptr, magmaUIndex_ptr dtile_desc, magmaIndex_ptr dtile_desc_offset_ptr, magmaIndex_ptr dtile_desc_offset, magmaDoubleComplex_ptr dcalibrator, magma_int_t tail_tile_start, magmaDoubleComplex_ptr dval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
magma_int_t | magma_zcopyscale (magma_int_t n, magma_int_t k, magmaDoubleComplex_ptr dr, magmaDoubleComplex_ptr dv, magmaDoubleComplex_ptr dskp, magma_queue_t queue) |
Computes the correction term of the pipelined GMRES according to P. | |
magma_int_t | magma_dznrm2scale (magma_int_t m, magmaDoubleComplex_ptr dr, magma_int_t lddr, magmaDoubleComplex *drnorm, magma_queue_t queue) |
magma_int_t | magma_zjacobispmvupdate_bw (magma_int_t maxiter, magma_z_matrix A, magma_z_matrix t, magma_z_matrix b, magma_z_matrix d, magma_z_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_zjacobispmvupdateselect (magma_int_t maxiter, magma_int_t num_updates, magma_index_t *indices, magma_z_matrix A, magma_z_matrix t, magma_z_matrix b, magma_z_matrix d, magma_z_matrix tmp, magma_z_matrix *x, magma_queue_t queue) |
Updates the iteration vector x for the Jacobi iteration according to x=x+d. | |
magma_int_t | magma_zmergeblockkrylov (magma_int_t num_rows, magma_int_t num_cols, magmaDoubleComplex_ptr alpha, magmaDoubleComplex_ptr p, magmaDoubleComplex_ptr x, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_zbicgmerge1 (magma_int_t n, magmaDoubleComplex_ptr dskp, magmaDoubleComplex_ptr dv, magmaDoubleComplex_ptr dr, magmaDoubleComplex_ptr dp, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_zbicgmerge2 (magma_int_t n, magmaDoubleComplex_ptr dskp, magmaDoubleComplex_ptr dr, magmaDoubleComplex_ptr dv, magmaDoubleComplex_ptr ds, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_zbicgmerge3 (magma_int_t n, magmaDoubleComplex_ptr dskp, magmaDoubleComplex_ptr dp, magmaDoubleComplex_ptr ds, magmaDoubleComplex_ptr dt, magmaDoubleComplex_ptr dx, magmaDoubleComplex_ptr dr, magma_queue_t queue) |
Merges multiple operations into one kernel: | |
magma_int_t | magma_zbicgmerge4 (magma_int_t type, magmaDoubleComplex_ptr dskp, magma_queue_t queue) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU. | |
magma_int_t | magma_zbicgmerge_spmv1 (magma_z_matrix A, magmaDoubleComplex_ptr dd1, magmaDoubleComplex_ptr dd2, magmaDoubleComplex_ptr dp, magmaDoubleComplex_ptr dr, magmaDoubleComplex_ptr dv, magmaDoubleComplex_ptr dskp, magma_queue_t queue) |
Merges the first SpmV using CSR with the dot product and the computation of alpha. | |
magma_int_t | magma_zbicgmerge_spmv2 (magma_z_matrix A, magmaDoubleComplex_ptr dd1, magmaDoubleComplex_ptr dd2, magmaDoubleComplex_ptr ds, magmaDoubleComplex_ptr dt, magmaDoubleComplex_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_zbicgmerge_xrbeta (magma_int_t n, magmaDoubleComplex_ptr dd1, magmaDoubleComplex_ptr dd2, magmaDoubleComplex_ptr drr, magmaDoubleComplex_ptr dr, magmaDoubleComplex_ptr dp, magmaDoubleComplex_ptr ds, magmaDoubleComplex_ptr dt, magmaDoubleComplex_ptr dx, magmaDoubleComplex_ptr dskp, magma_queue_t queue) |
Merges the second SpmV using CSR with the dot product and the computation of omega. | |
magma_int_t | magma_zbcsrswp (magma_int_t n, magma_int_t size_b, magma_int_t *ipiv, magmaDoubleComplex_ptr dx, magma_queue_t queue) |
magma_int_t | magma_zbcsrtrsv (magma_uplo_t uplo, magma_int_t r_blocks, magma_int_t c_blocks, magma_int_t size_b, magmaDoubleComplex_ptr dA, magma_index_t *blockinfo, magmaDoubleComplex_ptr dx, magma_queue_t queue) |
For a Block-CSR ILU factorization, this routine performs the triangular solves. | |
magma_int_t | magma_zbcsrvalcpy (magma_int_t size_b, magma_int_t num_blocks, magma_int_t num_zero_blocks, magmaDoubleComplex_ptr *dAval, magmaDoubleComplex_ptr *dBval, magmaDoubleComplex_ptr *dBval2, magma_queue_t queue) |
For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros. | |
magma_int_t | magma_zbcsrluegemm (magma_int_t size_b, magma_int_t num_block_rows, magma_int_t kblocks, magmaDoubleComplex_ptr *dA, magmaDoubleComplex_ptr *dB, magmaDoubleComplex_ptr *dC, magma_queue_t queue) |
For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix. | |
magma_int_t | magma_zbcsrlupivloc (magma_int_t size_b, magma_int_t kblocks, magmaDoubleComplex_ptr *dA, magma_int_t *ipiv, magma_queue_t queue) |
magma_int_t | magma_zbcsrblockinfo5 (magma_int_t lustep, magma_int_t num_blocks, magma_int_t c_blocks, magma_int_t size_b, magma_index_t *blockinfo, magmaDoubleComplex_ptr dval, magmaDoubleComplex_ptr *AII, magma_queue_t queue) |
For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros. | |
magma_int_t | magma_zthrsholdselect (magma_int_t sampling, magma_int_t total_size, magma_int_t subset_size, magmaDoubleComplex *val, double *thrs, magma_queue_t queue) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest. | |
magma_int_t | magma_vector_zlag2c (magma_z_matrix x, magma_c_matrix *y, magma_queue_t queue) |
Converts magma_z_matrix from Z to C | |
magma_int_t | magma_sparse_matrix_zlag2c (magma_z_matrix A, magma_c_matrix *B, magma_queue_t queue) |
Converts magma_z_matrix from Z to C | |
magma_int_t | magma_vector_clag2z (magma_c_matrix x, magma_z_matrix *y, magma_queue_t queue) |
Converts magma_c_vector from C to Z | |
magma_int_t | magma_sparse_matrix_clag2z (magma_c_matrix A, magma_z_matrix *B, magma_queue_t queue) |
Converts magma_c_sparse_matrix from C to Z | |
void | magmablas_zlag2c_sparse (magma_int_t M, magma_int_t N, magmaDoubleComplex_const_ptr dA, magma_int_t lda, magmaFloatComplex_ptr dSA, magma_int_t ldsa, magma_queue_t queue, magma_int_t *info) |
void | magmablas_clag2z_sparse (magma_int_t M, magma_int_t N, magmaFloatComplex_const_ptr dSA, magma_int_t ldsa, magmaDoubleComplex_ptr dA, magma_int_t lda, magma_queue_t queue, magma_int_t *info) |
void | magma_zlag2c_CSR_DENSE (magma_z_matrix A, magma_c_matrix *B, magma_queue_t queue) |
void | magma_zlag2c_CSR_DENSE_alloc (magma_z_matrix A, magma_c_matrix *B, magma_queue_t queue) |
void | magma_zlag2c_CSR_DENSE_convert (magma_z_matrix A, magma_c_matrix *B, magma_queue_t queue) |
magma_int_t | magma_zcgecsrmv_mixed_prec (magma_trans_t transA, magma_int_t m, magma_int_t n, magmaDoubleComplex alpha, magmaDoubleComplex_ptr ddiagval, magmaFloatComplex_ptr doffdiagval, magmaIndex_ptr drowptr, magmaIndex_ptr dcolind, magmaDoubleComplex_ptr dx, magmaDoubleComplex beta, magmaDoubleComplex_ptr dy, magma_queue_t queue) |
This routine computes y = alpha * A * x + beta * y on the GPU. | |
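The last entry above computes y = alpha * A * x + beta * y with the matrix values split across two precisions. The following is a minimal host-side sketch of that idea in plain C, not MAGMA's GPU kernel. It uses real instead of complex values, and it assumes (this layout is a guess, not taken from this reference) that `diagval` holds one double-precision diagonal entry per row while `offdiagval`, `rowptr`, and `colind` describe the off-diagonal entries in CSR with single-precision values. The name `spmv_mixed_prec` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper, not a MAGMA routine.
 * Computes y = alpha * A * x + beta * y for an m-row matrix A stored as
 *   diagval[i]                       diagonal entry of row i (double)
 *   rowptr / colind / offdiagval     off-diagonal entries in CSR (float)
 * The storage layout is an assumption made for this illustration. */
static void spmv_mixed_prec(int m, double alpha,
                            const double *diagval,
                            const float *offdiagval,
                            const int *rowptr, const int *colind,
                            const double *x, double beta, double *y)
{
    for (int i = 0; i < m; ++i) {
        /* Diagonal contribution in full double precision. */
        double sum = diagval[i] * x[i];
        /* Off-diagonal values are promoted from float before use. */
        for (int p = rowptr[i]; p < rowptr[i + 1]; ++p) {
            sum += (double)offdiagval[p] * x[colind[p]];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Keeping the diagonal in double precision while storing the remaining entries in single precision reduces memory traffic; whether that trade-off is acceptable depends on the matrix.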
magma_int_t magma_corderstatistics | ( | magmaFloatComplex * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
magmaFloatComplex * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array.
[in,out] | val | magmaFloatComplex* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | magmaFloatComplex* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
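The meaning of `k` and `r` can be illustrated with a small host-side sketch. It is not the MAGMA implementation: it sorts the whole array instead of running a selection algorithm, it assumes that complex values are ordered by magnitude, and it treats `k` as 0-based. The type `cplx` and the function `order_statistic` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for magmaFloatComplex (hypothetical, for illustration only). */
typedef struct { float re, im; } cplx;

/* Squared magnitude; enough for ordering, avoids a square root. */
static float mag2(cplx z) { return z.re * z.re + z.im * z.im; }

/* qsort comparator: ascending by squared magnitude. */
static int cmp_mag(const void *a, const void *b)
{
    float ma = mag2(*(const cplx *)a);
    float mb = mag2(*(const cplx *)b);
    return (ma > mb) - (ma < mb);
}

/* Hypothetical helper: returns 0 on success, -1 on bad arguments.
 * r == 0 selects the k-th smallest, r == 1 the k-th largest element.
 * k is 0-based here (an assumption). The array is sorted in place,
 * so val is modified, as the parameter table above also states. */
static int order_statistic(cplx *val, int length, int k, int r, cplx *element)
{
    if (val == NULL || element == NULL || k < 0 || k >= length) return -1;
    qsort(val, (size_t)length, sizeof *val, cmp_mag);
    *element = (r == 1) ? val[length - 1 - k] : val[k];
    return 0;
}
```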
magma_int_t magma_corderstatistics_inc | ( | magmaFloatComplex * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | inc, | ||
magma_int_t | r, | ||
magmaFloatComplex * | element, | ||
magma_queue_t | queue ) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc.
[in,out] | val | magmaFloatComplex* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | inc | magma_int_t Stepsize in the approximation. |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | magmaFloatComplex* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmorderstatistics | ( | magmaFloatComplex * | val, |
magma_index_t * | col, | ||
magma_index_t * | row, | ||
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
magmaFloatComplex * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front.
The related arrays col and row are also reordered.
[in,out] | val | magmaFloatComplex* Target array, will be modified during operation. |
[in,out] | col | magma_index_t* Target array, will be modified during operation. |
[in,out] | row | magma_index_t* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | magmaFloatComplex* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cpartition | ( | magmaFloatComplex * | a, |
magma_int_t | size, | ||
magma_int_t | pivot, | ||
magma_queue_t | queue ) |
magma_int_t magma_cmedian5 | ( | magmaFloatComplex * | a, |
magma_queue_t | queue ) |
magma_int_t magma_cselect | ( | magmaFloatComplex * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com, January 22, 2004, http://moonflare.com/code/select/select.pdf
[in,out] | a | magmaFloatComplex* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
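For orientation, the median-of-medians scheme named above can be sketched as follows. This is an independent illustration on real `double` values, not the MAGMA code; the names `sort_small` and `select_kth` are hypothetical, and `k` is assumed to be 0-based.

```c
#include <assert.h>

static void swap_d(double *a, double *b) { double t = *a; *a = *b; *b = t; }

/* Insertion sort for very short ranges (n <= 5 in this sketch). */
static void sort_small(double *a, int n)
{
    for (int i = 1; i < n; ++i)
        for (int j = i; j > 0 && a[j - 1] > a[j]; --j)
            swap_d(&a[j - 1], &a[j]);
}

/* Returns the k-th smallest (0-based) element of a[0..n-1].
 * Requires 0 <= k < n. The array is reordered in the process. */
static double select_kth(double *a, int n, int k)
{
    while (n > 5) {
        /* 1. Sort each group of five and move its median to the front. */
        int groups = 0;
        for (int i = 0; i < n; i += 5) {
            int len = (n - i < 5) ? n - i : 5;
            sort_small(a + i, len);
            swap_d(&a[groups++], &a[i + len / 2]);
        }
        /* 2. The pivot is the median of the group medians. */
        double pivot = select_kth(a, groups, groups / 2);
        /* 3. Three-way partition: [ < pivot | == pivot | > pivot ]. */
        int lt = 0, i = 0, gt = n;
        while (i < gt) {
            if (a[i] < pivot)      swap_d(&a[lt++], &a[i++]);
            else if (a[i] > pivot) swap_d(&a[i], &a[--gt]);
            else                   ++i;
        }
        /* 4. Continue only in the part that contains rank k. */
        if (k < lt)      { n = lt; }
        else if (k < gt) { return pivot; }
        else             { a += gt; n -= gt; k -= gt; }
    }
    sort_small(a, n);
    return a[k];
}
```

Choosing the pivot as the median of the group medians guarantees that a constant fraction of the elements is discarded in every round, which is what bounds the worst case.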
magma_int_t magma_cselectrandom | ( | magmaFloatComplex * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com, January 22, 2004, http://moonflare.com/code/select/select.pdf
[in,out] | a | magmaFloatComplex* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cdomainoverlap | ( | magma_index_t | num_rows, |
magma_int_t * | num_indices, | ||
magma_index_t * | rowptr, | ||
magma_index_t * | colidx, | ||
magma_index_t * | x, | ||
magma_queue_t | queue ) |
Generates the update list.
[in] | x | magma_index_t* array to sort |
[in] | num_rows | magma_int_t number of rows in matrix |
[out] | num_indices | magma_int_t* number of indices in array |
[in] | rowptr | magma_index_t* rowpointer of matrix |
[in] | colidx | magma_index_t* colindices of matrix |
[in] | x | magma_index_t* array containing indices for domain overlap |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cvspread | ( | magma_c_matrix * | x, |
const char * | filename, | ||
magma_queue_t | queue ) |
Reads in a sparse vector-block stored in COO format.
[out] | x | magma_c_matrix * vector to read in |
[in] | filename | char* file where vector is stored |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cdiameter | ( | magma_c_matrix * | A, |
magma_queue_t | queue ) |
Computes the diameter of a sparse matrix and stores the value in diameter.
[in,out] | A | magma_c_matrix* sparse matrix |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilusetup | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the ILU preconditioner via the iterative ILU iteration.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilu_gpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of the ParILU.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilu_cpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of the ParILU.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparic_gpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of the ParIC.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparic_cpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of the ParIC.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparicsetup | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the IC preconditioner via the iterative IC iteration.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparicupdate | ( | magma_c_matrix | A, |
magma_c_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_c_matrix input matrix A, current target system |
[in] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_capplyiteric_l | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the left triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_c_matrix RHS |
[out] | x | magma_c_matrix* vector to precondition |
[in] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
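A triangular solve "via Jacobi" replaces the sequential forward substitution by sweeps in which every row is updated from the previous iterate only. The sketch below shows this for a small dense lower triangular system L * x = b with real values. It is an illustration under these simplifying assumptions, not the MAGMA routine; `jacobi_lower_solve` and `MAXN` are hypothetical names.

```c
#include <assert.h>

#define MAXN 16  /* capacity of the scratch vector in this sketch */

/* Hypothetical helper: approximates the solution of L * x = b by `sweeps`
 * Jacobi iterations starting from x = 0. L is a dense n-by-n lower
 * triangular matrix in row-major order with nonzero diagonal, n <= MAXN. */
static void jacobi_lower_solve(int n, const double *L, const double *b,
                               double *x, int sweeps)
{
    double xold[MAXN];
    for (int i = 0; i < n; ++i) x[i] = 0.0;
    for (int s = 0; s < sweeps; ++s) {
        for (int i = 0; i < n; ++i) xold[i] = x[i];
        for (int i = 0; i < n; ++i) {
            double sum = b[i];
            /* Only values from the previous sweep are used (Jacobi),
             * so all rows could be updated independently. */
            for (int j = 0; j < i; ++j) sum -= L[i * n + j] * xold[j];
            x[i] = sum / L[i * n + i];
        }
    }
}
```

Because the iteration matrix of a triangular system is strictly triangular and therefore nilpotent, n sweeps reproduce the result of forward substitution in exact arithmetic; fewer sweeps give an approximation, which is often sufficient for preconditioning.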
magma_int_t magma_capplyiteric_r | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the right triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_c_matrix RHS |
[out] | x | magma_c_matrix* vector to precondition |
[in] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilu_csr | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete LU factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the system matrix is COO, the lower triangular factor L is stored in CSR, the upper triangular factor U is transposed, then also stored in CSR (equivalent to CSC format for the non-transposed U). Every component of L and U is handled by one thread.
[in] | A | magma_c_matrix input matrix A determining initial guess & processing order |
[in,out] | L | magma_c_matrix input/output matrix L containing the lower triangular factor |
[in,out] | U | magma_c_matrix input/output matrix U containing the upper triangular factor |
[in] | queue | magma_queue_t Queue to execute in. |
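The fixed-point update from the Chow and Patel paper cited above can be written down compactly for a small dense example. The sketch below is sequential and uses real values and dense row-major storage with an explicit pattern mask, whereas the routine documented here works on sparse formats with one thread per component; `parilu_sweep` is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: one in-place fixed-point sweep over the sparsity
 * pattern. A, L, U are dense n-by-n row-major arrays; pattern[i*n+j] != 0
 * marks positions that are allowed to be nonzero. L has a unit diagonal
 * that is never touched. For each allowed position (i,j):
 *   i >  j:  l_ij = (a_ij - sum_{k<j} l_ik * u_kj) / u_jj
 *   i <= j:  u_ij =  a_ij - sum_{k<i} l_ik * u_kj               */
static void parilu_sweep(int n, const double *A, const int *pattern,
                         double *L, double *U)
{
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (!pattern[i * n + j]) continue;
            int kmax = (i < j) ? i : j;   /* k runs over 0 .. min(i,j)-1 */
            double s = A[i * n + j];
            for (int k = 0; k < kmax; ++k) s -= L[i * n + k] * U[k * n + j];
            if (i > j) L[i * n + j] = s / U[j * n + j];  /* lower part */
            else       U[i * n + j] = s;                 /* upper part */
        }
    }
}
```

A typical use starts from an initial guess (for example the lower and upper parts of A) and calls the sweep a few times. In a parallel, asynchronous setting the updates may read a mix of old and new values, so more sweeps can be needed than in this sequential version.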
magma_int_t magma_cpariluupdate | ( | magma_c_matrix | A, |
magma_c_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_c_matrix input matrix A, current target system |
[in] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparic_csr | ( | magma_c_matrix | A, |
magma_c_matrix | A_CSR, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete Cholesky (IC) factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.
[in] | A | magma_c_matrix input matrix A - initial guess (lower triangular) |
[in,out] | A_CSR | magma_c_matrix input/output matrix containing the IC approximation |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cnonlinres | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | LU, | ||
real_Double_t * | res, | ||
magma_queue_t | queue ) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_c_matrix input sparse matrix in CSR |
[in] | L | magma_c_matrix input sparse matrix in CSR |
[in] | U | magma_c_matrix input sparse matrix in CSR |
[out] | LU | magma_c_matrix* output sparse matrix A-LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cilures | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_c_matrix input sparse matrix in CSR |
[in] | L | magma_c_matrix input sparse matrix in CSR |
[in] | U | magma_c_matrix input sparse matrix in CSR |
[out] | LU | magma_c_matrix* output sparse matrix A - LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[out] | nonlinres | real_Double_t* Frobenius norm of difference |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cicres | ( | magma_c_matrix | A, |
magma_c_matrix | C, | ||
magma_c_matrix | CT, | ||
magma_c_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_c_matrix input sparse matrix in CSR |
[in] | C | magma_c_matrix input sparse matrix in CSR |
[in] | CT | magma_c_matrix input sparse matrix in CSR |
[out] | LU | magma_c_matrix* output sparse matrix A - CC^T in CSR |
[out] | res | real_Double_t* IC residual |
[out] | nonlinres | real_Double_t* nonlinear residual |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cinitguess | ( | magma_c_matrix | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Computes an initial guess for the ParILU/ParIC.
[in] | A | magma_c_matrix sparse matrix in CSR |
[out] | L | magma_c_matrix* sparse matrix in CSR |
[out] | U | magma_c_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cinitrecursiveLU | ( | magma_c_matrix | A, |
magma_c_matrix * | B, | ||
magma_queue_t | queue ) |
Using the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A containing the approximate factors (L and U), computes a matrix with one higher level of fill-in, inserts the original approximation as the initial guess, and provides the factors L and U, also filled with the scaled initial guess.
[in] | A | magma_c_matrix sparse matrix in CSR |
[out] | B | magma_c_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmLdiagadd | ( | magma_c_matrix * | L, |
magma_queue_t | queue ) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal.
It does this in-place.
[in,out] | L | magma_c_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmatrix_cup | ( | magma_c_matrix | A, |
magma_c_matrix | B, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
[in] | A | magma_c_matrix Input matrix 1. |
[in] | B | magma_c_matrix Input matrix 2. |
[out] | U | magma_c_matrix* Not a real matrix, but the list of all matrix entries included in either A or B. No duplicates. |
[in] | queue | magma_queue_t Queue to execute in. |
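The merge rule above (entries of either matrix, no duplicates, A wins on overlap) can be sketched on plain CSR arrays. This is only an illustration under assumptions: csr_union is a hypothetical name, real float values stand in for magmaFloatComplex, column indices are assumed sorted within each row, and the caller provides output arrays with room for nnz(A) + nnz(B) entries.

```c
#include <assert.h>

/* Writes the row-wise union of two CSR matrices into u_ptr/u_col/u_val and
   returns the number of entries. Where both inputs hold an entry, the value
   from A is kept. Hypothetical helper, not MAGMA's implementation. */
int csr_union(int n,
              const int *a_ptr, const int *a_col, const float *a_val,
              const int *b_ptr, const int *b_col, const float *b_val,
              int *u_ptr, int *u_col, float *u_val)
{
    int nnz = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        int ia = a_ptr[i], ib = b_ptr[i];
        u_ptr[i] = nnz;
        while (ia < a_ptr[i + 1] || ib < b_ptr[i + 1]) {
            int take_a;
            if (ia == a_ptr[i + 1])      take_a = 0;  /* A row exhausted */
            else if (ib == b_ptr[i + 1]) take_a = 1;  /* B row exhausted */
            else                         take_a = a_col[ia] <= b_col[ib];
            if (take_a) {
                /* Same location in both: drop B's copy, keep A's value. */
                if (ib < b_ptr[i + 1] && b_col[ib] == a_col[ia]) ++ib;
                u_col[nnz] = a_col[ia];
                u_val[nnz] = a_val[ia];
                ++ia;
            } else {
                u_col[nnz] = b_col[ib];
                u_val[nnz] = b_val[ib];
                ++ib;
            }
            ++nnz;
        }
    }
    u_ptr[n] = nnz;
    return nnz;
}
```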
magma_int_t magma_cmatrix_cup_gpu | ( | magma_c_matrix | A, |
magma_c_matrix | B, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
This is the GPU version of the operation.
[in] | A | magma_c_matrix Input matrix 1. |
[in] | B | magma_c_matrix Input matrix 2. |
[out] | U | magma_c_matrix* \(U = A \cup B\). If both matrices have a nonzero value in the same location, the value of A is used. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmatrix_cap | ( | magma_c_matrix | A, |
magma_c_matrix | B, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\).
The values in U are all ones.
[in] | A | magma_c_matrix Input matrix 1. |
[in] | B | magma_c_matrix Input matrix 2. |
[out] | U | magma_c_matrix* Not a real matrix, but the list of all matrix entries included in both A and B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmatrix_negcap | ( | magma_c_matrix | A, |
magma_c_matrix | B, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of A but not of B.
\(U = A \setminus B\). The values of A are preserved.
[in] | A | magma_c_matrix Element part of this. |
[in,out] | B | magma_c_matrix Not part of this. |
[out] | U | magma_c_matrix* Not a real matrix, but the list of all matrix entries included in A not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
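A minimal sketch of the set difference described above, written for plain CSR inputs and a COO-style output list. The name csr_setminus is hypothetical, real float values stand in for magmaFloatComplex, and column indices are assumed sorted within each row; the output arrays need room for nnz(A) entries.

```c
#include <assert.h>

/* Lists every entry of A whose location is not present in B.
   Output is a COO-style list (row, col, value); values come from A.
   Hypothetical helper, not MAGMA's implementation. */
int csr_setminus(int n,
                 const int *a_ptr, const int *a_col, const float *a_val,
                 const int *b_ptr, const int *b_col,
                 int *u_row, int *u_col, float *u_val)
{
    int nnz = 0;
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        int ib = b_ptr[i];
        for (int ia = a_ptr[i]; ia < a_ptr[i + 1]; ++ia) {
            /* Advance in B's row until its column is not left of A's column. */
            while (ib < b_ptr[i + 1] && b_col[ib] < a_col[ia]) ++ib;
            if (ib < b_ptr[i + 1] && b_col[ib] == a_col[ia])
                continue;               /* location also in B: skip it */
            u_row[nnz] = i;
            u_col[nnz] = a_col[ia];
            u_val[nnz] = a_val[ia];
            ++nnz;
        }
    }
    return nnz;
}
```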
magma_int_t magma_cmatrix_tril_negcap | ( | magma_c_matrix | A, |
magma_c_matrix | B, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of tril(A) but not of B.
\(U = \mathrm{tril}(A) \setminus B\). The values of A are preserved.
[in] | A | magma_c_matrix Element part of this. |
[in,out] | B | magma_c_matrix Not part of this. |
[out] | U | magma_c_matrix* Not a real matrix, but the list of all matrix entries included in A not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmatrix_triu_negcap | ( | magma_c_matrix | A, |
magma_c_matrix | B, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being part of triu(A) but not of B.
\(U = \mathrm{triu}(A) \setminus B\). The values of A are preserved.
[in] | A | magma_c_matrix Element part of this. |
[in] | B | magma_c_matrix Not part of this. |
[out] | U | magma_c_matrix* |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmatrix_abssum | ( | magma_c_matrix | A, |
float * | sum, | ||
magma_queue_t | queue ) |
Computes the sum of the absolute values in a matrix.
[in] | A | magma_c_matrix Element list/matrix. |
[out] | sum | float* Sum of the absolute values. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_thrsrm | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Removes every element whose absolute value is smaller than or equal to thrs (order == 1) or larger than or equal to thrs (order == 0) from the matrix and compacts the remaining entries.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded |
[in,out] | A | magma_c_matrix* Matrix where elements are removed. |
[in] | thrs | float* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
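The remove-and-compact step can be sketched as a single in-place pass over CSR arrays. This is an illustration under assumptions, not the library code: csr_threshold_remove is a hypothetical name, real float values stand in for magmaFloatComplex, and the comparison includes equality as the description above states.

```c
#include <assert.h>

/* Drops entries by magnitude and compacts the CSR arrays in place.
   order == 1: drop |a| <= thrs; order == 0: drop |a| >= thrs.
   Returns the new number of stored entries. Hypothetical helper. */
int csr_threshold_remove(int order, int n, int *rowptr, int *col,
                         float *val, float thrs)
{
    int out = 0;
    int start = rowptr[0];
    assert(order == 0 || order == 1);
    for (int i = 0; i < n; ++i) {
        int end = rowptr[i + 1];   /* read before this slot is overwritten */
        rowptr[i] = out;
        for (int k = start; k < end; ++k) {
            float mag = val[k] < 0.0f ? -val[k] : val[k];
            int drop = (order == 1) ? (mag <= thrs) : (mag >= thrs);
            if (!drop) {
                col[out] = col[k];
                val[out] = val[k];
                ++out;
            }
        }
        start = end;
    }
    rowptr[n] = out;
    return out;
}
```

Because the write index never overtakes the read index, no temporary storage is needed.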
magma_int_t magma_cparilut_thrsrm_semilinked | ( | magma_c_matrix * | U, |
magma_c_matrix * | US, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Removes any element with absolute value smaller than thrs from the matrix.
It only uses the linked list and skips the "removed" elements.
[in,out] | A | magma_c_matrix* Matrix where elements are removed. |
[in] | thrs | float* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_rmselected | ( | magma_c_matrix | R, |
magma_c_matrix * | A, | ||
magma_queue_t | queue ) |
Removes a selected list of elements from the matrix.
[in] | R | magma_c_matrix Matrix containing elements to be removed. |
[in,out] | A | magma_c_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_selectoneperrow | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order == 1: largest; order == 0: smallest |
[in] | A | magma_c_matrix* Matrix where elements are removed. |
[out] | oneA | magma_c_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
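A minimal sketch of the per-row selection, assuming plain CSR input and one output slot per row. The name select_one_per_row is hypothetical, real float values stand in for magmaFloatComplex, and the choice of the diagonal position as the placeholder column for rows without candidates is an assumption made here, since the text only states that a zero element is stored.

```c
#include <assert.h>

static float mag(float x) { return x < 0.0f ? -x : x; }

/* For every row, keeps the entry of largest (order == 1) or smallest
   (order == 0) magnitude. Rows without candidates receive a zero value;
   the column i used for them is an illustrative assumption. */
void select_one_per_row(int order, int n, const int *rowptr, const int *col,
                        const float *val, int *one_col, float *one_val)
{
    assert(order == 0 || order == 1);
    for (int i = 0; i < n; ++i) {
        int best = -1;
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k) {
            if (best < 0) { best = k; continue; }
            int better = (order == 1) ? (mag(val[k]) > mag(val[best]))
                                      : (mag(val[k]) < mag(val[best]));
            if (better) best = k;
        }
        if (best < 0) {
            one_col[i] = i;
            one_val[i] = 0.0f;
        } else {
            one_col[i] = col[best];
            one_val[i] = val[best];
        }
    }
}
```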
magma_int_t magma_cparilut_selecttwoperrow | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order == 1: largest; order == 0: smallest |
[in] | A | magma_c_matrix* Matrix where elements are removed. |
[out] | oneA | magma_c_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_selectoneperrowthrs_lower | ( | magma_c_matrix | L, |
magma_c_matrix | U, | ||
magma_c_matrix * | A, | ||
float | rtol, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_c_matrix Current lower triangular factor. |
[in] | U | magma_c_matrix Current upper triangular factor. |
[in] | A | magma_c_matrix* All residuals in L. |
[in] | rtol | float threshold rtol |
[out] | oneA | magma_c_matrix* at most one element per row, if larger than thrs. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_selectoneperrowthrs_upper | ( | magma_c_matrix | L, |
magma_c_matrix | U, | ||
magma_c_matrix * | A, | ||
float | rtol, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_c_matrix Current lower triangular factor. |
[in] | U | magma_c_matrix Current upper triangular factor. |
[in] | A | magma_c_matrix* All residuals in L. |
[in] | rtol | float threshold rtol |
[out] | oneA | magma_c_matrix* at most one element per row, if larger than thrs. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_selectonepercol | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_transpose_select_one | ( | magma_c_matrix | A, |
magma_c_matrix * | B, | ||
magma_queue_t | queue ) |
This is a special routine with very limited scope.
For a set of fill-in candidates in row-major format, it transposes a submatrix, i.e. the submatrix consisting of the largest element in every column. This function is only useful for delta<=1.
[in] | A | magma_c_matrix Matrix to transpose. |
[out] | B | magma_c_matrix* Transposed matrix containing only largest elements in each col. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_insert_LU | ( | magma_int_t | num_rm, |
magma_index_t * | rm_loc, | ||
magma_index_t * | rm_loc2, | ||
magma_c_matrix * | LU_new, | ||
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_set_thrs | ( | magma_int_t | num_rm, |
magma_c_matrix * | LU, | ||
magma_int_t | order, | ||
magmaFloatComplex * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_set_approx_thrs | ( | magma_int_t | num_rm, |
magma_c_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_c_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_set_thrs_randomselect | ( | magma_int_t | num_rm, |
magma_c_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_c_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_set_thrs_randomselect_approx | ( | magma_int_t | num_rm, |
magma_c_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_c_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_set_thrs_randomselect_factors | ( | magma_int_t | num_rm, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_c_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_set_exact_thrs | ( | magma_int_t | num_rm, |
magma_c_matrix * | LU, | ||
magma_int_t | order, | ||
magmaFloatComplex * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_c_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | magmaFloatComplex* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
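One way to obtain an exact threshold is a selection over the entry magnitudes. The sketch below uses a plain quickselect for this; it is not claimed to be the method MAGMA uses internally. The names kth_smallest and exact_threshold are hypothetical, the input is an array of precomputed magnitudes, and the mapping of order == 1 to the num_rm-th largest element is an assumption based on the parameter description.

```c
#include <assert.h>

static void swapf(float *a, float *b) { float t = *a; *a = *b; *b = t; }

/* Returns the k-th smallest value (k is 1-based) of a[0..n-1].
   The array is permuted in the process. */
float kth_smallest(float *a, int n, int k)
{
    int lo = 0, hi = n - 1, target = k - 1;
    assert(n > 0 && k >= 1 && k <= n);
    while (lo < hi) {
        float pivot = a[hi];
        int store = lo;
        /* Lomuto partition: values below the pivot move to the front. */
        for (int i = lo; i < hi; ++i)
            if (a[i] < pivot) swapf(&a[i], &a[store++]);
        swapf(&a[store], &a[hi]);
        if (store == target) return a[store];
        if (store < target) lo = store + 1;
        else                hi = store - 1;
    }
    return a[lo];
}

/* order == 0: num_rm-th smallest magnitude; order == 1: num_rm-th largest. */
float exact_threshold(float *magnitudes, int n, int num_rm, int order)
{
    int k = (order == 0) ? num_rm : n - num_rm + 1;
    return kth_smallest(magnitudes, n, k);
}
```

Removing all entries at or below the returned value (for order == 0) then discards num_rm elements, provided the magnitudes are distinct.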
magma_int_t magma_cparilut_set_approx_thrs_inc | ( | magma_int_t | num_rm, |
magma_c_matrix * | LU, | ||
magma_int_t | order, | ||
magmaFloatComplex * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_c_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | magmaFloatComplex* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_LU_approx_thrs | ( | magma_int_t | num_rm, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_int_t | order, | ||
magmaFloatComplex * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_reorder | ( | magma_c_matrix * | LU, |
magma_queue_t | queue ) |
This routine reorders the matrix (in place) for easier access.
[in,out] | LU | magma_c_matrix* Current ILU approximation. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparict_sweep | ( | magma_c_matrix * | A, |
magma_c_matrix * | LU, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_zero | ( | magma_c_matrix * | A, |
magma_queue_t | queue ) |
magma_int_t magma_cparilu_sweep | ( | magma_c_matrix | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep.
Input and output arrays are identical.
[in] | A | magma_c_matrix System matrix in COO. |
[in] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
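The fixed-point update behind a ParILU sweep, following the Chow and Patel reference cited earlier in this documentation, sets l_ij = (a_ij - sum_{k<j} l_ik u_kj) / u_jj for i > j and u_ij = a_ij - sum_{k<i} l_ik u_kj otherwise, for every location in the sparsity pattern. The sketch below applies this update sequentially and in place on small dense arrays. It is an illustration only: parilu_sweep_dense is a hypothetical name, real float values stand in for magmaFloatComplex, the pattern is taken to be the nonzeros of A, and L is assumed to carry a stored unit diagonal.

```c
#include <assert.h>

/* One in-place fixed-point sweep over the nonzero pattern of A.
   a, l, u are dense n-by-n row-major arrays; l has a unit diagonal. */
void parilu_sweep_dense(int n, const float *a, float *l, float *u)
{
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (a[i * n + j] == 0.0f) continue;    /* outside the pattern */
            int m = (i < j) ? i : j;               /* sum runs over k < min(i, j) */
            float s = a[i * n + j];
            for (int k = 0; k < m; ++k)
                s -= l[i * n + k] * u[k * n + j];
            if (i > j) l[i * n + j] = s / u[j * n + j];  /* strictly lower part */
            else       u[i * n + j] = s;                 /* upper part incl. diagonal */
        }
    }
}
```

In the actual routine the updates run concurrently and may read values that other threads are still changing, which is what makes the sweep asynchronous; the sequential loop here only shows the arithmetic of a single update.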
magma_int_t magma_cparilu_sweep_sync | ( | magma_c_matrix | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_c_matrix System matrix in COO. |
[in] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparic_sweep | ( | magma_c_matrix | A, |
magma_c_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep (symmetric case).
Input and output arrays are identical.
[in] | A | magma_c_matrix System matrix in COO. |
[in] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparic_sweep_sync | ( | magma_c_matrix | A, |
magma_c_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep (symmetric case).
Input and output are different arrays.
[in] | A | magma_c_matrix System matrix in COO. |
[in] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparict_sweep_sync | ( | magma_c_matrix * | A, |
magma_c_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_c_matrix* System matrix. |
[in] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[out] | L_new | magma_c_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR. |
[out] | U_new | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_sweep_sync | ( | magma_c_matrix * | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference to the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can be different. The pattern determining which elements are iterated is hence the pattern of L and U, not that of A.
This is the CPU version of the synchronous ParILUT sweep.
[in] | A | magma_c_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e., sorted CSR with the row indices stored as well. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e., sorted CSR with the row indices stored as well. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_sweep_gpu | ( | magma_c_matrix * | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference to the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can be different. The pattern determining which elements are iterated is hence the pattern of L and U, not that of A. L has a unit diagonal.
This is the GPU version of the asynchronous ParILUT sweep.
[in] | A | magma_c_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e., sorted CSR with the row indices stored as well. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e., sorted CSR with the row indices stored as well. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_residuals_gpu | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_c_matrix System matrix. The format is sorted CSR. |
[in] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e., sorted CSR with the row indices stored as well. |
[in] | U | magma_c_matrix Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e., sorted CSR with the row indices stored as well. |
[in,out] | R | magma_c_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cthrsholdrm_gpu | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | order | magma_int_t dummy variable for now. |
[in,out] | A | magma_c_matrix* input/output matrix where elements are removed |
[out] | thrs | float* computed threshold |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cget_row_ptr | ( | const magma_int_t | num_rows, |
magma_int_t * | nnz, | ||
const magma_index_t * | rowidx, | ||
magma_index_t * | rowptr, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_align_residuals | ( | magma_c_matrix | L, |
magma_c_matrix | U, | ||
magma_c_matrix * | Lnew, | ||
magma_c_matrix * | Unew, | ||
magma_queue_t | queue ) |
This function scales the residuals of a lower triangular factor L with the diagonal of U.
The intention is to generate a good initial guess for inserting the elements.
[in] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_c_matrix Current approximation for the upper triangular factor. The format is sorted CSC. |
[in] | hL | magma_c_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | hU | magma_c_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_preselect_scale | ( | magma_c_matrix * | L, |
magma_c_matrix * | oneL, | ||
magma_c_matrix * | U, | ||
magma_c_matrix * | oneU, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_c_matrix* Matrix where elements are removed. |
[in] | U | magma_c_matrix* Matrix where elements are removed. |
[out] | oneL | magma_c_matrix* Matrix where elements are removed. |
[out] | oneU | magma_c_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_thrsrm_U | ( | magma_int_t | order, |
magma_c_matrix | L, | ||
magma_c_matrix * | A, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Removes every element whose absolute value is smaller than or equal to thrs (order == 1) or larger than or equal to thrs (order == 0) from the matrix and compacts the remaining entries.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded |
[in,out] | A | magma_c_matrix* Matrix where elements are removed. |
[in] | thrs | float* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_residuals | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_c_matrix System matrix A. |
[in] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_c_matrix Current approximation for the upper triangular factor. The format is sorted CSR. |
[in,out] | R | magma_c_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
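The residual at a location (i, j) of R is a_ij minus the inner product of row i of L with column j of U. A minimal sketch with dense factors and a COO list for R follows; ilu_residuals_coo is a hypothetical name, real float values stand in for magmaFloatComplex, and dense row-major storage replaces the sparse formats listed above.

```c
#include <assert.h>

/* Fills r_val[e] with (A - L*U)(r_row[e], r_col[e]) for every entry e
   of the COO pattern. a, l, u are dense n-by-n row-major arrays. */
void ilu_residuals_coo(int n, const float *a, const float *l, const float *u,
                       int nnz, const int *r_row, const int *r_col,
                       float *r_val)
{
    assert(n > 0 && nnz >= 0);
    for (int e = 0; e < nnz; ++e) {
        int i = r_row[e], j = r_col[e];
        float s = a[i * n + j];
        for (int k = 0; k < n; ++k)
            s -= l[i * n + k] * u[k * n + j];
        r_val[e] = s;
    }
}
```

Only the listed locations are evaluated, so the cost depends on the size of the pattern of R and not on the size of the full product LU.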
magma_int_t magma_cparilut_residuals_transpose | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_residuals_semilinked | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | US, | ||
magma_c_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_sweep_semilinked | ( | magma_c_matrix * | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | US, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_c_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_sweep_list | ( | magma_c_matrix * | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_c_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_residuals_list | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_sweep_linkedlist | ( | magma_c_matrix * | A, |
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_c_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_c_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_residuals_linkedlist | ( | magma_c_matrix | A, |
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_c_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_colmajor | ( | magma_c_matrix | A, |
magma_c_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_c_matrix The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | AC | magma_c_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_colmajorup | ( | magma_c_matrix | A, |
magma_c_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_c_matrix The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | AC | magma_c_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. Already allocated. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparict | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete Cholesky preconditioner.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and it can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: "ParILUT - A new parallel threshold ILU factorization", submitted to SIAM SISC in 2016.
This function requires OpenMP, and is only available if OpenMP is activated.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparict_cpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern. It is the variant for SPD systems.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete LU preconditioner.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and it can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz constant)
precond.rtol : how many candidates are added to the sparsity pattern
    1.0 : one per row
    < 1.0 : a fraction of those
    > 1.0 : all candidates
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_cpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_gpu | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_gpu_nodp | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
This routine is the same as magma_cparilut_gpu(), except that it uses no dynamic parallelism.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_insert | ( | magma_int_t * | num_rmL, |
magma_int_t * | num_rmU, | ||
magma_index_t * | rm_locL, | ||
magma_index_t * | rm_locU, | ||
magma_c_matrix * | L_new, | ||
magma_c_matrix * | U_new, | ||
magma_c_matrix * | L, | ||
magma_c_matrix * | U, | ||
magma_c_matrix * | UR, | ||
magma_queue_t | queue ) |
For the iterative dynamic ILU, inserts a new element into the (empty) place left by a removed element.
[in] | num_rmL | magma_int_t* Number of elements that are replaced in L. |
[in] | num_rmU | magma_int_t* Number of elements that are replaced in U. |
[in] | rm_locL | magma_index_t* List containing the locations of the deleted elements in L. |
[in] | rm_locU | magma_index_t* List containing the locations of the deleted elements in U. |
[in] | L_new | magma_c_matrix* Elements that will be inserted in L, stored in COO format (unsorted). |
[in] | U_new | magma_c_matrix* Elements that will be inserted in U, stored in COO format (unsorted). |
[in,out] | L | magma_c_matrix* Matrix where new elements are inserted. The format is unsorted CSR; list is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_c_matrix* Matrix where new elements are inserted. Row-major. The format is unsorted CSR; list is used as a linked list pointing to the respective next entry. |
[in,out] | UR | magma_c_matrix* Same matrix as U, but column-major. The format is unsorted CSR; list is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
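A minimal sketch of the slot-reuse idea follows. It assumes plain arrays in place of magma_c_matrix, real float values in place of magmaFloatComplex, one row-head index per row, and -1 as the end-of-list marker. All names are hypothetical, and the real routine additionally maintains the column-major view UR.

```c
#include <assert.h>

/* Place num_new COO elements into the free storage slots listed in rm_loc
 * and link each one at the head of its row list.
 * row_head[i] is the first slot of row i (or -1), next[slot] the following
 * slot in the same row (or -1). These conventions are assumptions. */
static void insert_into_free_slots(int num_new, const int *rm_loc,
                                   const int *new_row, const int *new_col,
                                   const float *new_val,
                                   int *row_head, int *col, float *val,
                                   int *next)
{
    assert(num_new >= 0);
    for (int e = 0; e < num_new; ++e) {
        int slot = rm_loc[e];          /* storage location freed earlier */
        int i = new_row[e];
        col[slot] = new_col[e];
        val[slot] = new_val[e];
        next[slot] = row_head[i];      /* prepend to the row's linked list */
        row_head[i] = slot;
    }
}
```

Because the list is unsorted, prepending is enough; no existing entry has to move.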
magma_int_t magma_cparilut_create_collinkedlist | ( | magma_c_matrix | A, |
magma_c_matrix * | B, | ||
magma_queue_t | queue ) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements.
[in] | A | magma_c_matrix Matrix to transpose. |
[out] | B | magma_c_matrix* Transposed matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_candidates | ( | magma_c_matrix | L0, |
magma_c_matrix | U0, | ||
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | L_new, | ||
magma_c_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_c_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_c_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_c_matrix Current lower triangular factor. |
[in] | U | magma_c_matrix Current upper triangular factor. |
[in,out] | L_new | magma_c_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_c_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_candidates_gpu | ( | magma_c_matrix | L0, |
magma_c_matrix | U0, | ||
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix * | L_new, | ||
magma_c_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors.
Nonzero ILU residuals are possible
1. where A is nonzero but L and U have no nonzero entry,
2. where the product L*U has fill-in but the location is not included in L or U.
We assume that the incomplete factors are exact for the elements included in the current pattern.
This is the GPU implementation of the candidate search.
Four GPU kernels are used: the first is a dry run assessing the memory need, the second computes the candidate locations, the third eliminates duplicate entries, and the fourth ensures the elements in a row are sorted by increasing column index.
[in] | L0 | magma_c_matrix tril(ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_c_matrix triu(ILU(0) ) pattern of original system matrix. |
[in] | L | magma_c_matrix Current lower triangular factor. |
[in] | U | magma_c_matrix Current upper triangular factor. |
[in,out] | L_new | magma_c_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_c_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
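The two cases in which a nonzero residual R = A - L*U is possible can be sketched on small dense 0/1 patterns. This is an illustration only: the actual routine works on sparse storage on the GPU and returns separate candidate lists for L and U, while the hypothetical find_candidates below marks all candidate locations in one dense array.

```c
#include <assert.h>

/* Dense n x n 0/1 patterns in row-major order.
 * cand[i*n+j] is set to 1 when location (i,j) may carry a nonzero residual
 * R = A - L*U but is not part of the current pattern of L or U.
 * Returns the number of candidate locations. */
static int find_candidates(int n, const unsigned char *A,
                           const unsigned char *L, const unsigned char *U,
                           unsigned char *cand)
{
    assert(n > 0);
    int count = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            int in_pattern = L[i * n + j] || U[i * n + j];

            /* Case 1: A itself has a nonzero at (i,j). */
            int residual_possible = A[i * n + j];

            /* Case 2: the product L*U has structural fill-in at (i,j). */
            for (int k = 0; k < n && !residual_possible; ++k)
                residual_possible = L[i * n + k] && U[k * n + j];

            cand[i * n + j] = (unsigned char)(residual_possible && !in_pattern);
            count += cand[i * n + j];
        }
    }
    return count;
}
```

For an arrow-shaped matrix with a full first row and column, the ILU(0) pattern yields fill-in candidates away from the arrow, since every pair of off-diagonal entries couples through the first row and column.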
magma_int_t magma_cparict_candidates | ( | magma_c_matrix | L0, |
magma_c_matrix | L, | ||
magma_c_matrix | LT, | ||
magma_c_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_c_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_c_matrix Current lower triangular factor. |
[in] | LT | magma_c_matrix Transpose of the lower triangular factor. |
[in,out] | L_new | magma_c_matrix* List of candidates for L in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_candidates_semilinked | ( | magma_c_matrix | L0, |
magma_c_matrix | U0, | ||
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix | UT, | ||
magma_c_matrix * | L_new, | ||
magma_c_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_c_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_c_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_c_matrix Current lower triangular factor. |
[in] | U | magma_c_matrix Current upper triangular factor transposed. |
[in] | UT | magma_c_matrix Current upper triangular factor - col-pointer and col-list. |
[in,out] | L_new | magma_c_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_c_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_candidates_linkedlist | ( | magma_c_matrix | L0, |
magma_c_matrix | U0, | ||
magma_c_matrix | L, | ||
magma_c_matrix | U, | ||
magma_c_matrix | UR, | ||
magma_c_matrix * | L_new, | ||
magma_c_matrix * | U_new, | ||
magma_queue_t | queue ) |
magma_int_t magma_cparilut_rm_thrs | ( | float * | thrs, |
magma_int_t * | num_rm, | ||
magma_c_matrix * | LU, | ||
magma_c_matrix * | LU_new, | ||
magma_index_t * | rm_loc, | ||
magma_queue_t | queue ) |
This routine removes matrix entries from the structure that are smaller than the threshold.
It only counts the deleted elements and does not save their locations.
[out] | thrs | float* Threshold for removing elements. |
[out] | num_rm | magma_int_t* Number of Elements that have been removed. |
[in,out] | LU | magma_c_matrix* Current ILU approximation where the identified smallest components are deleted. |
[in,out] | LUC | magma_c_matrix* Corresponding col-list. |
[in,out] | LU_new | magma_c_matrix* List of candidates in COO format. |
[out] | rm_loc | magma_index_t* List containing the locations of the elements deleted. |
[in] | queue | magma_queue_t Queue to execute in. |
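Threshold-based removal can be sketched as an in-place compaction of COO arrays. The sketch assumes real float values standing in for the magnitudes of magmaFloatComplex entries and plain arrays instead of magma_c_matrix; remove_below_threshold and abs_val are hypothetical names.

```c
#include <assert.h>

/* Magnitude of a real value; stands in for the complex modulus here. */
static float abs_val(float x) { return x < 0.0f ? -x : x; }

/* Remove every entry whose magnitude is below thrs, compacting the three
 * COO arrays in place. *nnz is updated to the number of entries kept.
 * Returns the number of removed entries. */
static int remove_below_threshold(int *nnz, int *row, int *col, float *val,
                                  float thrs)
{
    assert(*nnz >= 0 && thrs >= 0.0f);
    int kept = 0;
    for (int e = 0; e < *nnz; ++e) {
        if (abs_val(val[e]) >= thrs) {
            row[kept] = row[e];
            col[kept] = col[e];
            val[kept] = val[e];
            ++kept;
        }
    }
    int removed = *nnz - kept;
    *nnz = kept;
    return removed;
}
```

Like the documented routine, the sketch reports only how many elements were deleted, not where they were.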
magma_int_t magma_cparilut_count | ( | magma_c_matrix | L, |
magma_int_t * | num, | ||
magma_queue_t | queue ) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format.
[in] | L | magma_c_matrix Matrix in Magma_CSRLIST format. |
[out] | num | magma_int_t* Number of elements counted. |
[in] | queue | magma_queue_t Queue to execute in. |
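Counting elements in a linked-list layout amounts to walking every row list. The sketch below assumes one head index per row and -1 as the end-of-list marker; both conventions and the name count_list_entries are assumptions for illustration and may differ from the Magma_CSRLIST layout.

```c
#include <assert.h>

/* Count the stored elements by following each row's linked list.
 * row_head[i] is the first slot of row i (or -1 for an empty row),
 * next[slot] is the following slot in the same row (or -1). */
static int count_list_entries(int num_rows, const int *row_head,
                              const int *next)
{
    assert(num_rows >= 0);
    int count = 0;
    for (int i = 0; i < num_rows; ++i)
        for (int e = row_head[i]; e != -1; e = next[e])
            ++count;
    return count;
}
```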
magma_int_t magma_cparilut_randlist | ( | magma_c_matrix * | LU, |
magma_queue_t | queue ) |
magma_int_t magma_cparilut_select_candidates_L | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_c_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_c_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_c_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_select_candidates_U | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_c_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_c_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_c_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cparilut_preselect | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (or a zero element if a row has no candidate).
[in] | order | magma_int_t order==0: lower triangular; order==1: upper triangular |
[in] | A | magma_c_matrix* Matrix containing the candidates with their residuals. |
[out] | oneA | magma_c_matrix* Matrix containing only the largest element of each row. |
[in] | queue | magma_queue_t Queue to execute in. |
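The per-row selection can be sketched on plain CSR arrays. Real float values stand in for the residual magnitudes, the order parameter is ignored, and a column index of -1 marks a row without candidates; these choices and the name select_largest_per_row are assumptions for illustration.

```c
#include <assert.h>

/* Magnitude of a real value; stands in for the complex modulus here. */
static float abs_val(float x) { return x < 0.0f ? -x : x; }

/* For every row of a CSR matrix keep only the entry of largest magnitude.
 * out_col[i] / out_val[i] receive that entry; a row without candidates
 * gets out_col[i] = -1 and a zero value. */
static void select_largest_per_row(int num_rows, const int *rowptr,
                                   const int *colidx, const float *val,
                                   int *out_col, float *out_val)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; ++i) {
        out_col[i] = -1;
        out_val[i] = 0.0f;
        for (int e = rowptr[i]; e < rowptr[i + 1]; ++e) {
            if (out_col[i] == -1 || abs_val(val[e]) > abs_val(out_val[i])) {
                out_col[i] = colidx[e];
                out_val[i] = val[e];
            }
        }
    }
}
```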
magma_int_t magma_cpreselect_gpu | ( | magma_int_t | order, |
magma_c_matrix * | A, | ||
magma_c_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (or a zero element if a row has no candidate).
[in] | order | magma_int_t order==0: lower triangular; order==1: upper triangular |
[in] | A | magma_c_matrix* Matrix containing the candidates with their residuals. |
[out] | oneA | magma_c_matrix* Matrix containing only the largest element of each row. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_csampleselect | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaFloatComplex * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaFloatComplex* array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
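What the computed threshold means can be shown with a simple CPU sketch. Note the substitution: a full sort via the C library qsort replaces the sampling-based selection, and real float values stand in for complex magnitudes. The name select_threshold and the convention that the threshold equals the magnitude at sorted position subset_size are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Magnitude of a real value; stands in for the complex modulus here. */
static float abs_val(float x) { return x < 0.0f ? -x : x; }

static int cmp_float(const void *a, const void *b)
{
    float x = *(const float *)a;
    float y = *(const float *)b;
    return (x > y) - (x < y);
}

/* Compute a threshold such that, for distinct magnitudes, exactly
 * subset_size elements of val have a magnitude below *thrs.
 * Returns 0 on success and -1 if the temporary buffer cannot be allocated. */
static int select_threshold(int total_size, int subset_size,
                            const float *val, float *thrs)
{
    assert(total_size > 0 && subset_size >= 0 && subset_size < total_size);
    float *mag = malloc((size_t)total_size * sizeof *mag);
    if (mag == NULL)
        return -1;
    for (int i = 0; i < total_size; ++i)
        mag[i] = abs_val(val[i]);
    qsort(mag, (size_t)total_size, sizeof *mag, cmp_float);
    *thrs = mag[subset_size];
    free(mag);
    return 0;
}
```

Sorting costs O(n log n); a selection algorithm avoids ordering the whole array, which is the motivation for a dedicated routine.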
magma_int_t magma_csampleselect_approx | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaFloatComplex * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaFloatComplex* array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_csampleselect_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaFloatComplex * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP).
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaFloatComplex* array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_csampleselect_approx_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaFloatComplex * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP).
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaFloatComplex* array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmprepare_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix | L, | ||
magma_c_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaFloatComplex * | trisystems, | ||
magmaFloatComplex * | rhs, | ||
magma_queue_t | queue ) |
Takes a sparse matrix and generates three arrays: one containing the sizes of the different systems, one containing the indices of the locations in the sparse matrix where the data comes from and goes back to, and one containing all the sparse triangular systems.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_c_matrix Matrix in CSR format |
[in] | LC | magma_c_matrix same matrix, also CSR, but col-major |
[in,out] | sizes | magma_index_t* Array containing the sizes of the different systems. |
[in,out] | locations | magma_index_t* Array indicating the locations. |
[in,out] | trisystems | magmaFloatComplex* trisystems |
[in,out] | rhs | magmaFloatComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmtrisolve_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix | L, | ||
magma_c_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaFloatComplex * | trisystems, | ||
magmaFloatComplex * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_c_matrix Matrix in CSR format |
[in] | LC | magma_c_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_index_t* Array containing the sizes of the different systems. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaFloatComplex* trisystems |
[out] | rhs | magmaFloatComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
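A batch of small triangular solves can be sketched as follows. The storage layout is purely an assumption for illustration: each system occupies a dense stride x stride row-major block of trisystems, each right-hand side occupies stride entries of rhs, and sizes holds the actual dimension of every system. Only the lower triangular, non-unit-diagonal case with real float values is shown, and the name batched_lower_trisolve is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Solve T_s * y_s = rhs_s for every system s by forward substitution.
 * The solution overwrites the leading sizes[s] entries of each rhs block;
 * padding entries beyond sizes[s] are left untouched. */
static void batched_lower_trisolve(int batch, int stride, const int *sizes,
                                   const float *trisystems, float *rhs)
{
    assert(batch >= 0 && stride > 0);
    for (int s = 0; s < batch; ++s) {
        const float *T = trisystems + (size_t)s * stride * stride;
        float *y = rhs + (size_t)s * stride;
        assert(sizes[s] >= 0 && sizes[s] <= stride);
        for (int i = 0; i < sizes[s]; ++i) {
            float sum = y[i];
            for (int j = 0; j < i; ++j)
                sum -= T[i * stride + j] * y[j];
            y[i] = sum / T[i * stride + i];
        }
    }
}
```

The systems are independent of each other, so the outer loop over s is the part that a batched GPU kernel would distribute across thread blocks.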
magma_int_t magma_cmbackinsert_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaFloatComplex * | trisystems, | ||
magmaFloatComplex * | rhs, | ||
magma_queue_t | queue ) |
Inserts the values into the preconditioner matrix.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in,out] | M | magma_c_matrix* SPAI preconditioner CSR col-major |
[out] | sizes | magma_index_t* Array containing the sizes of the different systems. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaFloatComplex* trisystems |
[out] | rhs | magmaFloatComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmprepare_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix | L, | ||
magma_c_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaFloatComplex * | trisystems, | ||
magmaFloatComplex * | rhs, | ||
magma_queue_t | queue ) |
magma_int_t magma_cmtrisolve_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix | L, | ||
magma_c_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaFloatComplex * | trisystems, | ||
magmaFloatComplex * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_c_matrix Matrix in CSR format |
[in] | LC | magma_c_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_index_t* Array containing the sizes of the different systems. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaFloatComplex* trisystems |
[out] | rhs | magmaFloatComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmbackinsert_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaFloatComplex * | trisystems, | ||
magmaFloatComplex * | rhs, | ||
magma_queue_t | queue ) |
magma_int_t magma_ciluisaisetup_lower | ( | magma_c_matrix | L, |
magma_c_matrix | S, | ||
magma_c_matrix * | ISAIL, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the lower triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | L | magma_c_matrix lower triangular factor |
[in] | S | magma_c_matrix pattern for the ISAI preconditioner for L |
[out] | ISAIL | magma_c_matrix* ISAI preconditioner for L |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ciluisaisetup_upper | ( | magma_c_matrix | U, |
magma_c_matrix | S, | ||
magma_c_matrix * | ISAIU, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the upper triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | U | magma_c_matrix upper triangular factor |
[in] | S | magma_c_matrix pattern for the ISAI preconditioner for U |
[out] | ISAIU | magma_c_matrix* ISAI preconditioner for U |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cicisaisetup | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_cisai_l | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
[in] | b | magma_c_matrix input RHS b |
[in,out] | x | magma_c_matrix* solution x |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cisai_r | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
[in] | b | magma_c_matrix input RHS b |
[in,out] | x | magma_c_matrix* solution x |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cisai_l_t | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_c_matrix input RHS b |
[in,out] | x | magma_c_matrix* solution x |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cisai_r_t | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_c_matrix input RHS b |
[in,out] | x | magma_c_matrix* solution x |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmiluisai_sizecheck | ( | magma_c_matrix | A, |
magma_index_t | batchsize, | ||
magma_index_t * | maxsize, | ||
magma_queue_t | queue ) |
magma_int_t magma_cgeisai_maxblock | ( | magma_c_matrix | L, |
magma_c_matrix * | MT, | ||
magma_queue_t | queue ) |
This routine maximizes the pattern for the ISAI preconditioner.
Precisely, it computes L, L^2, L^3, L^4, L^5 and then selects the columns of M_L such that the number of nonzeros per column stays within the implementation-specific limit (32).
The input is the original matrix (row-major). The output is already col-major.
[in,out] | L | magma_c_matrix Incomplete factor. |
[in,out] | MT | magma_c_matrix* SPAI preconditioner structure, CSR col-major. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cisai_generator_regs | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_c_matrix | L, | ||
magma_c_matrix * | M, | ||
magma_queue_t | queue ) |
This routine is designed to combine all kernels into one.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_c_matrix triangular factor for which the ISAI matrix is computed. Col-Major CSR storage. |
[in,out] | M | magma_c_matrix* SPAI preconditioner CSR col-major |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmsupernodal | ( | magma_int_t * | max_bs, |
magma_c_matrix | A, | ||
magma_c_matrix * | S, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with block-size bs.
[in,out] | max_bs | magma_int_t* Size of the largest diagonal block. |
[in] | A | magma_c_matrix System matrix. |
[in,out] | S | magma_c_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmvarsizeblockstruct | ( | magma_int_t | n, |
magma_int_t * | bs, | ||
magma_int_t | bsl, | ||
magma_uplo_t | uplotype, | ||
magma_c_matrix * | A, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with variable block-size.
[in] | n | magma_int_t Size of the matrix. |
[in] | bs | magma_int_t* Vector containing the size of the diagonal blocks. |
[in] | bsl | magma_int_t Size of the vector containing the block sizes. |
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in,out] | A | magma_c_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
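A block-diagonal pattern with variable block sizes can be sketched in plain CSR index arrays. The sketch generates full diagonal blocks and ignores uplotype; the caller is assumed to provide rowptr with n+1 entries and colidx with room for the sum of the squared block sizes. The name block_diag_pattern is hypothetical.

```c
#include <assert.h>

/* Fill rowptr/colidx with the CSR pattern of a block-diagonal matrix whose
 * diagonal blocks have the sizes bs[0..bsl-1]. Returns the number of
 * nonzeros in the pattern. */
static int block_diag_pattern(int n, const int *bs, int bsl,
                              int *rowptr, int *colidx)
{
    int total = 0;
    for (int b = 0; b < bsl; ++b)
        total += bs[b];
    assert(total == n);   /* block sizes must add up to the matrix size */

    int nnz = 0;
    int row = 0;
    rowptr[0] = 0;
    for (int b = 0; b < bsl; ++b) {
        int start = row;                  /* first row/column of this block */
        for (int i = 0; i < bs[b]; ++i) {
            for (int j = 0; j < bs[b]; ++j)
                colidx[nnz++] = start + j;
            rowptr[++row] = nnz;
        }
    }
    return nnz;
}
```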
magma_int_t magma_ctfqmr_unrolled | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a complex matrix.
This is a GPU implementation of the transpose-free Quasi-Minimal Residual method (TFQMR).
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgstab_merge2 | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. It differs from magma_cbicgstab in that it uses specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgstab_merge3 | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. It differs from magma_cbicgstab in that it uses specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cjacobidomainoverlap | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix A.
This is a GPU implementation of the Jacobi method allowing for domain overlap.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbaiter | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_c_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | precond_par | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbaiter_overlap | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_c_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
It uses restricted additive Schwarz overlap in the top-down direction.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | precond_par | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cftjacobicontractions | ( | magma_c_matrix | xkm2, |
magma_c_matrix | xkm1, | ||
magma_c_matrix | xk, | ||
magma_c_matrix * | z, | ||
magma_c_matrix * | c, | ||
magma_queue_t | queue ) |
Computes the contraction coefficients c_i:
c_i = z_i^{k-1} / z_i^{k}
= | x_i^{k-1} - x_i^{k-2} | / | x_i^{k} - x_i^{k-1} |
[in] | xkm2 | magma_c_matrix vector x^{k-2} |
[in] | xkm1 | magma_c_matrix vector x^{k-1} |
[in] | xk | magma_c_matrix vector x^{k} |
[out] | z | magma_c_matrix* ratio |
[out] | c | magma_c_matrix* contraction coefficients |
[in] | queue | magma_queue_t Queue to execute in. |
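The contraction formula above can be illustrated with a minimal host-side sketch. It is not MAGMA's kernel: the function name `contraction_coefficients` is hypothetical, it uses real `double` arrays instead of `magma_c_matrix`, it assumes that `z` holds the latest difference |x^{k} - x^{k-1}|, and it sets c_i to 0 when that difference is zero, which is a convention chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without depending on libm. */
static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Hypothetical reference routine (real arithmetic, host memory):
 *   z[i] = |xk[i] - xkm1[i]|
 *   c[i] = |xkm1[i] - xkm2[i]| / z[i]
 * c[i] is set to 0 when z[i] == 0; that convention is an assumption. */
static void contraction_coefficients(size_t n, const double *xkm2,
                                     const double *xkm1, const double *xk,
                                     double *z, double *c)
{
    assert(xkm2 && xkm1 && xk && z && c);
    for (size_t i = 0; i < n; ++i) {
        double prev = abs_d(xkm1[i] - xkm2[i]);  /* previous step size */
        z[i] = abs_d(xk[i] - xkm1[i]);           /* latest step size */
        c[i] = (z[i] > 0.0) ? prev / z[i] : 0.0;
    }
}
```

With this definition, c_i > 1 means that component i moved less in the latest iteration than in the one before.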
magma_int_t magma_cftjacobiupdatecheck | ( | float | delta, |
magma_c_matrix * | xold, | ||
magma_c_matrix * | xnew, | ||
magma_c_matrix * | zprev, | ||
magma_c_matrix | c, | ||
magma_int_t * | flag_t, | ||
magma_int_t * | flag_fp, | ||
magma_queue_t | queue ) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper.
[in] | delta | float threshold |
[in,out] | xold | magma_c_matrix* vector xold |
[in,out] | xnew | magma_c_matrix* vector xnew |
[in,out] | zprev | magma_c_matrix* vector z = | x_k-1 - x_k | |
[in] | c | magma_c_matrix contraction coefficients |
[in,out] | flag_t | magma_int_t threshold condition |
[in,out] | flag_fp | magma_int_t false positive condition |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_citerref | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_c_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix A.
This is a GPU implementation of the Iterative Refinement method. The inner solver is passed via the preconditioner argument.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix RHS b |
[in,out] | x | magma_c_matrix* solution approximation |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in,out] | precond_par | magma_c_preconditioner* inner solver |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cjacobiiter_sys | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix | d, | ||
magma_c_matrix | t, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k.
This routine takes the system matrix A and the RHS b as input.
[in] | A | magma_c_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_c_matrix input RHS b |
[in] | d | magma_c_matrix input matrix diagonal elements diag(A) |
[in] | t | magma_c_matrix temporary vector |
[in,out] | x | magma_c_matrix* iteration vector x |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
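The update x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k can be written out as a short host-side sketch. The names `jacobi_sweep` and `jacobi` are hypothetical, the matrix is a real CSR matrix given as raw arrays rather than a `magma_c_matrix`, and the diagonal is read from the matrix itself instead of a separate vector `d`; none of this is taken from the MAGMA sources.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi sweep for an n x n CSR matrix:
 *   x_new[i] = (b[i] - sum_{j != i} a_ij * x[j]) / a_ii */
static void jacobi_sweep(size_t n, const size_t *rowptr, const size_t *colidx,
                         const double *val, const double *b,
                         const double *x, double *x_new)
{
    for (size_t i = 0; i < n; ++i) {
        double diag = 0.0;
        double sum = b[i];
        for (size_t p = rowptr[i]; p < rowptr[i + 1]; ++p) {
            if (colidx[p] == i)
                diag = val[p];                   /* entry of D */
            else
                sum -= val[p] * x[colidx[p]];    /* entry of L+U */
        }
        assert(diag != 0.0);                     /* Jacobi needs a nonzero diagonal */
        x_new[i] = sum / diag;
    }
}

/* Runs `iters` sweeps; t is a workspace of length n (like the temporary vector t). */
static void jacobi(size_t n, const size_t *rowptr, const size_t *colidx,
                   const double *val, const double *b,
                   double *x, double *t, int iters)
{
    for (int k = 0; k < iters; ++k) {
        jacobi_sweep(n, rowptr, colidx, val, b, x, t);
        for (size_t i = 0; i < n; ++i)
            x[i] = t[i];
    }
}
```

For the 2-by-2 system with rows (2, 1) and (1, 2) and right-hand side (3, 3), the iterates approach the solution (1, 1), with the error halving in every sweep.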
magma_int_t magma_cftjacobi | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_c_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k.
This routine takes the system matrix A and the RHS b as input. This is the fault-tolerant version of Jacobi according to ScaLA'15.
[in] | A | magma_c_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_c_matrix input RHS b |
[in,out] | x | magma_c_matrix* iteration vector x |
[in,out] | solver_par | magma_c_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cilut_saad | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_cilut_saad_apply | ( | magma_c_matrix | b, |
magma_c_matrix * | x, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_ccustomilusetup | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete LU preconditioner.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ccustomicsetup | ( | magma_c_matrix | A, |
magma_c_matrix | b, | ||
magma_c_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete Cholesky preconditioner.
[in] | A | magma_c_matrix input matrix A |
[in] | b | magma_c_matrix input RHS b |
[in,out] | precond | magma_c_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbajac_csr | ( | magma_int_t | localiters, |
magma_c_matrix | D, | ||
magma_c_matrix | R, | ||
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration performing s local Jacobi-updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | D | magma_c_matrix input matrix with diagonal blocks |
[in] | R | magma_c_matrix input matrix with non-diagonal parts |
[in] | b | magma_c_matrix RHS |
[in,out] | x | magma_c_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbajac_csr_overlap | ( | magma_int_t | localiters, |
magma_int_t | matrices, | ||
magma_int_t | overlap, | ||
magma_c_matrix * | D, | ||
magma_c_matrix * | R, | ||
magma_c_matrix | b, | ||
magma_c_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing s local Jacobi-updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | matrices | magma_int_t number of sub-matrices |
[in] | overlap | magma_int_t size of the overlap |
[in] | D | magma_c_matrix* set of matrices with diagonal blocks |
[in] | R | magma_c_matrix* set of matrices with non-diagonal parts |
[in] | b | magma_c_matrix RHS |
[in,out] | x | magma_c_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cmlumerge | ( | magma_c_matrix | L, |
magma_c_matrix | U, | ||
magma_c_matrix * | A, | ||
magma_queue_t | queue ) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts.
[in] | L | magma_c_matrix input strictly lower triangular matrix L |
[in] | U | magma_c_matrix input upper triangular matrix U |
[out] | A | magma_c_matrix* output matrix |
[in] | queue | magma_queue_t Queue to execute in. |
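The merge described above is straightforward in CSR when each row keeps its column indices sorted: the strictly lower entries of a row all precede the upper entries, so each output row is the row of L followed by the row of U. The sketch below assumes real values, raw CSR arrays, and caller-allocated output of size nnz(L) + nnz(U); the name `lu_merge` is hypothetical and the code is not the MAGMA implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Merge a strictly lower triangular L and an upper triangular U (both n x n,
 * CSR, sorted column indices) into A. Output arrays are caller-allocated:
 * arow has n + 1 entries, acol and aval have nnz(L) + nnz(U) entries. */
static void lu_merge(size_t n,
                     const size_t *lrow, const size_t *lcol, const double *lval,
                     const size_t *urow, const size_t *ucol, const double *uval,
                     size_t *arow, size_t *acol, double *aval)
{
    size_t nnz = 0;
    for (size_t i = 0; i < n; ++i) {
        arow[i] = nnz;
        for (size_t p = lrow[i]; p < lrow[i + 1]; ++p) {
            assert(lcol[p] < i);              /* strictly below the diagonal */
            acol[nnz] = lcol[p];
            aval[nnz] = lval[p];
            ++nnz;
        }
        for (size_t p = urow[i]; p < urow[i + 1]; ++p) {
            assert(ucol[p] >= i);             /* on or above the diagonal */
            acol[nnz] = ucol[p];
            aval[nnz] = uval[p];
            ++nnz;
        }
    }
    arow[n] = nnz;
}
```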
magma_int_t magma_cgeaxpy | ( | magmaFloatComplex | alpha, |
magma_c_matrix | X, | ||
magmaFloatComplex | beta, | ||
magma_c_matrix * | Y, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * X + beta * Y on the GPU.
The input format is magma_c_matrix. It can handle both, dense matrix (vector block) and CSR matrices. For the latter, it interfaces the cuSPARSE library.
[in] | alpha | magmaFloatComplex scalar multiplier. |
[in] | X | magma_c_matrix input matrix X. |
[in] | beta | magmaFloatComplex scalar multiplier. |
[in,out] | Y | magma_c_matrix* input/output matrix Y. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgecsrreimsplit | ( | magma_c_matrix | A, |
magma_c_matrix * | ReA, | ||
magma_c_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in CSR format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_c_matrix input matrix A. |
[out] | ReA | magma_c_matrix* output matrix containing real contributions. |
[out] | ImA | magma_c_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
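As a rough illustration of the split, the sketch below separates the value array of a complex matrix into its real and imaginary parts on the host. The `cfloat` struct and the function `reim_split` are hypothetical stand-ins, and the parts are written into plain `float` arrays, whereas the routine documented here returns two `magma_c_matrix` objects. The sketch also assumes that both outputs keep the sparsity pattern of A, so only the values need to be touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a single-precision complex number. */
typedef struct {
    float re;
    float im;
} cfloat;

/* Split nnz complex values into separate real and imaginary arrays.
 * Row pointers and column indices would be copied unchanged (assumption). */
static void reim_split(size_t nnz, const cfloat *val, float *re, float *im)
{
    assert(val && re && im);
    for (size_t i = 0; i < nnz; ++i) {
        re[i] = val[i].re;
        im[i] = val[i].im;
    }
}
```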
magma_int_t magma_cgedensereimsplit | ( | magma_c_matrix | A, |
magma_c_matrix * | ReA, | ||
magma_c_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in DENSE format located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_c_matrix input matrix A. |
[out] | ReA | magma_c_matrix* output matrix containing real contributions. |
[out] | ImA | magma_c_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cgecsr5mv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | p, | ||
magmaFloatComplex | alpha, | ||
magma_int_t | sigma, | ||
magma_int_t | bit_y_offset, | ||
magma_int_t | bit_scansum_offset, | ||
magma_int_t | num_packet, | ||
magmaUIndex_ptr | dtile_ptr, | ||
magmaUIndex_ptr | dtile_desc, | ||
magmaIndex_ptr | dtile_desc_offset_ptr, | ||
magmaIndex_ptr | dtile_desc_offset, | ||
magmaFloatComplex_ptr | dcalibrator, | ||
magma_int_t | tail_tile_start, | ||
magmaFloatComplex_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaFloatComplex_ptr | dx, | ||
magmaFloatComplex | beta, | ||
magmaFloatComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR5 (val (tile-wise column-major), row_pointer, col (tile-wise column-major), tile_pointer, tile_desc).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | p | magma_int_t number of tiles in A |
[in] | alpha | magmaFloatComplex scalar multiplier |
[in] | sigma | magma_int_t sigma in A in CSR5 |
[in] | bit_y_offset | magma_int_t bit_y_offset in A in CSR5 |
[in] | bit_scansum_offset | magma_int_t bit_scansum_offset in A in CSR5 |
[in] | num_packet | magma_int_t num_packet in A in CSR5 |
[in] | dtile_ptr | magmaUIndex_ptr tilepointer of A in CSR5 |
[in] | dtile_desc | magmaUIndex_ptr tiledescriptor of A in CSR5 |
[in] | dtile_desc_offset_ptr | magmaIndex_ptr tiledescriptor_offsetpointer of A in CSR5 |
[in] | dtile_desc_offset | magmaIndex_ptr tiledescriptor_offset of A in CSR5 |
[in] | dcalibrator | magmaFloatComplex_ptr calibrator of A in CSR5 |
[in] | tail_tile_start | magma_int_t start of the last tile in A |
[in] | dval | magmaFloatComplex_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaFloatComplex_ptr input vector x |
[in] | beta | magmaFloatComplex scalar multiplier |
[in,out] | dy | magmaFloatComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ccopyscale | ( | magma_int_t | n, |
magma_int_t | k, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | v, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the correction term of the pipelined GMRES according to P. Ghysels, then scales and copies the new search direction.
Returns the vector v = r / ( skp[k] - (sum_i=1^k skp[i]^2) ).
[in] | n | int length of v_i |
[in] | k | int |
[in] | r | magmaFloatComplex_ptr vector of length n |
[in] | v | magmaFloatComplex_ptr vector of length n |
[in] | skp | magmaFloatComplex_ptr array of parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_scnrm2scale | ( | magma_int_t | m, |
magmaFloatComplex_ptr | dr, | ||
magma_int_t | lddr, | ||
magmaFloatComplex * | drnorm, | ||
magma_queue_t | queue ) |
magma_int_t magma_cjacobispmvupdate_bw | ( | magma_int_t | maxiter, |
magma_c_matrix | A, | ||
magma_c_matrix | t, | ||
magma_c_matrix | b, | ||
magma_c_matrix | d, | ||
magma_c_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - A*x).
This kernel processes the thread blocks in reversed order.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | A | magma_c_matrix system matrix |
[in] | t | magma_c_matrix workspace |
[in] | b | magma_c_matrix RHS b |
[in] | d | magma_c_matrix vector with diagonal entries |
[out] | x | magma_c_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cjacobispmvupdateselect | ( | magma_int_t | maxiter, |
magma_int_t | num_updates, | ||
magma_index_t * | indices, | ||
magma_c_matrix | A, | ||
magma_c_matrix | t, | ||
magma_c_matrix | b, | ||
magma_c_matrix | d, | ||
magma_c_matrix | tmp, | ||
magma_c_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - A*x).
This kernel allows for overlapping domains: the indices-array contains the locations that are updated. Locations may be repeated to simulate overlapping domains.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | num_updates | magma_int_t number of updates - length of the indices array |
[in] | indices | magma_index_t* indices, which entries of x to update |
[in] | A | magma_c_matrix system matrix |
[in] | t | magma_c_matrix workspace |
[in] | b | magma_c_matrix RHS b |
[in] | d | magma_c_matrix vector with diagonal entries |
[in] | tmp | magma_c_matrix workspace |
[out] | x | magma_c_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
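The role of the indices array can be made concrete with a small host-side sketch. It follows the documented formula literally, treating `d` as the elementwise multiplier of the residual; whether the GPU kernel stores the diagonal or its inverse in `d` is not stated here, so that reading is an assumption. The function name `jacobi_update_select` is hypothetical, the matrix is a real CSR matrix in raw arrays, and updates are applied one after another, whereas the GPU kernel runs them in parallel.

```c
#include <assert.h>
#include <stddef.h>

/* For each listed index i, apply x[i] += d[i] * (b[i] - (A*x)[i]).
 * Indices may repeat; a repeated index simply receives another update
 * that already sees the values written by earlier updates. */
static void jacobi_update_select(size_t num_updates, const size_t *indices,
                                 const size_t *rowptr, const size_t *colidx,
                                 const double *val, const double *b,
                                 const double *d, double *x)
{
    assert(indices && rowptr && colidx && val && b && d && x);
    for (size_t u = 0; u < num_updates; ++u) {
        size_t i = indices[u];
        double ax = 0.0;                       /* row i of A times x */
        for (size_t p = rowptr[i]; p < rowptr[i + 1]; ++p)
            ax += val[p] * x[colidx[p]];
        x[i] += d[i] * (b[i] - ax);
    }
}
```

Listing index 0 twice, as in the sequence 0, 1, 0, lets the first component be corrected again after the second component has changed, which is how repeated locations imitate an overlapping domain.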
magma_int_t magma_cmergeblockkrylov | ( | magma_int_t | num_rows, |
magma_int_t | num_cols, | ||
magmaFloatComplex_ptr | alpha, | ||
magmaFloatComplex_ptr | p, | ||
magmaFloatComplex_ptr | x, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
v = y / rho, y = y / rho, w = wt / psi, z = z / psi
[in] | num_rows | magma_int_t dimension m |
[in] | num_cols | magma_int_t dimension n |
[in] | alpha | magmaFloatComplex_ptr matrix containing all SKP |
[in] | p | magmaFloatComplex_ptr search directions |
[in,out] | x | magmaFloatComplex_ptr approximation vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge1 | ( | magma_int_t | n, |
magmaFloatComplex_ptr | skp, | ||
magmaFloatComplex_ptr | v, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | p, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
p = beta*p, p = p - omega*beta*v, p = p + r
-> p = r + beta * ( p - omega * v )
[in] | n | int dimension n |
[in] | skp | magmaFloatComplex_ptr set of scalar parameters |
[in] | v | magmaFloatComplex_ptr input vector v |
[in] | r | magmaFloatComplex_ptr input vector r |
[in,out] | p | magmaFloatComplex_ptr input/output vector p |
[in] | queue | magma_queue_t queue to execute in. |
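The equivalence between the three separate vector operations and the merged form can be checked with a short host-side sketch. Both function names are hypothetical, the arithmetic is real instead of complex, and `beta` and `omega` are passed explicitly because the layout of the `skp` parameter array is not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Three passes over memory: p = beta*p, p = p - omega*beta*v, p = p + r. */
static void update_p_steps(size_t n, double beta, double omega,
                           const double *v, const double *r, double *p)
{
    assert(v && r && p);
    for (size_t i = 0; i < n; ++i) p[i] = beta * p[i];
    for (size_t i = 0; i < n; ++i) p[i] = p[i] - omega * beta * v[i];
    for (size_t i = 0; i < n; ++i) p[i] = p[i] + r[i];
}

/* One pass over memory: p = r + beta * (p - omega * v). */
static void update_p_merged(size_t n, double beta, double omega,
                            const double *v, const double *r, double *p)
{
    assert(v && r && p);
    for (size_t i = 0; i < n; ++i)
        p[i] = r[i] + beta * (p[i] - omega * v[i]);
}
```

The merged form reads and writes each entry of p once instead of three times, which is the motivation for fusing the operations into a single kernel.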
magma_int_t magma_cbicgmerge2 | ( | magma_int_t | n, |
magmaFloatComplex_ptr | skp, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | v, | ||
magmaFloatComplex_ptr | s, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
s = r, s = s - alpha*v
-> s = r - alpha * v
[in] | n | int dimension n |
[in] | skp | magmaFloatComplex_ptr set of scalar parameters |
[in] | r | magmaFloatComplex_ptr input vector r |
[in] | v | magmaFloatComplex_ptr input vector v |
[out] | s | magmaFloatComplex_ptr output vector s |
[in] | queue | magma_queue_t queue to execute in. |
magma_int_t magma_cbicgmerge3 | ( | magma_int_t | n, |
magmaFloatComplex_ptr | skp, | ||
magmaFloatComplex_ptr | p, | ||
magmaFloatComplex_ptr | s, | ||
magmaFloatComplex_ptr | t, | ||
magmaFloatComplex_ptr | x, | ||
magmaFloatComplex_ptr | r, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
x = x + alpha*p, x = x + omega*s, r = s, r = r - omega*t
-> x = x + alpha * p + omega * s
-> r = s - omega * t
[in] | n | int dimension n |
[in] | skp | magmaFloatComplex_ptr set of scalar parameters |
[in] | p | magmaFloatComplex_ptr input p |
[in] | s | magmaFloatComplex_ptr input s |
[in] | t | magmaFloatComplex_ptr input t |
[in,out] | x | magmaFloatComplex_ptr input/output x |
[in,out] | r | magmaFloatComplex_ptr input/output r |
[in] | queue | magma_queue_t Queue to execute in. |
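A host-side sketch of the merged update of x and r is shown below. The name `update_x_r_merged` is hypothetical, the arithmetic is real instead of complex, and `alpha` and `omega` are passed as plain scalars because the position of these values inside `skp` is not documented here.

```c
#include <assert.h>
#include <stddef.h>

/* Single pass computing both results:
 *   x = x + alpha * p + omega * s
 *   r = s - omega * t */
static void update_x_r_merged(size_t n, double alpha, double omega,
                              const double *p, const double *s,
                              const double *t, double *x, double *r)
{
    assert(p && s && t && x && r);
    for (size_t i = 0; i < n; ++i) {
        x[i] = x[i] + alpha * p[i] + omega * s[i];
        r[i] = s[i] - omega * t[i];
    }
}
```

Both outputs share the read of s[i], so fusing the two updates saves one pass over s compared with running them separately.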
magma_int_t magma_cbicgmerge4 | ( | magma_int_t | type, |
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU.
[in] | type | int kernel type |
[in,out] | skp | magmaFloatComplex_ptr vector with parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge_spmv1 | ( | magma_c_matrix | A, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | dp, | ||
magmaFloatComplex_ptr | dr, | ||
magmaFloatComplex_ptr | dv, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the first SpmV using CSR with the dot product and the computation of alpha.
[in] | A | magma_c_matrix system matrix |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | dp | magmaFloatComplex_ptr input vector p |
[in] | dr | magmaFloatComplex_ptr input vector r |
[out] | dv | magmaFloatComplex_ptr output vector v |
[in,out] | skp | magmaFloatComplex_ptr array for parameters ( skp[0]=alpha ) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge_spmv2 | ( | magma_c_matrix | A, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | ds, | ||
magmaFloatComplex_ptr | dt, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | A | magma_c_matrix input matrix |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | ds | magmaFloatComplex_ptr input vector s |
[out] | dt | magmaFloatComplex_ptr output vector t |
[in,out] | skp | magmaFloatComplex_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbicgmerge_xrbeta | ( | magma_int_t | n, |
magmaFloatComplex_ptr | d1, | ||
magmaFloatComplex_ptr | d2, | ||
magmaFloatComplex_ptr | rr, | ||
magmaFloatComplex_ptr | r, | ||
magmaFloatComplex_ptr | p, | ||
magmaFloatComplex_ptr | s, | ||
magmaFloatComplex_ptr | t, | ||
magmaFloatComplex_ptr | x, | ||
magmaFloatComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | n | int dimension n |
[in] | d1 | magmaFloatComplex_ptr temporary vector |
[in] | d2 | magmaFloatComplex_ptr temporary vector |
[in] | rr | magmaFloatComplex_ptr input vector rr |
[in,out] | r | magmaFloatComplex_ptr input/output vector r |
[in] | p | magmaFloatComplex_ptr input vector p |
[in] | s | magmaFloatComplex_ptr input vector s |
[in] | t | magmaFloatComplex_ptr input vector t |
[out] | x | magmaFloatComplex_ptr output vector x |
[in] | skp | magmaFloatComplex_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_cbcsrswp | ( | magma_int_t | n, |
magma_int_t | size_b, | ||
magma_int_t * | ipiv, | ||
magmaFloatComplex_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_cbcsrtrsv | ( | magma_uplo_t | uplo, |
magma_int_t | r_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magmaFloatComplex_ptr | dA, | ||
magma_index_t * | blockinfo, | ||
magmaFloatComplex_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_cbcsrvalcpy | ( | magma_int_t | size_b, |
magma_int_t | num_blocks, | ||
magma_int_t | num_zero_blocks, | ||
magmaFloatComplex_ptr * | dAval, | ||
magmaFloatComplex_ptr * | dBval, | ||
magmaFloatComplex_ptr * | dBval2, | ||
magma_queue_t | queue ) |
magma_int_t magma_cbcsrluegemm | ( | magma_int_t | size_b, |
magma_int_t | num_block_rows, | ||
magma_int_t | kblocks, | ||
magmaFloatComplex_ptr * | dA, | ||
magmaFloatComplex_ptr * | dB, | ||
magmaFloatComplex_ptr * | dC, | ||
magma_queue_t | queue ) |
magma_int_t magma_cbcsrlupivloc | ( | magma_int_t | size_b, |
magma_int_t | kblocks, | ||
magmaFloatComplex_ptr * | dA, | ||
magma_int_t * | ipiv, | ||
magma_queue_t | queue ) |
magma_int_t magma_cbcsrblockinfo5 | ( | magma_int_t | lustep, |
magma_int_t | num_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magma_index_t * | blockinfo, | ||
magmaFloatComplex_ptr | dval, | ||
magmaFloatComplex_ptr * | AII, | ||
magma_queue_t | queue ) |
magma_int_t magma_cthrsholdselect | ( | magma_int_t | sampling, |
magma_int_t | total_size, | ||
magma_int_t | subset_size, | ||
magmaFloatComplex * | val, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_dorderstatistics | ( | double * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
double * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array.
[in,out] | val | double* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | double* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
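One common way to find the k-th smallest or largest element while modifying the target array is quickselect. The sketch below is an illustration of that technique, not the MAGMA implementation: the name `order_statistic` is hypothetical, k is taken as a 0-based rank (an assumption, since the indexing convention is not stated here), values are compared directly, and the result is returned instead of written through `element`.

```c
#include <assert.h>
#include <stddef.h>

static void swap_d(double *x, double *y) { double t = *x; *x = *y; *y = t; }

/* Returns the k-th smallest (r == 0) or k-th largest (r == 1) element,
 * with k counted from 0. The array val is permuted in the process. */
static double order_statistic(double *val, size_t length, size_t k, int r)
{
    assert(val && length > 0 && k < length);
    size_t target = (r == 1) ? length - 1 - k : k;   /* rank in ascending order */
    size_t lo = 0, hi = length - 1;
    while (lo < hi) {
        double pivot = val[hi];
        size_t store = lo;
        /* Lomuto partition: move everything below the pivot to the front. */
        for (size_t i = lo; i < hi; ++i)
            if (val[i] < pivot)
                swap_d(&val[i], &val[store++]);
        swap_d(&val[store], &val[hi]);               /* pivot is now in place */
        if (target == store)
            break;
        if (target < store)
            hi = store - 1;
        else
            lo = store + 1;
    }
    return val[target];
}
```

Quickselect runs in linear time on average but can degrade to quadratic time for unlucky pivots.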
magma_int_t magma_dorderstatistics_inc | ( | double * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | inc, | ||
magma_int_t | r, | ||
double * | element, | ||
magma_queue_t | queue ) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc.
[in,out] | val | double* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | inc | magma_int_t Stepsize in the approximation. |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | double* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmorderstatistics | ( | double * | val, |
magma_index_t * | col, | ||
magma_index_t * | row, | ||
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
double * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front.
The related arrays col and row are also reordered.
[in,out] | val | double* Target array, will be modified during operation. |
[in,out] | col | magma_index_t* Target array, will be modified during operation. |
[in,out] | row | magma_index_t* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | double* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dpartition | ( | double * | a, |
magma_int_t | size, | ||
magma_int_t | pivot, | ||
magma_queue_t | queue ) |
magma_int_t magma_dmedian5 | ( | double * | a, |
magma_queue_t | queue ) |
magma_int_t magma_dselect | ( | double * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com January 22, 2004 http://moonflare.com/code/select/select.pdf
[in,out] | a | double* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
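The worst-case linear selection algorithm of Blum, Floyd, Pratt, Rivest, and Tarjan picks its pivot as the median of the medians of groups of five. The sketch below shows that idea on the host; it is written independently of the referenced code, the names `insertion_sort` and `select_kth` are hypothetical, k is taken as a 0-based rank (an assumption), and the value is returned directly.

```c
#include <assert.h>
#include <stddef.h>

static void swap_d(double *x, double *y) { double t = *x; *x = *y; *y = t; }

static void insertion_sort(double *a, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        double key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) { a[j] = a[j - 1]; --j; }
        a[j] = key;
    }
}

/* Returns the k-th smallest element (k counted from 0) of a[0..n-1].
 * The array is permuted in the process. */
static double select_kth(double *a, size_t n, size_t k)
{
    assert(a && n > 0 && k < n);
    for (;;) {
        if (n <= 5) { insertion_sort(a, n); return a[k]; }
        /* Sort each group of five and move its median to the front. */
        size_t m = 0;
        for (size_t i = 0; i < n; i += 5) {
            size_t len = (n - i < 5) ? n - i : 5;
            insertion_sort(a + i, len);
            swap_d(&a[m++], &a[i + len / 2]);
        }
        /* Pivot: median of the m group medians, found recursively. */
        double pivot = select_kth(a, m, m / 2);
        /* Three-way partition into [< pivot | == pivot | > pivot]. */
        size_t lt = 0, i = 0, gt = n;
        while (i < gt) {
            if (a[i] < pivot)      swap_d(&a[lt++], &a[i++]);
            else if (a[i] > pivot) swap_d(&a[i], &a[--gt]);
            else                   ++i;
        }
        if (k < lt)      n = lt;                 /* answer lies in the low part */
        else if (k < gt) return pivot;           /* answer equals the pivot */
        else { a += gt; k -= gt; n -= gt; }      /* answer lies in the high part */
    }
}
```

Because the pivot is guaranteed to have a constant fraction of the elements on each side, the search range shrinks geometrically, which gives the linear worst-case bound.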
magma_int_t magma_dselectrandom | ( | double * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com January 22, 2004 http://moonflare.com/code/select/select.pdf
[in,out] | a | double* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ddomainoverlap | ( | magma_index_t | num_rows, |
magma_int_t * | num_indices, | ||
magma_index_t * | rowptr, | ||
magma_index_t * | colidx, | ||
magma_index_t * | x, | ||
magma_queue_t | queue ) |
Generates the update list.
[in] | x | magma_index_t* array to sort |
[in] | num_rows | magma_int_t number of rows in matrix |
[out] | num_indices | magma_int_t* number of indices in array |
[in] | rowptr | magma_index_t* rowpointer of matrix |
[in] | colidx | magma_index_t* colindices of matrix |
[in] | x | magma_index_t* array containing indices for domain overlap |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dvspread | ( | magma_d_matrix * | x, |
const char * | filename, | ||
magma_queue_t | queue ) |
Reads in a sparse vector-block stored in COO format.
[out] | x | magma_d_matrix * vector to read in |
[in] | filename | char* file where vector is stored |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ddiameter | ( | magma_d_matrix * | A, |
magma_queue_t | queue ) |
Computes the diameter of a sparse matrix and stores the value in diameter.
[in,out] | A | magma_d_matrix* sparse matrix |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilusetup | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the ILU preconditioner via the iterative ILU iteration.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilu_gpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of the ParILU.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilu_cpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of the ParILU.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparic_gpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of the ParIC.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparic_cpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of the ParIC.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparicsetup | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the IC preconditioner via the iterative IC iteration.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparicupdate | ( | magma_d_matrix | A, |
magma_d_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_d_matrix input matrix A, current target system |
[in] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dapplyiteric_l | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the left triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_d_matrix RHS |
[out] | x | magma_d_matrix* vector to precondition |
[in] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
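The idea of performing a triangular solve via Jacobi iterations can be sketched on a small dense matrix. The helper `jacobi_lower_solve` below is hypothetical and not part of MAGMA; it assumes a dense row-major lower triangular matrix with a nonzero diagonal and a zero starting vector. Whether MAGMA's routine uses the same starting vector or iteration count is not stated in this reference.

```c
#include <stddef.h>

/* Hypothetical helper, not part of MAGMA: approximates the solution of
   L x = b for a lower triangular n-by-n matrix stored densely in row-major
   order, using `iters` Jacobi iterations starting from x = 0.
   `tmp` is caller-provided scratch space of length n. */
void jacobi_lower_solve(size_t n, const double *L, const double *b,
                        double *x, double *tmp, int iters)
{
    for (size_t i = 0; i < n; i++) x[i] = 0.0;
    for (int it = 0; it < iters; it++) {
        for (size_t i = 0; i < n; i++) {
            double s = b[i];
            /* Apply the strictly lower part to the OLD iterate only. */
            for (size_t j = 0; j < i; j++) s -= L[i * n + j] * x[j];
            tmp[i] = s / L[i * n + i];
        }
        for (size_t i = 0; i < n; i++) x[i] = tmp[i];
    }
}
```

Because the iteration matrix of a triangular system is nilpotent, n Jacobi iterations reproduce the exact solution in exact arithmetic; fewer iterations give an approximation, which is the trade-off a Jacobi-based preconditioner application makes.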
magma_int_t magma_dapplyiteric_r | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the right triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_d_matrix RHS |
[out] | x | magma_d_matrix* vector to precondition |
[in] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilu_csr | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete LU factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the system matrix is COO, the lower triangular factor L is stored in CSR, the upper triangular factor U is transposed, then also stored in CSR (equivalent to CSC format for the non-transposed U). Every component of L and U is handled by one thread.
[in] | A | magma_d_matrix input matrix A determining initial guess & processing order |
[in,out] | L | magma_d_matrix input/output matrix L containing the lower triangular factor |
[in,out] | U | magma_d_matrix input/output matrix U containing the upper triangular factor |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dpariluupdate | ( | magma_d_matrix | A, |
magma_d_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative ILU sweeps, using the previous factorization as the initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_d_matrix input matrix A, current target system |
[in] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparic_csr | ( | magma_d_matrix | A, |
magma_d_matrix | A_CSR, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete Cholesky (IC) factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.
[in] | A | magma_d_matrix input matrix A - initial guess (lower triangular) |
[in,out] | A_CSR | magma_d_matrix input/output matrix containing the IC approximation |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dnonlinres | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | LU, | ||
real_Double_t * | res, | ||
magma_queue_t | queue ) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_d_matrix input sparse matrix in CSR |
[in] | L | magma_d_matrix input sparse matrix in CSR |
[in] | U | magma_d_matrix input sparse matrix in CSR |
[out] | LU | magma_d_matrix* output sparse matrix A-LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[in] | queue | magma_queue_t Queue to execute in. |
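As a minimal sketch of what "the difference and its Frobenius norm" means, the following dense helper forms R = A - LU and accumulates the sum of squared entries. The name `residual_frobenius_sq` is hypothetical and not a MAGMA function; it works on dense row-major arrays rather than CSR and returns the squared norm so that no math library is needed.

```c
#include <stddef.h>

/* Hypothetical dense helper, not MAGMA code: writes R = A - L*U (all
   n-by-n, row-major) and returns the squared Frobenius norm of R.
   The Frobenius norm itself is the square root of the returned value. */
double residual_frobenius_sq(size_t n, const double *A, const double *L,
                             const double *U, double *R)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double lu = 0.0;
            for (size_t k = 0; k < n; k++) lu += L[i * n + k] * U[k * n + j];
            R[i * n + j] = A[i * n + j] - lu;
            sum += R[i * n + j] * R[i * n + j];
        }
    }
    return sum;
}
```

For an exact factorization the returned value is zero; for an incomplete one it measures how far LU is from A.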
magma_int_t magma_dilures | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_d_matrix input sparse matrix in CSR |
[in] | L | magma_d_matrix input sparse matrix in CSR |
[in] | U | magma_d_matrix input sparse matrix in CSR |
[out] | LU | magma_d_matrix* output sparse matrix A-LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[out] | nonlinres | real_Double_t* Frobenius norm of the nonlinear residual |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dicres | ( | magma_d_matrix | A, |
magma_d_matrix | C, | ||
magma_d_matrix | CT, | ||
magma_d_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_d_matrix input sparse matrix in CSR |
[in] | C | magma_d_matrix input sparse matrix in CSR |
[in] | CT | magma_d_matrix input sparse matrix in CSR |
[out] | LU | magma_d_matrix* output sparse matrix A-CC^T in CSR |
[out] | res | real_Double_t* IC residual |
[out] | nonlinres | real_Double_t* nonlinear residual |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dinitguess | ( | magma_d_matrix | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Computes an initial guess for the ParILU/ParIC.
[in] | A | magma_d_matrix sparse matrix in CSR |
[out] | L | magma_d_matrix* sparse matrix in CSR |
[out] | U | magma_d_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dinitrecursiveLU | ( | magma_d_matrix | A, |
magma_d_matrix * | B, | ||
magma_queue_t | queue ) |
Following the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A containing the approximate factors (L and U as well), computes a matrix with one higher level of fill-in, inserts the original approximation as the initial guess, and provides the factors L and U, also filled with the scaled initial guess.
[in] | A | magma_d_matrix sparse matrix in CSR |
[out] | B | magma_d_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmLdiagadd | ( | magma_d_matrix * | L, |
magma_queue_t | queue ) |
Checks whether a lower triangular matrix is strictly lower triangular and, in the negative case, adds a unit diagonal.
It does this in-place.
[in,out] | L | magma_d_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmatrix_cup | ( | magma_d_matrix | A, |
magma_d_matrix | B, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
[in] | A | magma_d_matrix Input matrix 1. |
[in] | B | magma_d_matrix Input matrix 2. |
[out] | U | magma_d_matrix* Not a real matrix, but the list of all matrix entries included in either A or B. No duplicates. |
[in] | queue | magma_queue_t Queue to execute in. |
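The union of two entry lists, with A taking precedence where both have a value, can be sketched as follows. The `coo_entry` type and `coo_union` function are hypothetical stand-ins for illustration; MAGMA stores such lists in a `magma_d_matrix`, and the quadratic search here is chosen for brevity, not speed.

```c
#include <stddef.h>

/* Hypothetical entry type for this sketch; MAGMA stores such lists inside
   a magma_d_matrix instead. */
typedef struct { int row; int col; double val; } coo_entry;

/* Writes the union of the entry lists a and b to out (capacity at least
   na + nb) and returns the number of entries written. Where both lists
   contain the same (row, col) location, the value from a is kept.
   Each input list is assumed to be free of duplicates. */
size_t coo_union(const coo_entry *a, size_t na,
                 const coo_entry *b, size_t nb, coo_entry *out)
{
    size_t n = 0;
    for (size_t i = 0; i < na; i++) out[n++] = a[i];
    for (size_t j = 0; j < nb; j++) {
        int found = 0;
        /* Skip entries of b whose location already occurs in a. */
        for (size_t i = 0; i < na && !found; i++)
            found = (a[i].row == b[j].row && a[i].col == b[j].col);
        if (!found) out[n++] = b[j];
    }
    return n;
}
```

The output contains every location exactly once, matching the "no duplicates" property described for U.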
magma_int_t magma_dmatrix_cup_gpu | ( | magma_d_matrix | A, |
magma_d_matrix | B, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
This is the GPU version of the operation.
[in] | A | magma_d_matrix Input matrix 1. |
[in] | B | magma_d_matrix Input matrix 2. |
[out] | U | magma_d_matrix* \(U = A \cup B\). If both matrices have a nonzero value in the same location, the value of A is used. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmatrix_cap | ( | magma_d_matrix | A, |
magma_d_matrix | B, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\).
The values in U are all ones.
[in] | A | magma_d_matrix Input matrix 1. |
[in] | B | magma_d_matrix Input matrix 2. |
[out] | U | magma_d_matrix* Not a real matrix, but the list of all matrix entries included in both A and B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmatrix_negcap | ( | magma_d_matrix | A, |
magma_d_matrix | B, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of A but not of B.
\(U = A \setminus B\). The values of A are preserved.
[in] | A | magma_d_matrix Element part of this. |
[in,out] | B | magma_d_matrix Not part of this. |
[out] | U | magma_d_matrix* Not a real matrix, but the list of all matrix entries included in A not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmatrix_tril_negcap | ( | magma_d_matrix | A, |
magma_d_matrix | B, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of tril(A) but not of B.
\(U = \mathrm{tril}(A) \setminus B\). The values of A are preserved.
[in] | A | magma_d_matrix Element part of this. |
[in,out] | B | magma_d_matrix Not part of this. |
[out] | U | magma_d_matrix* Not a real matrix, but the list of all matrix entries included in tril(A) not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmatrix_triu_negcap | ( | magma_d_matrix | A, |
magma_d_matrix | B, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being part of triu(A) but not of B.
\(U = \mathrm{triu}(A) \setminus B\). The values of A are preserved.
[in] | A | magma_d_matrix Element part of this. |
[in] | B | magma_d_matrix Not part of this. |
[out] | U | magma_d_matrix* Not a real matrix, but the list of all matrix entries included in triu(A) not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmatrix_abssum | ( | magma_d_matrix | A, |
double * | sum, | ||
magma_queue_t | queue ) |
Computes the sum of the absolute values in a matrix.
[in] | A | magma_d_matrix Element list/matrix. |
[out] | sum | double* Sum of the absolute values. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_thrsrm | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Removes every element whose absolute value is smaller than or equal to thrs (or larger than or equal to thrs, depending on order) from the matrix and compacts the result.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded. |
[in,out] | A | magma_d_matrix* Matrix where elements are removed. |
[in] | thrs | double* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
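Threshold removal followed by compaction can be sketched on plain CSR arrays. The function `csr_threshold_remove` is hypothetical and not MAGMA code. How entries exactly equal to the threshold are treated is an assumption of this sketch (they are kept); the description above suggests MAGMA may discard them.

```c
#include <stddef.h>

/* Hypothetical helper, not MAGMA code. Removes entries from a CSR matrix
   (rowptr of length num_rows + 1 with rowptr[0] == 0, col and val of
   length rowptr[num_rows]) and compacts the arrays in place.
   Assumption of this sketch: order == 1 drops entries with |val| < thrs,
   order == 0 drops entries with |val| > thrs. */
void csr_threshold_remove(int order, size_t num_rows, int *rowptr,
                          int *col, double *val, double thrs)
{
    int out = 0;              /* next free slot in the compacted arrays */
    int start = rowptr[0];    /* start of the current row in the old layout */
    for (size_t i = 0; i < num_rows; i++) {
        int end = rowptr[i + 1];
        rowptr[i] = out;
        for (int k = start; k < end; k++) {
            double mag = val[k] < 0.0 ? -val[k] : val[k];
            int drop = order == 1 ? (mag < thrs) : (mag > thrs);
            if (!drop) { col[out] = col[k]; val[out] = val[k]; out++; }
        }
        start = end;
    }
    rowptr[num_rows] = out;
}
```

Since the write position never overtakes the read position, the compaction needs no extra storage.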
magma_int_t magma_dparilut_thrsrm_semilinked | ( | magma_d_matrix * | U, |
magma_d_matrix * | US, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Removes any element with absolute value smaller than thrs from the matrix.
It only uses the linked list and skips the "removed" elements.
[in,out] | A | magma_d_matrix* Matrix where elements are removed. |
[in] | thrs | double* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_rmselected | ( | magma_d_matrix | R, |
magma_d_matrix * | A, | ||
magma_queue_t | queue ) |
Removes a selected list of elements from the matrix.
[in] | R | magma_d_matrix Matrix containing elements to be removed. |
[in,out] | A | magma_d_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_selectoneperrow | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order == 1: largest; order == 0: smallest. |
[in] | A | magma_d_matrix* Matrix where elements are removed. |
[out] | oneA | magma_d_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
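Selecting one candidate per row can be sketched on CSR arrays as follows. The function `csr_select_one_per_row` is a hypothetical illustration, not MAGMA code. Comparing by magnitude and representing "a zero element" as column 0 with value 0.0 are assumptions made for this sketch.

```c
#include <stddef.h>

/* Hypothetical helper, not MAGMA code. For every row of a CSR matrix,
   picks the entry of largest magnitude (order == 1) or smallest magnitude
   (order == 0). Row i of the result is (out_col[i], out_val[i]); rows
   without any candidate get column 0 and value 0.0, which is a choice
   made for this sketch. */
void csr_select_one_per_row(int order, size_t num_rows, const int *rowptr,
                            const int *col, const double *val,
                            int *out_col, double *out_val)
{
    for (size_t i = 0; i < num_rows; i++) {
        int best = -1;            /* index of the current winner, -1 if none */
        double best_mag = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++) {
            double mag = val[k] < 0.0 ? -val[k] : val[k];
            int better = order == 1 ? (mag > best_mag) : (mag < best_mag);
            if (best < 0 || better) { best = k; best_mag = mag; }
        }
        out_col[i] = best < 0 ? 0 : col[best];
        out_val[i] = best < 0 ? 0.0 : val[best];
    }
}
```

The result has exactly one slot per row, so it can be stored with a trivial row pointer.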
magma_int_t magma_dparilut_selecttwoperrow | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order == 1: largest; order == 0: smallest. |
[in] | A | magma_d_matrix* Matrix where elements are removed. |
[out] | oneA | magma_d_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_selectoneperrowthrs_lower | ( | magma_d_matrix | L, |
magma_d_matrix | U, | ||
magma_d_matrix * | A, | ||
double | rtol, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_d_matrix Current lower triangular factor. |
[in] | U | magma_d_matrix Current upper triangular factor. |
[in] | A | magma_d_matrix* All residuals in L. |
[in] | rtol | threshold rtol |
[out] | oneA | magma_d_matrix* At most one element per row, if larger than the threshold. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_selectoneperrowthrs_upper | ( | magma_d_matrix | L, |
magma_d_matrix | U, | ||
magma_d_matrix * | A, | ||
double | rtol, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_d_matrix Current lower triangular factor. |
[in] | U | magma_d_matrix Current upper triangular factor. |
[in] | A | magma_d_matrix* All residuals in L. |
[in] | rtol | threshold rtol |
[out] | oneA | magma_d_matrix* At most one element per row, if larger than the threshold. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_selectonepercol | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_transpose_select_one | ( | magma_d_matrix | A, |
magma_d_matrix * | B, | ||
magma_queue_t | queue ) |
This is a special routine with very limited scope.
For a set of fill-in candidates in row-major format, it transposes a submatrix, i.e. the submatrix consisting of the largest element in every column. This function is only useful for delta<=1.
[in] | A | magma_d_matrix Matrix to transpose. |
[out] | B | magma_d_matrix* Transposed matrix containing only largest elements in each col. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_insert_LU | ( | magma_int_t | num_rm, |
magma_index_t * | rm_loc, | ||
magma_index_t * | rm_loc2, | ||
magma_d_matrix * | LU_new, | ||
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_set_thrs | ( | magma_int_t | num_rm, |
magma_d_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_set_approx_thrs | ( | magma_int_t | num_rm, |
magma_d_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_set_thrs_randomselect | ( | magma_int_t | num_rm, |
magma_d_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_set_thrs_randomselect_approx | ( | magma_int_t | num_rm, |
magma_d_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_set_thrs_randomselect_factors | ( | magma_int_t | num_rm, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_set_exact_thrs | ( | magma_int_t | num_rm, |
magma_d_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
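An exact threshold of this kind can be obtained by sorting the magnitudes. The sketch below uses a full `qsort`, which is a simple reference approach and not necessarily what MAGMA does internally; the function `exact_threshold` and its return convention are hypothetical.

```c
#include <stdlib.h>

/* Ascending comparison of two doubles for qsort. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, not MAGMA code. Stores in *thrs the magnitude of
   the num_rm-th smallest (order == 0) or num_rm-th largest (order == 1)
   entry of val, counting from 1. Returns 0 on success, -1 on bad
   arguments or allocation failure. */
int exact_threshold(const double *val, size_t n, size_t num_rm,
                    int order, double *thrs)
{
    if (n == 0 || num_rm == 0 || num_rm > n) return -1;
    double *mag = malloc(n * sizeof *mag);
    if (!mag) return -1;
    for (size_t i = 0; i < n; i++) mag[i] = val[i] < 0.0 ? -val[i] : val[i];
    qsort(mag, n, sizeof *mag, cmp_double);
    *thrs = order == 1 ? mag[n - num_rm] : mag[num_rm - 1];
    free(mag);
    return 0;
}
```

Sorting costs O(n log n); selection-based or sampled approaches trade exactness or code simplicity for lower cost.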
magma_int_t magma_dparilut_set_approx_thrs_inc | ( | magma_int_t | num_rm, |
magma_d_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_LU_approx_thrs | ( | magma_int_t | num_rm, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_reorder | ( | magma_d_matrix * | LU, |
magma_queue_t | queue ) |
This routine reorders the matrix (in place) for easier access.
[in] | LU | magma_d_matrix* Current ILU approximation. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparict_sweep | ( | magma_d_matrix * | A, |
magma_d_matrix * | LU, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_zero | ( | magma_d_matrix * | A, |
magma_queue_t | queue ) |
magma_int_t magma_dparilu_sweep | ( | magma_d_matrix | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep.
Input and output arrays are identical.
[in] | A | magma_d_matrix System matrix in COO. |
[in] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
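A sweep whose input and output arrays are identical can be illustrated with a sequential, dense stand-in for the fixed-point update of Chow and Patel. The function `parilu_sweep_inplace` is hypothetical and not MAGMA code: it visits entries in row-major order on one thread, whereas an asynchronous GPU sweep has no fixed update order.

```c
#include <stddef.h>

/* Hypothetical dense sketch, not MAGMA code. Performs one in-place
   fixed-point sweep over the nonzero pattern of A (n-by-n, row-major).
   L is lower triangular with unit diagonal, U is upper triangular.
   Updated values are visible to later updates in the same sweep. */
void parilu_sweep_inplace(size_t n, const double *A, double *L, double *U)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (A[i * n + j] == 0.0) continue;   /* stay on the pattern of A */
            size_t kmax = i < j ? i : j;
            double s = A[i * n + j];
            for (size_t k = 0; k < kmax; k++) s -= L[i * n + k] * U[k * n + j];
            if (i > j) L[i * n + j] = s / U[j * n + j];   /* entry of L */
            else       U[i * n + j] = s;                  /* entry of U */
        }
    }
}
```

With this particular visiting order, a single sweep on a tridiagonal matrix already reproduces the exact ILU(0) factors; other orders generally need several sweeps.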
magma_int_t magma_dparilu_sweep_sync | ( | magma_d_matrix | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_d_matrix System matrix in COO. |
[in] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
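A synchronized sweep, in which every update reads only the previous iterate, can be sketched with separate input and output arrays. The function `parilu_sweep_sync` is a hypothetical dense illustration, not MAGMA code.

```c
#include <stddef.h>

/* Hypothetical dense sketch, not MAGMA code. Performs one synchronized
   (Jacobi-style) fixed-point sweep over the nonzero pattern of A
   (n-by-n, row-major). Reads only L and U, writes only Lnew and Unew. */
void parilu_sweep_sync(size_t n, const double *A,
                       const double *L, const double *U,
                       double *Lnew, double *Unew)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            size_t idx = i * n + j;
            Lnew[idx] = L[idx];                  /* carry over by default */
            Unew[idx] = U[idx];
            if (A[idx] == 0.0) continue;         /* stay on the pattern of A */
            size_t kmax = i < j ? i : j;
            double s = A[idx];
            for (size_t k = 0; k < kmax; k++) s -= L[i * n + k] * U[k * n + j];
            if (i > j) Lnew[idx] = s / U[j * n + j];
            else       Unew[idx] = s;
        }
    }
}
```

Because no update sees another update from the same sweep, the result does not depend on the order in which entries are processed, at the cost of typically needing more sweeps than an in-place variant.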
magma_int_t magma_dparic_sweep | ( | magma_d_matrix | A, |
magma_d_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep (symmetric case).
Input and output arrays are identical.
[in] | A | magma_d_matrix System matrix in COO. |
[in] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparic_sweep_sync | ( | magma_d_matrix | A, |
magma_d_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep (symmetric case).
Input and output are different arrays.
[in] | A | magma_d_matrix System matrix in COO. |
[in] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparict_sweep_sync | ( | magma_d_matrix * | A, |
magma_d_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_d_matrix* System matrix. |
[in] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[out] | L_new | magma_d_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR. |
[out] | U_new | magma_d_matrix* Current approximation for the upper triangular factor. The format is unsorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_sweep_sync | ( | magma_d_matrix * | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference to the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can be different. The pattern determining which elements are iterated is hence the pattern of L and U, not A.
This is the CPU version of the synchronous ParILUT sweep.
[in] | A | magma_d_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored as well. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored as well. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_sweep_gpu | ( | magma_d_matrix * | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference to the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can be different. The pattern determining which elements are iterated is hence the pattern of L and U, not A. L has a unit diagonal.
This is the GPU version of the asynchronous ParILUT sweep.
[in] | A | magma_d_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored as well. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored as well. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_residuals_gpu | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_d_matrix System matrix. The format is sorted CSR. |
[in] | L | magma_d_matrix Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored as well. |
[in] | U | magma_d_matrix Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored as well. |
[in,out] | R | magma_d_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dthrsholdrm_gpu | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest-magnitude elements from the rest.
[in] | order | magma_int_t dummy variable for now. |
[in,out] | A | magma_d_matrix* input/output matrix where elements are removed |
[out] | thrs | double* computed threshold |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dget_row_ptr | ( | const magma_int_t | num_rows, |
magma_int_t * | nnz, | ||
const magma_index_t * | rowidx, | ||
magma_index_t * | rowptr, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_align_residuals | ( | magma_d_matrix | L, |
magma_d_matrix | U, | ||
magma_d_matrix * | Lnew, | ||
magma_d_matrix * | Unew, | ||
magma_queue_t | queue ) |
This function scales the residuals of a lower triangular factor L with the diagonal of U.
The intention is to generate a good initial guess for inserting the elements.
[in] | L | magma_d_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_d_matrix Current approximation for the upper triangular factor. The format is sorted CSC. |
[in] | hL | magma_d_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | hU | magma_d_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_preselect_scale | ( | magma_d_matrix * | L, |
magma_d_matrix * | oneL, | ||
magma_d_matrix * | U, | ||
magma_d_matrix * | oneU, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_d_matrix* Matrix where elements are removed. |
[in] | U | magma_d_matrix* Matrix where elements are removed. |
[out] | oneL | magma_d_matrix* Matrix where elements are removed. |
[out] | oneU | magma_d_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_thrsrm_U | ( | magma_int_t | order, |
magma_d_matrix | L, | ||
magma_d_matrix * | A, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Removes every element whose absolute value is smaller than or equal to thrs (or larger than or equal to thrs, depending on order) from the matrix and compacts the result.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded. |
[in,out] | A | magma_d_matrix* Matrix where elements are removed. |
[in] | thrs | double* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_residuals | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_d_matrix System matrix A. |
[in] | L | magma_d_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_d_matrix Current approximation for the upper triangular factor. The format is sorted CSR. |
[in,out] | R | magma_d_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
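Computing the ILU residual only at the locations listed in R can be sketched as follows. The function `ilu_residuals_on_pattern` is hypothetical and not MAGMA code; it keeps A, L, and U dense for brevity and represents R by separate COO index and value arrays.

```c
#include <stddef.h>

/* Hypothetical helper, not MAGMA code. A, L, and U are dense n-by-n
   row-major arrays. (r_row[e], r_col[e]) for e < nnz is the COO sparsity
   pattern of R. On return, r_val[e] holds the entry of A - L*U at that
   location. */
void ilu_residuals_on_pattern(size_t n, const double *A, const double *L,
                              const double *U, size_t nnz,
                              const int *r_row, const int *r_col,
                              double *r_val)
{
    for (size_t e = 0; e < nnz; e++) {
        size_t i = (size_t)r_row[e], j = (size_t)r_col[e];
        double s = A[i * n + j];
        for (size_t k = 0; k < n; k++) s -= L[i * n + k] * U[k * n + j];
        r_val[e] = s;
    }
}
```

Each location is independent of the others, which is what makes this computation a natural fit for one thread per entry.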
magma_int_t magma_dparilut_residuals_transpose | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_d_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_residuals_semilinked | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | US, | ||
magma_d_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_d_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_sweep_semilinked | ( | magma_d_matrix * | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | US, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_d_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_sweep_list | ( | magma_d_matrix * | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_d_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_d_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_residuals_list | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_d_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_sweep_linkedlist | ( | magma_d_matrix * | A, |
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_queue_t | queue ) |
This function performs a ParILU sweep.
[in,out] | A | magma_d_matrix* Current ILU approximation The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | L | magma_d_matrix* Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_residuals_linkedlist | ( | magma_d_matrix | A, |
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_d_matrix Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_d_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_colmajor | ( | magma_d_matrix | A, |
magma_d_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_d_matrix The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | AC | magma_d_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. |
[in] | queue | magma_queue_t Queue to execute in. |
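The idea of a column pointer plus a linked list along the columns can be sketched as follows. This is an illustration under stated assumptions, not MAGMA's data layout: the helper name `build_column_lists`, the use of plain `int` indices, and the `-1` end-of-list sentinel are all hypothetical. The values and the row and column indices stay where they are; only the two index arrays are built.

```c
/* Build per-column linked lists for a matrix whose entries are stored
 * row by row (hypothetical helper, not part of MAGMA).
 *   col[e]      column index of entry e, for e = 0 .. nnz-1
 *   colhead[j]  receives the first entry of column j, or -1 if it is empty
 *   next[e]     receives the next entry in the same column, or -1 at the end
 * Entries are visited back to front so that each list ends up ordered by
 * increasing entry index, i.e. by increasing row. */
static void build_column_lists(int ncols, int nnz, const int *col,
                               int *colhead, int *next)
{
    for (int j = 0; j < ncols; ++j)
        colhead[j] = -1;
    for (int e = nnz - 1; e >= 0; --e) {
        next[e] = colhead[col[e]];
        colhead[col[e]] = e;
    }
}
```

A column is then traversed with `for (int e = colhead[j]; e != -1; e = next[e])`, which touches the same storage as the row-major view.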
magma_int_t magma_dparilut_colmajorup | ( | magma_d_matrix | A, |
magma_d_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_d_matrix The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | AC | magma_d_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. Already allocated. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparict | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete Cholesky preconditioner.
The strategy is interleaving a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this new algorithm has fine-grained parallelism, and we show that it can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2016.
This function requires OpenMP, and is only available if OpenMP is activated.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparict_cpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern. It is the variant for SPD systems.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete LU preconditioner.
The strategy is interleaving a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this new algorithm has fine-grained parallelism, and we show that it can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz constant)
precond.rtol : how many candidates are added to the sparsity pattern
    1.0 : one per row
    < 1.0 : a fraction of those
    > 1.0 : all candidates
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_cpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_gpu | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_gpu_nodp | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
This routine is the same as magma_dparilut_gpu(), except that it uses no dynamic parallelism.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_insert | ( | magma_int_t * | num_rmL, |
magma_int_t * | num_rmU, | ||
magma_index_t * | rm_locL, | ||
magma_index_t * | rm_locU, | ||
magma_d_matrix * | L_new, | ||
magma_d_matrix * | U_new, | ||
magma_d_matrix * | L, | ||
magma_d_matrix * | U, | ||
magma_d_matrix * | UR, | ||
magma_queue_t | queue ) |
Inserts a new element into the (empty) place for the iterative dynamic ILU.
[in] | num_rmL | magma_int_t Number of Elements that are replaced in L. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced in U. |
[in] | rm_locL | magma_index_t* List containing the locations of the deleted elements. |
[in] | rm_locU | magma_index_t* List containing the locations of the deleted elements. |
[in] | L_new | magma_d_matrix Elements that will be inserted in L stored in COO format (unsorted). |
[in] | U_new | magma_d_matrix Elements that will be inserted in U stored in COO format (unsorted). |
[in,out] | L | magma_d_matrix* matrix where new elements are inserted. The format is unsorted CSR, list is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_d_matrix* matrix where new elements are inserted. Row-major. The format is unsorted CSR, list is used as linked list pointing to the respectively next entry. |
[in,out] | UR | magma_d_matrix* Same matrix as U, but column-major. The format is unsorted CSR, list is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_create_collinkedlist | ( | magma_d_matrix | A, |
magma_d_matrix * | B, | ||
magma_queue_t | queue ) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements.
[in] | A | magma_d_matrix Matrix to transpose. |
[out] | B | magma_d_matrix* Transposed matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_candidates | ( | magma_d_matrix | L0, |
magma_d_matrix | U0, | ||
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | L_new, | ||
magma_d_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_d_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_d_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_d_matrix Current lower triangular factor. |
[in] | U | magma_d_matrix Current upper triangular factor. |
[in,out] | L_new | magma_d_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_d_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_candidates_gpu | ( | magma_d_matrix | L0, |
magma_d_matrix | U0, | ||
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix * | L_new, | ||
magma_d_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors.
Nonzero ILU residuals are possible
1. where A is nonzero but L and U have no nonzero entry;
2. where the product L*U has fill-in but the location is not included in L or U.
We assume that the incomplete factors are exact for the elements included in the current pattern.
This is the GPU implementation of the candidate search.
Four GPU kernels are used: the first is a dry run assessing the memory need, the second computes the candidate locations, the third eliminates duplicate entries, and the fourth ensures that the elements in each row are sorted by increasing column index.
[in] | L0 | magma_d_matrix tril(ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_d_matrix triu(ILU(0) ) pattern of original system matrix. |
[in] | L | magma_d_matrix Current lower triangular factor. |
[in] | U | magma_d_matrix Current upper triangular factor. |
[in,out] | L_new | magma_d_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_d_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
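The two cases in which a nonzero residual can appear translate directly into a symbolic search. The sketch below illustrates this on dense 0/1 pattern masks; the helper name `find_candidates_dense` and the dense representation are assumptions made for brevity and do not reflect the GPU kernels or the CSR and COO storage used by MAGMA.

```c
/* Mark candidate locations on dense 0/1 pattern masks (hypothetical helper,
 * not part of MAGMA). All masks are n-by-n, row-major.
 *   A0    pattern of the original system matrix
 *   L, U  patterns of the current incomplete factors
 *   cand  output mask: 1 where a nonzero residual is possible but the
 *         location is missing from the current pattern
 * Returns the number of candidates found. */
static int find_candidates_dense(int n, const int *A0, const int *L,
                                 const int *U, int *cand)
{
    int count = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            /* Strictly lower locations belong to L, all others to U. */
            int in_current = (i > j) ? L[i * n + j] : U[i * n + j];
            /* Case 1: A is nonzero at (i,j). */
            int hit = A0[i * n + j];
            /* Case 2: the product L*U has (potential) fill-in at (i,j). */
            for (int k = 0; k < n && !hit; ++k)
                hit = L[i * n + k] && U[k * n + j];
            cand[i * n + j] = hit && !in_current;
            count += cand[i * n + j];
        }
    }
    return count;
}
```

For an arrow-shaped 3-by-3 matrix with a full first row and column, the only candidates are the fill-in locations (1,2) and (2,1).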
magma_int_t magma_dparict_candidates | ( | magma_d_matrix | L0, |
magma_d_matrix | L, | ||
magma_d_matrix | LT, | ||
magma_d_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_d_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_d_matrix Current lower triangular factor. |
[in] | LT | magma_d_matrix Transpose of the lower triangular factor. |
[in,out] | L_new | magma_d_matrix* List of candidates for L in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_candidates_semilinked | ( | magma_d_matrix | L0, |
magma_d_matrix | U0, | ||
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix | UT, | ||
magma_d_matrix * | L_new, | ||
magma_d_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_d_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_d_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_d_matrix Current lower triangular factor. |
[in] | U | magma_d_matrix Current upper triangular factor transposed. |
[in] | UR | magma_d_matrix Current upper triangular factor - col-pointer and col-list. |
[in,out] | L_new | magma_d_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_d_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_candidates_linkedlist | ( | magma_d_matrix | L0, |
magma_d_matrix | U0, | ||
magma_d_matrix | L, | ||
magma_d_matrix | U, | ||
magma_d_matrix | UR, | ||
magma_d_matrix * | L_new, | ||
magma_d_matrix * | U_new, | ||
magma_queue_t | queue ) |
magma_int_t magma_dparilut_rm_thrs | ( | double * | thrs, |
magma_int_t * | num_rm, | ||
magma_d_matrix * | LU, | ||
magma_d_matrix * | LU_new, | ||
magma_index_t * | rm_loc, | ||
magma_queue_t | queue ) |
This routine removes matrix entries from the structure that are smaller than the threshold.
It only counts the deleted elements and does not save their locations.
[out] | thrs | double* Threshold for removing elements. |
[out] | num_rm | magma_int_t* Number of Elements that have been removed. |
[in,out] | LU | magma_d_matrix* Current ILU approximation where the identified smallest components are deleted. |
[in,out] | LUC | magma_d_matrix* Corresponding col-list. |
[in,out] | LU_new | magma_d_matrix* List of candidates in COO format. |
[out] | rm_loc | magma_index_t* List containing the locations of the elements deleted. |
[in] | queue | magma_queue_t Queue to execute in. |
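Threshold-based removal can be pictured as an in-place compaction of a COO list. The sketch below is only an illustration: the helper name `remove_below_threshold` is hypothetical, entries are compared by absolute value (an assumption), and the linked-list bookkeeping of the actual routine is left out.

```c
/* Drop all entries whose magnitude is below thrs from an unsorted COO list,
 * compacting the arrays in place (hypothetical helper, not part of MAGMA).
 * Returns the number of removed entries; the first (nnz - removed) slots of
 * row, col and val hold the surviving entries in their original order. */
static int remove_below_threshold(int nnz, double thrs,
                                  int *row, int *col, double *val)
{
    int kept = 0;
    for (int e = 0; e < nnz; ++e) {
        double mag = (val[e] < 0.0) ? -val[e] : val[e];
        if (mag >= thrs) {
            row[kept] = row[e];
            col[kept] = col[e];
            val[kept] = val[e];
            ++kept;
        }
    }
    return nnz - kept;
}
```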
magma_int_t magma_dparilut_count | ( | magma_d_matrix | L, |
magma_int_t * | num, | ||
magma_queue_t | queue ) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format.
[in] | L | magma_d_matrix* Matrix in Magma_CSRLIST format |
[out] | num | magma_index_t* Number of elements counted. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_randlist | ( | magma_d_matrix * | LU, |
magma_queue_t | queue ) |
magma_int_t magma_dparilut_select_candidates_L | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_d_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_d_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_d_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_select_candidates_U | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_d_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_d_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_d_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dparilut_preselect | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==0 lower triangular order==1 upper triangular |
[in] | A | magma_d_matrix* Matrix where elements are removed. |
[out] | oneA | magma_d_matrix* Matrix containing only the largest element of each row. |
[in] | queue | magma_queue_t Queue to execute in. |
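The per-row selection can be sketched on a plain COO candidate list. The helper name `select_largest_per_row`, the comparison by absolute value, and the use of `-1` as the column index of the zero placeholder are assumptions made for this illustration; they are not taken from the MAGMA implementation.

```c
/* For every row, keep the candidate with the largest magnitude
 * (hypothetical helper, not part of MAGMA).
 *   row, col, val     unsorted COO list of candidates with their residuals
 *   out_col, out_val  one slot per row; rows without any candidate receive
 *                     the placeholder (col = -1, val = 0.0) */
static void select_largest_per_row(int nrows, int nnz,
                                   const int *row, const int *col,
                                   const double *val,
                                   int *out_col, double *out_val)
{
    for (int i = 0; i < nrows; ++i) {
        out_col[i] = -1;
        out_val[i] = 0.0;
    }
    for (int e = 0; e < nnz; ++e) {
        int i = row[e];
        double mag = (val[e] < 0.0) ? -val[e] : val[e];
        double best = (out_val[i] < 0.0) ? -out_val[i] : out_val[i];
        /* Take the first candidate of a row unconditionally, later ones
         * only if they are strictly larger in magnitude. */
        if (out_col[i] < 0 || mag > best) {
            out_col[i] = col[e];
            out_val[i] = val[e];
        }
    }
}
```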
magma_int_t magma_dpreselect_gpu | ( | magma_int_t | order, |
magma_d_matrix * | A, | ||
magma_d_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==0 lower triangular order==1 upper triangular |
[in] | A | magma_d_matrix* Matrix where elements are removed. |
[out] | oneA | magma_d_matrix* Matrix containing only the largest element of each row. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dsampleselect | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
double * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | double array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
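What the computed threshold means can be shown with a simple sort-based reference on the host. This is deliberately not the sample-select algorithm and does not touch the GPU or the temporary-storage arguments; the helper name `select_threshold_by_sort` and the convention of returning the smallest magnitude outside the subset are assumptions of this sketch.

```c
#include <stdlib.h>

/* Ascending comparison of two doubles for qsort. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort-based reference (hypothetical helper, not part of MAGMA): on success,
 * *thrs is the smallest magnitude that does NOT belong to the subset_size
 * smallest-magnitude elements, so for distinct magnitudes exactly
 * subset_size elements satisfy |val[i]| < *thrs.
 * Returns 0 on success and -1 on invalid input or allocation failure. */
static int select_threshold_by_sort(int total_size, int subset_size,
                                    const double *val, double *thrs)
{
    if (subset_size < 0 || subset_size >= total_size)
        return -1;
    double *mag = malloc((size_t)total_size * sizeof *mag);
    if (mag == NULL)
        return -1;
    for (int i = 0; i < total_size; ++i)
        mag[i] = (val[i] < 0.0) ? -val[i] : val[i];
    qsort(mag, (size_t)total_size, sizeof *mag, cmp_double);
    *thrs = mag[subset_size];
    free(mag);
    return 0;
}
```

Sorting costs O(n log n); a selection algorithm avoids the full sort, which is the motivation for a dedicated routine.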
magma_int_t magma_dsampleselect_approx | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
double * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | double array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dsampleselect_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
double * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP)
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | double array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dsampleselect_approx_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
double * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP)
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | double array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmprepare_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix | L, | ||
magma_d_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
double * | trisystems, | ||
double * | rhs, | ||
magma_queue_t | queue ) |
Takes a sparse matrix and generates an array containing the sizes of the different systems, an array containing the indices of the locations in the sparse matrix where the data comes from and goes back to, and an array containing all the sparse triangular systems.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_d_matrix Matrix in CSR format |
[in] | LC | magma_d_matrix same matrix, also CSR, but col-major |
[in,out] | sizes | magma_int_t* Number of Elements that are replaced. |
[in,out] | locations | magma_int_t* Array indicating the locations. |
[in,out] | trisystems | double* trisystems |
[in,out] | rhs | double* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmtrisolve_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix | L, | ||
magma_d_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
double * | trisystems, | ||
double * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_d_matrix Matrix in CSR format |
[in] | LC | magma_d_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_int_t* Number of Elements that are replaced. |
[out] | locations | magma_int_t* Array indicating the locations. |
[out] | trisystems | double* trisystems |
[out] | rhs | double* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
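As an illustration of solving many small triangular systems in one pass, the sketch below runs forward substitution over a batch of lower triangular systems. The memory layout is an assumption made for this example only: each system occupies a `stride`-by-`stride` row-major block of `trisystems` and `stride` entries of `rhs`, with only the leading `sizes[b]` rows and columns in use. The helper name `batched_lower_solve` is hypothetical, and the upper, transposed, and unit-diagonal variants are omitted.

```c
#include <stddef.h>

/* Solve T_b * x_b = rhs_b for every system b in the batch by forward
 * substitution (hypothetical helper and layout, not part of MAGMA).
 * The solution overwrites the right-hand side. Diagonal entries are
 * assumed to be nonzero. */
static void batched_lower_solve(int nbatch, int stride, const int *sizes,
                                const double *trisystems, double *rhs)
{
    for (int b = 0; b < nbatch; ++b) {
        const double *T = trisystems + (size_t)b * stride * stride;
        double *x = rhs + (size_t)b * stride;
        for (int i = 0; i < sizes[b]; ++i) {
            double s = x[i];
            for (int k = 0; k < i; ++k)
                s -= T[i * stride + k] * x[k];
            x[i] = s / T[i * stride + i];
        }
    }
}
```

The systems are independent of each other, so the outer loop over `b` is the natural place to parallelize.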
magma_int_t magma_dmbackinsert_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
double * | trisystems, | ||
double * | rhs, | ||
magma_queue_t | queue ) |
Inserts the values into the preconditioner matrix.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in,out] | M | magma_d_matrix* SPAI preconditioner CSR col-major |
[out] | sizes | magma_int_t* Number of Elements that are replaced. |
[out] | locations | magma_int_t* Array indicating the locations. |
[out] | trisystems | double* trisystems |
[out] | rhs | double* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmprepare_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix | L, | ||
magma_d_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
double * | trisystems, | ||
double * | rhs, | ||
magma_queue_t | queue ) |
magma_int_t magma_dmtrisolve_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix | L, | ||
magma_d_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
double * | trisystems, | ||
double * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_d_matrix Matrix in CSR format |
[in] | LC | magma_d_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_int_t* Number of Elements that are replaced. |
[out] | locations | magma_int_t* Array indicating the locations. |
[out] | trisystems | double* trisystems |
[out] | rhs | double* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmbackinsert_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
double * | trisystems, | ||
double * | rhs, | ||
magma_queue_t | queue ) |
magma_int_t magma_diluisaisetup_lower | ( | magma_d_matrix | L, |
magma_d_matrix | S, | ||
magma_d_matrix * | ISAIL, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the lower triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | L | magma_d_matrix lower triangular factor |
[in] | S | magma_d_matrix pattern for the ISAI preconditioner for L |
[out] | ISAIL | magma_d_matrix* ISAI preconditioner for L |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_diluisaisetup_upper | ( | magma_d_matrix | U, |
magma_d_matrix | S, | ||
magma_d_matrix * | ISAIU, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the upper triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | U | magma_d_matrix upper triangular factor |
[in] | S | magma_d_matrix pattern for the ISAI preconditioner for U |
[out] | ISAIU | magma_d_matrix* ISAI preconditioner for U |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dicisaisetup | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_disai_l | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
[in] | b | magma_d_matrix input RHS b |
[in,out] | x | magma_d_matrix solution x |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_disai_r | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
[in] | b | magma_d_matrix input RHS b |
[in,out] | x | magma_d_matrix solution x |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_disai_l_t | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_d_matrix input RHS b |
[in,out] | x | magma_d_matrix solution x |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_disai_r_t | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_d_matrix input RHS b |
[in,out] | x | magma_d_matrix solution x |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmiluisai_sizecheck | ( | magma_d_matrix | A, |
magma_index_t | batchsize, | ||
magma_index_t * | maxsize, | ||
magma_queue_t | queue ) |
magma_int_t magma_dgeisai_maxblock | ( | magma_d_matrix | L, |
magma_d_matrix * | MT, | ||
magma_queue_t | queue ) |
This routine maximizes the pattern for the ISAI preconditioner.
Precisely, it computes L, L^2, L^3, L^4, L^5 and then selects the columns of M_L such that the number of nonzeros per column is lower than the implementation-specific maximum (32).
The input is the original matrix (row-major). The output is already col-major.
[in,out] | L | magma_d_matrix Incomplete factor. |
[in,out] | MT | magma_d_matrix* SPAI preconditioner structure, CSR col-major. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_disai_generator_regs | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_d_matrix | L, | ||
magma_d_matrix * | M, | ||
magma_queue_t | queue ) |
This routine is designed to combine all kernels into one.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_d_matrix triangular factor for which the ISAI matrix is computed. Col-Major CSR storage. |
[in,out] | M | magma_d_matrix* SPAI preconditioner CSR col-major |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmsupernodal | ( | magma_int_t * | max_bs, |
magma_d_matrix | A, | ||
magma_d_matrix * | S, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with block-size bs.
[in,out] | max_bs | magma_int_t* Size of the largest diagonal block. |
[in] | A | magma_d_matrix System matrix. |
[in,out] | S | magma_d_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmvarsizeblockstruct | ( | magma_int_t | n, |
magma_int_t * | bs, | ||
magma_int_t | bsl, | ||
magma_uplo_t | uplotype, | ||
magma_d_matrix * | A, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with variable block-size.
[in] | n | magma_int_t Size of the matrix. |
[in] | bs | magma_int_t* Vector containing the size of the diagonal blocks. |
[in] | bsl | magma_int_t Size of the vector containing the block sizes. |
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in,out] | A | magma_d_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
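A block-diagonal sparsity pattern with variable block sizes can be illustrated on the host. The following is a minimal sketch, not the MAGMA implementation: the helper name `block_diag_pattern`, the plain `int` index arrays, and the choice to fill every block completely (ignoring the lower/upper option) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of MAGMA: builds the CSR pattern
   (rowptr, colidx) of an n-by-n block-diagonal matrix whose diagonal
   blocks have the sizes bs[0..bsl-1]. Every block is filled completely.
   Returns 0 on success, -1 on invalid input or failed allocation.
   The caller frees *rowptr and *colidx. */
int block_diag_pattern(int n, const int *bs, int bsl,
                       int **rowptr, int **colidx)
{
    assert(bs && rowptr && colidx);
    if (n <= 0 || bsl <= 0) return -1;

    long nnz = 0;
    int total = 0;
    for (int b = 0; b < bsl; b++) {
        if (bs[b] <= 0) return -1;
        total += bs[b];
        nnz += (long)bs[b] * bs[b];
    }
    if (total != n) return -1;   /* block sizes must cover the matrix */

    int *rp = malloc((size_t)(n + 1) * sizeof *rp);
    int *ci = malloc((size_t)nnz * sizeof *ci);
    if (!rp || !ci) { free(rp); free(ci); return -1; }

    int row = 0, pos = 0, start = 0;
    rp[0] = 0;
    for (int b = 0; b < bsl; b++) {
        for (int i = 0; i < bs[b]; i++) {
            /* each row of block b references all columns of block b */
            for (int j = 0; j < bs[b]; j++)
                ci[pos++] = start + j;
            rp[++row] = pos;
        }
        start += bs[b];
    }
    *rowptr = rp;
    *colidx = ci;
    return 0;
}
```

For block sizes {2, 1} and n = 3, this sketch yields one 2-by-2 block followed by one 1-by-1 block.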
magma_int_t magma_dtfqmr_unrolled | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a real matrix.
This is a GPU implementation of the transpose-free Quasi-Minimal Residual method (TFQMR).
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbicgstab_merge2 | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. The difference to magma_dbicgstab is that we use specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbicgstab_merge3 | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. The difference to magma_dbicgstab is that we use specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_djacobidomainoverlap | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a real symmetric N-by-N positive definite matrix.
This is a GPU implementation of the Jacobi method allowing for domain overlap.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbaiter | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_d_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | precond_par | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbaiter_overlap | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_d_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
It uses restricted additive Schwarz overlap in top-down direction.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | precond_par | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dftjacobicontractions | ( | magma_d_matrix | xkm2, |
magma_d_matrix | xkm1, | ||
magma_d_matrix | xk, | ||
magma_d_matrix * | z, | ||
magma_d_matrix * | c, | ||
magma_queue_t | queue ) |
Computes the contraction coefficients c_i:
c_i = z_i^{k-1} / z_i^{k}
= | x_i^{k-1} - x_i^{k-2} | / | x_i^{k} - x_i^{k-1} |
[in] | xkm2 | magma_d_matrix vector x^{k-2} |
[in] | xkm1 | magma_d_matrix vector x^{k-1} |
[in] | xk | magma_d_matrix vector x^{k} |
[out] | z | magma_d_matrix* ratio |
[out] | c | magma_d_matrix* contraction coefficients |
[in] | queue | magma_queue_t Queue to execute in. |
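The contraction coefficients defined above can be checked against a small host-side reference. This is a sketch under stated assumptions, not the MAGMA kernel: the function name `contraction_coefficients` is hypothetical, `z` is taken to hold |x^{k} - x^{k-1}|, and a zero denominator is mapped to a coefficient of 0, which the source does not specify.

```c
#include <assert.h>
#include <stddef.h>

static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* CPU reference sketch: for each component i,
     z[i] = |xk[i] - xkm1[i]|
     c[i] = |xkm1[i] - xkm2[i]| / z[i]
   Assumption of this sketch: c[i] = 0 when z[i] == 0. */
void contraction_coefficients(size_t n, const double *xkm2,
                              const double *xkm1, const double *xk,
                              double *z, double *c)
{
    assert(xkm2 && xkm1 && xk && z && c);
    for (size_t i = 0; i < n; i++) {
        double prev = abs_d(xkm1[i] - xkm2[i]);   /* z_i^{k-1} */
        z[i] = abs_d(xk[i] - xkm1[i]);            /* z_i^{k}   */
        c[i] = (z[i] > 0.0) ? prev / z[i] : 0.0;
    }
}
```

For a contracting iteration the differences shrink from step to step, so the coefficients computed this way are larger than 1.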
magma_int_t magma_dftjacobiupdatecheck | ( | double | delta, |
magma_d_matrix * | xold, | ||
magma_d_matrix * | xnew, | ||
magma_d_matrix * | zprev, | ||
magma_d_matrix | c, | ||
magma_int_t * | flag_t, | ||
magma_int_t * | flag_fp, | ||
magma_queue_t | queue ) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper.
[in] | delta | double threshold |
[in,out] | xold | magma_d_matrix* vector xold |
[in,out] | xnew | magma_d_matrix* vector xnew |
[in,out] | zprev | magma_d_matrix* vector z = | x_k-1 - x_k | |
[in] | c | magma_d_matrix contraction coefficients |
[in,out] | flag_t | magma_int_t* threshold condition |
[in,out] | flag_fp | magma_int_t* false positive condition |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_diterref | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_d_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a real symmetric N-by-N positive definite matrix.
This is a GPU implementation of the Iterative Refinement method. The inner solver is passed via the preconditioner argument.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix RHS b |
[in,out] | x | magma_d_matrix* solution approximation |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in,out] | precond_par | magma_d_preconditioner* inner solver |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_djacobiiter_sys | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix | d, | ||
magma_d_matrix | t, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
This routine takes the system matrix A and the RHS b as input.
[in] | A | magma_d_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_d_matrix input RHS b |
[in] | d | magma_d_matrix input matrix diagonal elements diag(A) |
[in] | t | magma_d_matrix temporary vector |
[in,out] | x | magma_d_matrix* iteration vector x |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
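The Jacobi update x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k can be written as a single sweep over a CSR matrix. The sketch below runs on the host and is only an illustration: the name `jacobi_sweep_csr` and the plain C arrays are assumptions, and the diagonal is read from the matrix itself rather than from a separate vector d.

```c
#include <assert.h>

/* One Jacobi sweep on the host for an n-by-n CSR matrix A = D + L + U:
     xnew[i] = (b[i] - sum_{j != i} A[i][j] * x[j]) / A[i][i]
   Returns 0 on success, -1 if a row has a zero or missing diagonal. */
int jacobi_sweep_csr(int n, const int *rowptr, const int *colidx,
                     const double *val, const double *b,
                     const double *x, double *xnew)
{
    assert(x != xnew);   /* the sweep reads only the old iterate */
    for (int i = 0; i < n; i++) {
        double diag = 0.0, offsum = 0.0;
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++) {
            if (colidx[p] == i)
                diag += val[p];                    /* D part     */
            else
                offsum += val[p] * x[colidx[p]];   /* (L+U) part */
        }
        if (diag == 0.0) return -1;
        xnew[i] = (b[i] - offsum) / diag;
    }
    return 0;
}
```

Repeating the sweep with `x` and `xnew` swapped gives the plain Jacobi iteration; convergence depends on the matrix (for example, strict diagonal dominance is sufficient).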
magma_int_t magma_dftjacobi | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_d_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, that is, x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
This routine takes the system matrix A and the RHS b as input. This is the fault-tolerant version of Jacobi according to ScaLA'15.
[in] | A | magma_d_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_d_matrix input RHS b |
[in,out] | x | magma_d_matrix* iteration vector x |
[in,out] | solver_par | magma_d_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dilut_saad | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_dilut_saad_apply | ( | magma_d_matrix | b, |
magma_d_matrix * | x, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_dcustomilusetup | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete LU preconditioner.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dcustomicsetup | ( | magma_d_matrix | A, |
magma_d_matrix | b, | ||
magma_d_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete Cholesky preconditioner.
[in] | A | magma_d_matrix input matrix A |
[in] | b | magma_d_matrix input RHS b |
[in,out] | precond | magma_d_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbajac_csr | ( | magma_int_t | localiters, |
magma_d_matrix | D, | ||
magma_d_matrix | R, | ||
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration performing localiters local Jacobi updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | D | magma_d_matrix input matrix with diagonal blocks |
[in] | R | magma_d_matrix input matrix with non-diagonal parts |
[in] | b | magma_d_matrix RHS |
[in,out] | x | magma_d_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbajac_csr_overlap | ( | magma_int_t | localiters, |
magma_int_t | matrices, | ||
magma_int_t | overlap, | ||
magma_d_matrix * | D, | ||
magma_d_matrix * | R, | ||
magma_d_matrix | b, | ||
magma_d_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing localiters local Jacobi updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | matrices | magma_int_t number of sub-matrices |
[in] | overlap | magma_int_t size of the overlap |
[in] | D | magma_d_matrix* set of matrices with diagonal blocks |
[in] | R | magma_d_matrix* set of matrices with non-diagonal parts |
[in] | b | magma_d_matrix RHS |
[in,out] | x | magma_d_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dmlumerge | ( | magma_d_matrix | L, |
magma_d_matrix | U, | ||
magma_d_matrix * | A, | ||
magma_queue_t | queue ) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts.
[in] | L | magma_d_matrix input strictly lower triangular matrix L |
[in] | U | magma_d_matrix input upper triangular matrix U |
[out] | A | magma_d_matrix* output matrix |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dgeaxpy | ( | double | alpha, |
magma_d_matrix | X, | ||
double | beta, | ||
magma_d_matrix * | Y, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * X + beta * Y on the GPU.
The input format is magma_d_matrix. It can handle both dense matrices (vector blocks) and CSR matrices. For the latter, it interfaces with the cuSPARSE library.
[in] | alpha | double scalar multiplier. |
[in] | X | magma_d_matrix input matrix X. |
[in] | beta | double scalar multiplier. |
[in,out] | Y | magma_d_matrix* input/output matrix Y. |
[in] | queue | magma_queue_t Queue to execute in. |
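For the dense (vector block) case, the operation Y = alpha * X + beta * Y reduces to an element-wise update. The following host-side sketch shows only that arithmetic; the name `axpby_dense` and the flat array layout are assumptions, and the CSR path through cuSPARSE is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Host reference for the dense case: y[i] = alpha * x[i] + beta * y[i]
   over n = rows * cols stored entries. y is read and then overwritten. */
void axpby_dense(size_t n, double alpha, const double *x,
                 double beta, double *y)
{
    assert(x && y);
    for (size_t i = 0; i < n; i++)
        y[i] = alpha * x[i] + beta * y[i];
}
```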
magma_int_t magma_dgecsrreimsplit | ( | magma_d_matrix | A, |
magma_d_matrix * | ReA, | ||
magma_d_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in CSR format and located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_d_matrix input matrix A. |
[out] | ReA | magma_d_matrix* output matrix containing real contributions. |
[out] | ImA | magma_d_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dgedensereimsplit | ( | magma_d_matrix | A, |
magma_d_matrix * | ReA, | ||
magma_d_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in DENSE format and located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_d_matrix input matrix A. |
[out] | ReA | magma_d_matrix* output matrix containing real contributions. |
[out] | ImA | magma_d_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dgecsr5mv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | p, | ||
double | alpha, | ||
magma_int_t | sigma, | ||
magma_int_t | bit_y_offset, | ||
magma_int_t | bit_scansum_offset, | ||
magma_int_t | num_packet, | ||
magmaUIndex_ptr | dtile_ptr, | ||
magmaUIndex_ptr | dtile_desc, | ||
magmaIndex_ptr | dtile_desc_offset_ptr, | ||
magmaIndex_ptr | dtile_desc_offset, | ||
magmaDouble_ptr | dcalibrator, | ||
magma_int_t | tail_tile_start, | ||
magmaDouble_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDouble_ptr | dx, | ||
double | beta, | ||
magmaDouble_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR5 (val (tile-wise column-major), row_pointer, col (tile-wise column-major), tile_pointer, tile_desc).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | p | magma_int_t number of tiles in A |
[in] | alpha | double scalar multiplier |
[in] | sigma | magma_int_t sigma in A in CSR5 |
[in] | bit_y_offset | magma_int_t bit_y_offset in A in CSR5 |
[in] | bit_scansum_offset | magma_int_t bit_scansum_offset in A in CSR5 |
[in] | num_packet | magma_int_t num_packet in A in CSR5 |
[in] | dtile_ptr | magmaUIndex_ptr tilepointer of A in CSR5 |
[in] | dtile_desc | magmaUIndex_ptr tiledescriptor of A in CSR5 |
[in] | dtile_desc_offset_ptr | magmaIndex_ptr tiledescriptor_offsetpointer of A in CSR5 |
[in] | dtile_desc_offset | magmaIndex_ptr tiledescriptor_offset of A in CSR5 |
[in] | dcalibrator | magmaDouble_ptr calibrator of A in CSR5 |
[in] | tail_tile_start | magma_int_t start of the last tile in A |
[in] | dval | magmaDouble_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dcopyscale | ( | magma_int_t | n, |
magma_int_t | k, | ||
magmaDouble_ptr | r, | ||
magmaDouble_ptr | v, | ||
magmaDouble_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the correction term of the pipelined GMRES according to P. Ghysels, and scales and copies the new search direction.
Returns the vector v = r / ( skp[k] - (sum_i=1^k skp[i]^2) ).
[in] | n | int length of v_i |
[in] | k | int |
[in] | r | magmaDouble_ptr vector of length n |
[in] | v | magmaDouble_ptr vector of length n |
[in] | skp | magmaDouble_ptr array of parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dnrm2scale | ( | magma_int_t | m, |
magmaDouble_ptr | dr, | ||
magma_int_t | lddr, | ||
double * | drnorm, | ||
magma_queue_t | queue ) |
magma_int_t magma_djacobispmvupdate_bw | ( | magma_int_t | maxiter, |
magma_d_matrix | A, | ||
magma_d_matrix | t, | ||
magma_d_matrix | b, | ||
magma_d_matrix | d, | ||
magma_d_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - Ax).
This kernel processes the thread blocks in reversed order.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | A | magma_d_matrix system matrix |
[in] | t | magma_d_matrix workspace |
[in] | b | magma_d_matrix RHS b |
[in] | d | magma_d_matrix vector with diagonal entries |
[out] | x | magma_d_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_djacobispmvupdateselect | ( | magma_int_t | maxiter, |
magma_int_t | num_updates, | ||
magma_index_t * | indices, | ||
magma_d_matrix | A, | ||
magma_d_matrix | t, | ||
magma_d_matrix | b, | ||
magma_d_matrix | d, | ||
magma_d_matrix | tmp, | ||
magma_d_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - Ax).
This kernel allows for overlapping domains: the indices-array contains the locations that are updated. Locations may be repeated to simulate overlapping domains.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | num_updates | magma_int_t number of updates - length of the indices array |
[in] | indices | magma_index_t* indices, which entries of x to update |
[in] | A | magma_d_matrix system matrix |
[in] | t | magma_d_matrix workspace |
[in] | b | magma_d_matrix RHS b |
[in] | d | magma_d_matrix vector with diagonal entries |
[in] | tmp | magma_d_matrix workspace |
[out] | x | magma_d_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
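The effect of an index list with repeated entries can be seen in a sequential host sketch. This is not the GPU kernel: the name `jacobi_update_select` is hypothetical, updates are applied in place and in list order, and `d[i]` is taken to be the factor that multiplies the residual exactly as the formula reads (whether the routine expects the diagonal or its inverse is not stated here).

```c
#include <assert.h>

/* Sequential host sketch of the selective update
     x[i] = x[i] + d[i] * (b[i] - (A*x)[i])
   applied once for every entry of the indices array, in order.
   A is an n-by-n CSR matrix given by rowptr, colidx, val. */
void jacobi_update_select(int num_updates, const int *indices,
                          const int *rowptr, const int *colidx,
                          const double *val, const double *b,
                          const double *d, double *x)
{
    assert(indices && rowptr && colidx && val && b && d && x);
    for (int u = 0; u < num_updates; u++) {
        int i = indices[u];
        double ax = 0.0;
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++)
            ax += val[p] * x[colidx[p]];   /* row i of A times current x */
        x[i] += d[i] * (b[i] - ax);
    }
}
```

An index that appears twice is updated twice, and the second update already sees the result of the first, which is how repeated locations mimic overlapping domains in this sketch.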
magma_int_t magma_dmergeblockkrylov | ( | magma_int_t | num_rows, |
magma_int_t | num_cols, | ||
magmaDouble_ptr | alpha, | ||
magmaDouble_ptr | p, | ||
magmaDouble_ptr | x, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
v = y / rho, y = y / rho, w = wt / psi, z = z / psi
[in] | num_rows | magma_int_t dimension m |
[in] | num_cols | magma_int_t dimension n |
[in] | alpha | magmaDouble_ptr matrix containing all SKP |
[in] | p | magmaDouble_ptr search directions |
[in,out] | x | magmaDouble_ptr approximation vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbicgmerge1 | ( | magma_int_t | n, |
magmaDouble_ptr | skp, | ||
magmaDouble_ptr | v, | ||
magmaDouble_ptr | r, | ||
magmaDouble_ptr | p, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
p = beta*p, p = p - omega*beta*v, p = p + r
-> p = r + beta * ( p - omega * v )
[in] | n | int dimension n |
[in] | skp | magmaDouble_ptr set of scalar parameters |
[in] | v | magmaDouble_ptr input vector v |
[in] | r | magmaDouble_ptr input vector r |
[in,out] | p | magmaDouble_ptr input/output vector p |
[in] | queue | magma_queue_t queue to execute in. |
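The three separate vector operations collapse into one pass over the data. A host-side sketch of the fused form follows; the name `bicg_update_p` is hypothetical, and beta and omega are passed directly as scalars because the layout of the skp parameter array is not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Fused update p = r + beta * (p - omega * v), one pass over the vectors.
   Equivalent to: p = beta*p; p = p - omega*beta*v; p = p + r. */
void bicg_update_p(size_t n, double beta, double omega,
                   const double *v, const double *r, double *p)
{
    assert(v && r && p);
    for (size_t i = 0; i < n; i++)
        p[i] = r[i] + beta * (p[i] - omega * v[i]);
}
```

Fusing the steps means p, v and r are each read once per element, instead of p being read and written three times.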
magma_int_t magma_dbicgmerge2 | ( | magma_int_t | n, |
magmaDouble_ptr | skp, | ||
magmaDouble_ptr | r, | ||
magmaDouble_ptr | v, | ||
magmaDouble_ptr | s, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
s = r, s = s - alpha*v
-> s = r - alpha * v
[in] | n | int dimension n |
[in] | skp | magmaDouble_ptr set of scalar parameters |
[in] | r | magmaDouble_ptr input vector r |
[in] | v | magmaDouble_ptr input vector v |
[out] | s | magmaDouble_ptr output vector s |
[in] | queue | magma_queue_t queue to execute in. |
magma_int_t magma_dbicgmerge3 | ( | magma_int_t | n, |
magmaDouble_ptr | skp, | ||
magmaDouble_ptr | p, | ||
magmaDouble_ptr | s, | ||
magmaDouble_ptr | t, | ||
magmaDouble_ptr | x, | ||
magmaDouble_ptr | r, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
x = x + alpha*p, x = x + omega*s, r = s, r = r - omega*t
-> x = x + alpha * p + omega * s
-> r = s - omega * t
[in] | n | int dimension n |
[in] | skp | magmaDouble_ptr set of scalar parameters |
[in] | p | magmaDouble_ptr input p |
[in] | s | magmaDouble_ptr input s |
[in] | t | magmaDouble_ptr input t |
[in,out] | x | magmaDouble_ptr input/output x |
[in,out] | r | magmaDouble_ptr input/output r |
[in] | queue | magma_queue_t Queue to execute in. |
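Both results, x and r, can be produced in the same loop. The sketch below is a host-side illustration only; the name `bicg_update_xr` is hypothetical, and alpha and omega are passed as plain scalars instead of being read from skp.

```c
#include <assert.h>
#include <stddef.h>

/* Fused BiCGSTAB tail update in one pass:
     x = x + alpha * p + omega * s
     r = s - omega * t */
void bicg_update_xr(size_t n, double alpha, double omega,
                    const double *p, const double *s, const double *t,
                    double *x, double *r)
{
    assert(p && s && t && x && r);
    for (size_t i = 0; i < n; i++) {
        x[i] = x[i] + alpha * p[i] + omega * s[i];
        r[i] = s[i] - omega * t[i];
    }
}
```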
magma_int_t magma_dbicgmerge4 | ( | magma_int_t | type, |
magmaDouble_ptr | skp, | ||
magma_queue_t | queue ) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU.
[in] | type | int kernel type |
[in,out] | skp | magmaDouble_ptr vector with parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbicgmerge_spmv1 | ( | magma_d_matrix | A, |
magmaDouble_ptr | d1, | ||
magmaDouble_ptr | d2, | ||
magmaDouble_ptr | dp, | ||
magmaDouble_ptr | dr, | ||
magmaDouble_ptr | dv, | ||
magmaDouble_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the first SpmV using CSR with the dot product and the computation of alpha.
[in] | A | magma_d_matrix system matrix |
[in] | d1 | magmaDouble_ptr temporary vector |
[in] | d2 | magmaDouble_ptr temporary vector |
[in] | dp | magmaDouble_ptr input vector p |
[in] | dr | magmaDouble_ptr input vector r |
[in] | dv | magmaDouble_ptr output vector v |
[in,out] | skp | magmaDouble_ptr array for parameters ( skp[0]=alpha ) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbicgmerge_spmv2 | ( | magma_d_matrix | A, |
magmaDouble_ptr | d1, | ||
magmaDouble_ptr | d2, | ||
magmaDouble_ptr | ds, | ||
magmaDouble_ptr | dt, | ||
magmaDouble_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | A | magma_d_matrix input matrix |
[in] | d1 | magmaDouble_ptr temporary vector |
[in] | d2 | magmaDouble_ptr temporary vector |
[in] | ds | magmaDouble_ptr input vector s |
[in] | dt | magmaDouble_ptr output vector t |
[in,out] | skp | magmaDouble_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbicgmerge_xrbeta | ( | magma_int_t | n, |
magmaDouble_ptr | d1, | ||
magmaDouble_ptr | d2, | ||
magmaDouble_ptr | rr, | ||
magmaDouble_ptr | r, | ||
magmaDouble_ptr | p, | ||
magmaDouble_ptr | s, | ||
magmaDouble_ptr | t, | ||
magmaDouble_ptr | x, | ||
magmaDouble_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | n | int dimension n |
[in] | d1 | magmaDouble_ptr temporary vector |
[in] | d2 | magmaDouble_ptr temporary vector |
[in] | rr | magmaDouble_ptr input vector rr |
[in] | r | magmaDouble_ptr input/output vector r |
[in] | p | magmaDouble_ptr input vector p |
[in] | s | magmaDouble_ptr input vector s |
[in] | t | magmaDouble_ptr input vector t |
[out] | x | magmaDouble_ptr output vector x |
[in] | skp | magmaDouble_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dbcsrswp | ( | magma_int_t | n, |
magma_int_t | size_b, | ||
magma_int_t * | ipiv, | ||
magmaDouble_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_dbcsrtrsv | ( | magma_uplo_t | uplo, |
magma_int_t | r_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magmaDouble_ptr | dA, | ||
magma_index_t * | blockinfo, | ||
magmaDouble_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_dbcsrvalcpy | ( | magma_int_t | size_b, |
magma_int_t | num_blocks, | ||
magma_int_t | num_zero_blocks, | ||
magmaDouble_ptr * | dAval, | ||
magmaDouble_ptr * | dBval, | ||
magmaDouble_ptr * | dBval2, | ||
magma_queue_t | queue ) |
magma_int_t magma_dbcsrluegemm | ( | magma_int_t | size_b, |
magma_int_t | num_block_rows, | ||
magma_int_t | kblocks, | ||
magmaDouble_ptr * | dA, | ||
magmaDouble_ptr * | dB, | ||
magmaDouble_ptr * | dC, | ||
magma_queue_t | queue ) |
magma_int_t magma_dbcsrlupivloc | ( | magma_int_t | size_b, |
magma_int_t | kblocks, | ||
magmaDouble_ptr * | dA, | ||
magma_int_t * | ipiv, | ||
magma_queue_t | queue ) |
magma_int_t magma_dbcsrblockinfo5 | ( | magma_int_t | lustep, |
magma_int_t | num_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magma_index_t * | blockinfo, | ||
magmaDouble_ptr | dval, | ||
magmaDouble_ptr * | AII, | ||
magma_queue_t | queue ) |
magma_int_t magma_dthrsholdselect | ( | magma_int_t | sampling, |
magma_int_t | total_size, | ||
magma_int_t | subset_size, | ||
double * | val, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_vector_dlag2s | ( | magma_d_matrix | x, |
magma_s_matrix * | y, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparse_matrix_dlag2s | ( | magma_d_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
magma_int_t magma_vector_slag2d | ( | magma_s_matrix | x, |
magma_d_matrix * | y, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparse_matrix_slag2d | ( | magma_s_matrix | A, |
magma_d_matrix * | B, | ||
magma_queue_t | queue ) |
void magmablas_dlag2s_sparse | ( | magma_int_t | M, |
magma_int_t | N, | ||
magmaDouble_const_ptr | dA, | ||
magma_int_t | lda, | ||
magmaFloat_ptr | dSA, | ||
magma_int_t | ldsa, | ||
magma_queue_t | queue, | ||
magma_int_t * | info ) |
void magmablas_slag2d_sparse | ( | magma_int_t | M, |
magma_int_t | N, | ||
magmaFloat_const_ptr | dSA, | ||
magma_int_t | ldsa, | ||
magmaDouble_ptr | dA, | ||
magma_int_t | lda, | ||
magma_queue_t | queue, | ||
magma_int_t * | info ) |
void magma_dlag2s_CSR_DENSE | ( | magma_d_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
void magma_dlag2s_CSR_DENSE_alloc | ( | magma_d_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
void magma_dlag2s_CSR_DENSE_convert | ( | magma_d_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
magma_int_t magma_dsgecsrmv_mixed_prec | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
double | alpha, | ||
magmaDouble_ptr | ddiagval, | ||
magmaFloat_ptr | doffdiagval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDouble_ptr | dx, | ||
double | beta, | ||
magmaDouble_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
A is a matrix in mixed precision, i.e. the diagonal values are stored in high precision, the offdiagonal values in low precision. The input format is a CSR (val, row, col) in single precision (float) storing all offdiagonal elements and an array containing the diagonal values in double precision.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | double scalar multiplier |
[in] | ddiagval | magmaDouble_ptr array containing diagonal values of A in double precision |
[in] | doffdiagval | magmaFloat_ptr array containing offdiag values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDouble_ptr input vector x |
[in] | beta | double scalar multiplier |
[in,out] | dy | magmaDouble_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
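The mixed-precision storage can be illustrated with a host-side reference of y = alpha * A * x + beta * y. This is a sketch, not the GPU routine: the name `csrmv_mixed_prec` is hypothetical, A is assumed square with exactly one diagonal value per row, and the transposition parameter is left out.

```c
#include <assert.h>

/* Host sketch of a mixed-precision SpMV: the diagonal of A is kept in
   double (diagval), all off-diagonal entries in float CSR
   (offdiagval, rowptr, colind). Accumulation is done in double. */
void csrmv_mixed_prec(int m, double alpha, const double *diagval,
                      const float *offdiagval, const int *rowptr,
                      const int *colind, const double *x,
                      double beta, double *y)
{
    assert(diagval && offdiagval && rowptr && colind && x && y);
    for (int i = 0; i < m; i++) {
        double dot = diagval[i] * x[i];                 /* high-precision part */
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++)
            dot += (double)offdiagval[p] * x[colind[p]]; /* low-precision part */
        y[i] = alpha * dot + beta * y[i];
    }
}
```

Widening each float to double before the multiplication keeps the accumulation in double, so only the stored off-diagonal values carry single-precision rounding.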
int mm_write_mtx_crd | ( | char | fname[], |
magma_index_t | M, | ||
magma_index_t | N, | ||
magma_index_t | nz, | ||
magma_index_t | I[], | ||
magma_index_t | J[], | ||
double | val[], | ||
MM_typecode | matcode ) |
int mm_read_mtx_crd_data | ( | FILE * | f, |
magma_index_t | M, | ||
magma_index_t | N, | ||
magma_index_t | nz, | ||
magma_index_t | I[], | ||
magma_index_t | J[], | ||
double | val[], | ||
MM_typecode | matcode ) |
int mm_read_mtx_crd_entry | ( | FILE * | f, |
magma_index_t * | I, | ||
magma_index_t * | J, | ||
double * | real, | ||
double * | img, | ||
MM_typecode | matcode ) |
int mm_read_unsymmetric_sparse | ( | const char * | fname, |
magma_index_t * | M_, | ||
magma_index_t * | N_, | ||
magma_index_t * | nz_, | ||
double ** | val_, | ||
magma_index_t ** | I_, | ||
magma_index_t ** | J_ ) |
magma_int_t magma_sorderstatistics | ( | float * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
float * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array.
[in,out] | val | float* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | float* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
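Selecting the k-th smallest or largest element does not require a full sort. The sketch below uses a simple quickselect with a last-element pivot, which is a swapped-in textbook technique and not necessarily what the library does; the name `kth_element`, the 1-based meaning of k, and the mapping of r (1 for largest, 0 for smallest) onto an ascending rank are assumptions.

```c
#include <assert.h>

static void swap_f(float *a, float *b) { float t = *a; *a = *b; *b = t; }

/* Returns the k-th smallest (r == 0) or k-th largest (r == 1) value of
   val[0..length-1], with k counted from 1. The array is reordered in
   place, as the target array of the routine above is. */
float kth_element(float *val, int length, int k, int r)
{
    assert(val && length > 0 && k >= 1 && k <= length);
    int target = (r == 1) ? length - k : k - 1;   /* rank in ascending order */
    int lo = 0, hi = length - 1;
    while (lo < hi) {
        float pivot = val[hi];
        int store = lo;
        /* Lomuto partition: move values smaller than the pivot to the front */
        for (int i = lo; i < hi; i++)
            if (val[i] < pivot)
                swap_f(&val[i], &val[store++]);
        swap_f(&val[store], &val[hi]);            /* pivot to its final rank */
        if (store == target) return val[store];
        if (store < target) lo = store + 1;
        else                hi = store - 1;
    }
    return val[target];
}
```

Quickselect runs in linear time on average but quadratic time in the worst case; a median-of-medians pivot is the usual way to get a worst-case linear bound.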
magma_int_t magma_sorderstatistics_inc | ( | float * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | inc, | ||
magma_int_t | r, | ||
float * | element, | ||
magma_queue_t | queue ) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc.
[in,out] | val | float* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | inc | magma_int_t Stepsize in the approximation. |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | float* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smorderstatistics | ( | float * | val, |
magma_index_t * | col, | ||
magma_index_t * | row, | ||
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
float * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front.
The related arrays col and row are also reordered.
[in,out] | val | float* Target array, will be modified during operation. |
[in,out] | col | magma_index_t* Target array, will be modified during operation. |
[in,out] | row | magma_index_t* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | float* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_spartition | ( | float * | a, |
magma_int_t | size, | ||
magma_int_t | pivot, | ||
magma_queue_t | queue ) |
magma_int_t magma_smedian5 | ( | float * | a, |
magma_queue_t | queue ) |
magma_int_t magma_sselect | ( | float * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com, January 22, 2004, http://moonflare.com/code/select/select.pdf
[in,out] | a | float* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
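The selection scheme named above (groups of five, median of medians as pivot, partition, then continue in one side) can be sketched in plain C. This is an illustrative sketch only, not MAGMA's implementation: the names `select_kth`, `partition_f` and `median_small` are hypothetical, `k` is assumed to be 0-based, and no queue is involved.

```c
#include <assert.h>
#include <stddef.h>

static void swap_f(float *x, float *y) { float t = *x; *x = *y; *y = t; }

/* Insertion-sort n <= 5 elements in place; return the index of the median. */
static size_t median_small(float *a, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        float v = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > v) { a[j] = a[j - 1]; --j; }
        a[j] = v;
    }
    return (n - 1) / 2;
}

/* Lomuto partition around a[pivot]; returns the pivot's final position. */
static size_t partition_f(float *a, size_t n, size_t pivot)
{
    size_t store = 0;
    swap_f(&a[pivot], &a[n - 1]);
    for (size_t i = 0; i + 1 < n; ++i)
        if (a[i] < a[n - 1]) swap_f(&a[i], &a[store++]);
    swap_f(&a[store], &a[n - 1]);
    return store;
}

/* Returns the k-th smallest element (0-based, an assumption of this sketch).
 * The array is reordered; on return the result sits at index k. */
float select_kth(float *a, size_t n, size_t k)
{
    assert(n > 0 && k < n);
    while (n > 5) {
        size_t groups = 0;
        /* Move the median of every group of five to the front. */
        for (size_t i = 0; i < n; i += 5) {
            size_t len = (n - i < 5) ? n - i : 5;
            size_t m = i + median_small(a + i, len);
            swap_f(&a[groups++], &a[m]);
        }
        /* Median of medians ends up at index groups / 2. */
        select_kth(a, groups, groups / 2);
        size_t p = partition_f(a, n, groups / 2);
        if (k == p) return a[p];
        if (k < p) {
            n = p;                      /* continue in the left part */
        } else {
            a += p + 1;                 /* continue in the right part */
            n -= p + 1;
            k -= p + 1;
        }
    }
    median_small(a, n);                 /* small remainder: sort directly */
    return a[k];
}
```

Calling `select_kth(a, n, 0)` returns the minimum and `select_kth(a, n, n - 1)` the maximum; the array is modified, matching the `[in,out]` tag on `a` above.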
magma_int_t magma_sselectrandom | ( | float * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com, January 22, 2004, http://moonflare.com/code/select/select.pdf
[in,out] | a | float* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sdomainoverlap | ( | magma_index_t | num_rows, |
magma_int_t * | num_indices, | ||
magma_index_t * | rowptr, | ||
magma_index_t * | colidx, | ||
magma_index_t * | x, | ||
magma_queue_t | queue ) |
Generates the update list.
[in] | x | magma_index_t* array to sort |
[in] | num_rows | magma_int_t number of rows in matrix |
[out] | num_indices | magma_int_t* number of indices in array |
[in] | rowptr | magma_index_t* rowpointer of matrix |
[in] | colidx | magma_index_t* colindices of matrix |
[in] | x | magma_index_t* array containing indices for domain overlap |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_svspread | ( | magma_s_matrix * | x, |
const char * | filename, | ||
magma_queue_t | queue ) |
Reads in a sparse vector-block stored in COO format.
[out] | x | magma_s_matrix * vector to read in |
[in] | filename | char* file where vector is stored |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sdiameter | ( | magma_s_matrix * | A, |
magma_queue_t | queue ) |
Computes the diameter of a sparse matrix and stores the value in diameter.
[in,out] | A | magma_s_matrix* sparse matrix |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilusetup | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the ILU preconditioner via the iterative ILU iteration.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilu_gpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of ParILU.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilu_cpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of ParILU.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparic_gpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of ParIC.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparic_cpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of ParIC.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparicsetup | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the IC preconditioner via the iterative IC iteration.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparicupdate | ( | magma_s_matrix | A, |
magma_s_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_s_matrix input matrix A, current target system |
[in] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sapplyiteric_l | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the left triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_s_matrix RHS |
[out] | x | magma_s_matrix* vector to precondition |
[in] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
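As a rough illustration of a triangular solve via Jacobi, the sketch below iterates x_new = D^{-1} (b - (L - D) x_old) for a lower triangular L. It assumes dense row-major storage and a caller-provided scratch vector; `jacobi_lower_solve` and its parameters are hypothetical and do not mirror the MAGMA data structures or the preconditioner object.

```c
#include <assert.h>

/* Jacobi sweeps for L x = b, L lower triangular (n-by-n, dense, row-major).
 * work is scratch of length n. Starts from x = 0. */
void jacobi_lower_solve(int n, const float *L, const float *b,
                        float *x, float *work, int sweeps)
{
    assert(n > 0 && sweeps >= 0);
    for (int i = 0; i < n; ++i) x[i] = 0.0f;
    for (int s = 0; s < sweeps; ++s) {
        /* Every update in this sweep reads only the previous iterate. */
        for (int i = 0; i < n; ++i) work[i] = x[i];
        for (int i = 0; i < n; ++i) {
            float r = b[i];
            for (int j = 0; j < i; ++j) r -= L[i * n + j] * work[j];
            x[i] = r / L[i * n + i];
        }
    }
}
```

Because the off-diagonal part of a triangular matrix is nilpotent, n sweeps reproduce the exact substitution result; fewer sweeps give an approximation whose components are all updated independently within a sweep.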
magma_int_t magma_sapplyiteric_r | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the right triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_s_matrix RHS |
[out] | x | magma_s_matrix* vector to precondition |
[in] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilu_csr | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete LU factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the system matrix is COO, the lower triangular factor L is stored in CSR, the upper triangular factor U is transposed, then also stored in CSR (equivalent to CSC format for the non-transposed U). Every component of L and U is handled by one thread.
[in] | A | magma_s_matrix input matrix A determining initial guess & processing order |
[in,out] | L | magma_s_matrix input/output matrix L containing the lower triangular factor |
[in,out] | U | magma_s_matrix input/output matrix U containing the upper triangular factor |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_spariluupdate | ( | magma_s_matrix | A, |
magma_s_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_s_matrix input matrix A, current target system |
[in] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparic_csr | ( | magma_s_matrix | A, |
magma_s_matrix | A_CSR, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete Cholesky (IC) factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.
[in] | A | magma_s_matrix input matrix A - initial guess (lower triangular) |
[in,out] | A_CSR | magma_s_matrix input/output matrix containing the IC approximation |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_snonlinres | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | LU, | ||
real_Double_t * | res, | ||
magma_queue_t | queue ) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_s_matrix input sparse matrix in CSR |
[in] | L | magma_s_matrix input sparse matrix in CSR |
[in] | U | magma_s_matrix input sparse matrix in CSR |
[out] | LU | magma_s_matrix* output sparse matrix A - LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[in] | queue | magma_queue_t Queue to execute in. |
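A minimal sketch of the quantity involved, assuming small dense row-major matrices rather than CSR: it forms R = A - L U entry by entry and accumulates the squared Frobenius norm. The function name `residual_fro2` is hypothetical, and the sketch returns the squared norm to stay free of the math library; taking the square root gives the norm itself. Any restriction of the residual to a sparsity pattern is not modelled here.

```c
#include <assert.h>

/* R = A - L*U for n-by-n dense row-major matrices.
 * Returns the squared Frobenius norm of R. */
double residual_fro2(int n, const float *A, const float *L,
                     const float *U, float *R)
{
    assert(n > 0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double s = A[i * n + j];
            for (int k = 0; k < n; ++k)
                s -= (double)L[i * n + k] * (double)U[k * n + j];
            R[i * n + j] = (float)s;
            sum += s * s;
        }
    }
    return sum;
}
```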
magma_int_t magma_silures | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_s_matrix input sparse matrix in CSR |
[in] | L | magma_s_matrix input sparse matrix in CSR |
[in] | U | magma_s_matrix input sparse matrix in CSR |
[out] | LU | magma_s_matrix* output sparse matrix A - LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[out] | nonlinres | real_Double_t* Frobenius norm of difference |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sicres | ( | magma_s_matrix | A, |
magma_s_matrix | C, | ||
magma_s_matrix | CT, | ||
magma_s_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_s_matrix input sparse matrix in CSR |
[in] | C | magma_s_matrix input sparse matrix in CSR |
[in] | CT | magma_s_matrix input sparse matrix in CSR |
[out] | LU | magma_s_matrix* output sparse matrix A - CC^T in CSR |
[out] | res | real_Double_t* IC residual |
[out] | nonlinres | real_Double_t* nonlinear residual |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sinitguess | ( | magma_s_matrix | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Computes an initial guess for the ParILU/ParIC.
[in] | A | magma_s_matrix sparse matrix in CSR |
[out] | L | magma_s_matrix* sparse matrix in CSR |
[out] | U | magma_s_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sinitrecursiveLU | ( | magma_s_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
Using the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A containing the approximate factors (L and U as well), computes a matrix with one higher level of fill-in, inserts the original approximation as the initial guess, and provides the factors L and U, also filled with the scaled initial guess.
[in] | A | magma_s_matrix* sparse matrix in CSR |
[out] | B | magma_s_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smLdiagadd | ( | magma_s_matrix * | L, |
magma_queue_t | queue ) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal.
It does this in-place.
[in,out] | L | magma_s_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smatrix_cup | ( | magma_s_matrix | A, |
magma_s_matrix | B, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
[in] | A | magma_s_matrix Input matrix 1. |
[in] | B | magma_s_matrix Input matrix 2. |
[out] | U | magma_s_matrix* Not a real matrix, but the list of all matrix entries included in either A or B. No duplicates. |
[in] | queue | magma_queue_t Queue to execute in. |
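The union rule (entries of either matrix, with the value of A winning on a shared location) can be sketched as a merge of two entry lists. The sketch assumes both lists are sorted in row-major order and free of duplicates; `coo_entry` and `coo_union` are hypothetical names for illustration, not MAGMA types or routines.

```c
#include <assert.h>

/* Hypothetical COO entry used only in this sketch. */
typedef struct { int row; int col; float val; } coo_entry;

/* Row-major ordering of positions: negative, zero or positive. */
static int coo_cmp(const coo_entry *x, const coo_entry *y)
{
    if (x->row != y->row) return (x->row < y->row) ? -1 : 1;
    if (x->col != y->col) return (x->col < y->col) ? -1 : 1;
    return 0;
}

/* Merge sorted lists a and b into out (capacity at least na + nb).
 * Returns the number of entries written. */
int coo_union(const coo_entry *a, int na,
              const coo_entry *b, int nb, coo_entry *out)
{
    assert(na >= 0 && nb >= 0);
    int i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        int c = coo_cmp(&a[i], &b[j]);
        if (c < 0) {
            out[n++] = a[i++];
        } else if (c > 0) {
            out[n++] = b[j++];
        } else {
            out[n++] = a[i++];   /* same location: keep the value of A */
            ++j;
        }
    }
    while (i < na) out[n++] = a[i++];
    while (j < nb) out[n++] = b[j++];
    return n;
}
```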
magma_int_t magma_smatrix_cup_gpu | ( | magma_s_matrix | A, |
magma_s_matrix | B, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
This is the GPU version of the operation.
[in] | A | magma_s_matrix Input matrix 1. |
[in] | B | magma_s_matrix Input matrix 2. |
[out] | U | magma_s_matrix* \(U = A \cup B\). If both matrices have a nonzero value in the same location, the value of A is used. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smatrix_cap | ( | magma_s_matrix | A, |
magma_s_matrix | B, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\).
The values in U are all ones.
[in] | A | magma_s_matrix Input matrix 1. |
[in] | B | magma_s_matrix Input matrix 2. |
[out] | U | magma_s_matrix* Not a real matrix, but the list of all matrix entries included in both A and B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smatrix_negcap | ( | magma_s_matrix | A, |
magma_s_matrix | B, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of A but not of B.
U = A \ B. The values of A are preserved.
[in] | A | magma_s_matrix Element part of this. |
[in,out] | B | magma_s_matrix Not part of this. |
[out] | U | magma_s_matrix* Not a real matrix, but the list of all matrix entries included in A not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smatrix_tril_negcap | ( | magma_s_matrix | A, |
magma_s_matrix | B, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of tril(A) but not of B.
U = tril(A) \ B. The values of A are preserved.
[in] | A | magma_s_matrix Element part of this. |
[in,out] | B | magma_s_matrix Not part of this. |
[out] | U | magma_s_matrix* Not a real matrix, but the list of all matrix entries included in A not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smatrix_triu_negcap | ( | magma_s_matrix | A, |
magma_s_matrix | B, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being part of triu(A) but not of B.
U = triu(A) \ B. The values of A are preserved.
[in] | A | magma_s_matrix Element part of this. |
[in] | B | magma_s_matrix Not part of this. |
[out] | U | magma_s_matrix* |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smatrix_abssum | ( | magma_s_matrix | A, |
float * | sum, | ||
magma_queue_t | queue ) |
Computes the sum of the absolute values in a matrix.
[in] | A | magma_s_matrix Element list/matrix. |
[out] | sum | float* Sum of the absolute values. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_thrsrm | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Removes any element whose absolute value is less than or equal to thrs (order == 1) or greater than or equal to thrs (order == 0) from the matrix and compacts the result.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded |
[in,out] | A | magma_s_matrix* Matrix where elements are removed. |
[in] | thrs | float* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
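A sketch of threshold removal with compaction, covering only the case where small entries are discarded (order == 1 above). It assumes a CSR matrix given as three plain arrays and a threshold passed by value; `csr_drop_small` is a hypothetical name and the code does not use MAGMA's matrix structure.

```c
#include <assert.h>

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Drop every entry with |val| <= thrs and compact the CSR arrays in place.
 * Returns the new number of nonzeros. */
int csr_drop_small(int num_rows, int *rowptr, int *col, float *val, float thrs)
{
    assert(num_rows >= 0);
    int out = 0;                 /* next free slot in col/val */
    int start = rowptr[0];       /* old start of the current row */
    for (int r = 0; r < num_rows; ++r) {
        int end = rowptr[r + 1]; /* read before it is overwritten */
        rowptr[r] = out;         /* new start of row r */
        for (int j = start; j < end; ++j) {
            if (abs_f(val[j]) > thrs) {
                val[out] = val[j];
                col[out] = col[j];
                ++out;
            }
        }
        start = end;
    }
    rowptr[num_rows] = out;
    return out;
}
```

The write position never overtakes the read position, so the compaction needs no extra storage.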
magma_int_t magma_sparilut_thrsrm_semilinked | ( | magma_s_matrix * | U, |
magma_s_matrix * | US, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Removes any element with absolute value smaller than thrs from the matrix.
It only uses the linked list and skips the "removed" elements.
[in,out] | A | magma_s_matrix* Matrix where elements are removed. |
[in] | thrs | float* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_rmselected | ( | magma_s_matrix | R, |
magma_s_matrix * | A, | ||
magma_queue_t | queue ) |
Removes a selected list of elements from the matrix.
[in] | R | magma_s_matrix Matrix containing elements to be removed. |
[in,out] | A | magma_s_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_selectoneperrow | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==1 -> largest, order==0 -> smallest |
[in] | A | magma_s_matrix* Matrix where elements are removed. |
[out] | oneA | magma_s_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
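The per-row selection can be sketched as follows for the "largest" case, comparing by magnitude. The CSR arrays, the function name `csr_select_largest_per_row` and the convention of reporting column -1 with value 0 for a row without candidates are assumptions of this sketch, not the MAGMA interface.

```c
#include <assert.h>

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* For every row, pick the entry of largest magnitude.
 * Rows without entries get out_col = -1 and out_val = 0. */
void csr_select_largest_per_row(int num_rows, const int *rowptr,
                                const int *col, const float *val,
                                int *out_col, float *out_val)
{
    assert(num_rows >= 0);
    for (int r = 0; r < num_rows; ++r) {
        out_col[r] = -1;
        out_val[r] = 0.0f;
        for (int j = rowptr[r]; j < rowptr[r + 1]; ++j) {
            if (out_col[r] < 0 || abs_f(val[j]) > abs_f(out_val[r])) {
                out_col[r] = col[j];
                out_val[r] = val[j];
            }
        }
    }
}
```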
magma_int_t magma_sparilut_selecttwoperrow | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==1 -> largest, order==0 -> smallest |
[in] | A | magma_s_matrix* Matrix where elements are removed. |
[out] | oneA | magma_s_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_selectoneperrowthrs_lower | ( | magma_s_matrix | L, |
magma_s_matrix | U, | ||
magma_s_matrix * | A, | ||
float | rtol, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_s_matrix Current lower triangular factor. |
[in] | U | magma_s_matrix Current upper triangular factor. |
[in] | A | magma_s_matrix* All residuals in L. |
[in] | rtol | float threshold rtol |
[out] | oneA | magma_s_matrix* at most one per row, if larger than thrs. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_selectoneperrowthrs_upper | ( | magma_s_matrix | L, |
magma_s_matrix | U, | ||
magma_s_matrix * | A, | ||
float | rtol, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_s_matrix Current lower triangular factor. |
[in] | U | magma_s_matrix Current upper triangular factor. |
[in] | A | magma_s_matrix* All residuals in L. |
[in] | rtol | float threshold rtol |
[out] | oneA | magma_s_matrix* at most one per row, if larger than thrs. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_selectonepercol | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_transpose_select_one | ( | magma_s_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
This is a special routine with very limited scope.
For a set of fill-in candidates in row-major format, it transposes a submatrix, i.e. the submatrix consisting of the largest element in every column. This function is only useful for delta<=1.
[in] | A | magma_s_matrix Matrix to transpose. |
[out] | B | magma_s_matrix* Transposed matrix containing only largest elements in each col. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_insert_LU | ( | magma_int_t | num_rm, |
magma_index_t * | rm_loc, | ||
magma_index_t * | rm_loc2, | ||
magma_s_matrix * | LU_new, | ||
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_set_thrs | ( | magma_int_t | num_rm, |
magma_s_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_set_approx_thrs | ( | magma_int_t | num_rm, |
magma_s_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_set_thrs_randomselect | ( | magma_int_t | num_rm, |
magma_s_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_set_thrs_randomselect_approx | ( | magma_int_t | num_rm, |
magma_s_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_set_thrs_randomselect_factors | ( | magma_int_t | num_rm, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_set_exact_thrs | ( | magma_int_t | num_rm, |
magma_s_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_set_approx_thrs_inc | ( | magma_int_t | num_rm, |
magma_s_matrix * | LU, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | float* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_LU_approx_thrs | ( | magma_int_t | num_rm, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_int_t | order, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_reorder | ( | magma_s_matrix * | LU, |
magma_queue_t | queue ) |
This routine reorders the matrix (in place) for easier access.
[in] | LU | magma_s_matrix* Current ILU approximation. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparict_sweep | ( | magma_s_matrix * | A, |
magma_s_matrix * | LU, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_zero | ( | magma_s_matrix * | A, |
magma_queue_t | queue ) |
magma_int_t magma_sparilu_sweep | ( | magma_s_matrix | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep.
Input and output arrays are identical.
[in] | A | magma_s_matrix System matrix in COO. |
[in] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
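One fixed-point sweep in the sense of the Chow and Patel reference cited earlier in this section updates, for every position (i, j) in the pattern, l_ij = (a_ij - sum_{k<j} l_ik u_kj) / u_jj if i > j, and u_ij = a_ij - sum_{k<i} l_ik u_kj otherwise. The sketch below applies these updates sequentially and in place on small dense row-major arrays, using the nonzeros of A as the pattern. The name `parilu_sweep_dense` is hypothetical; the sparse formats and the concurrent, one-thread-per-entry execution of the actual routine are not modelled.

```c
#include <assert.h>

/* One in-place ParILU-style sweep on n-by-n dense row-major arrays.
 * L is assumed to hold a unit diagonal, which is never modified. */
void parilu_sweep_dense(int n, const float *A, float *L, float *U)
{
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (A[i * n + j] == 0.0f) continue;   /* outside the pattern */
            int m = (i < j) ? i : j;
            float s = A[i * n + j];
            for (int k = 0; k < m; ++k)
                s -= L[i * n + k] * U[k * n + j];
            if (i > j)
                L[i * n + j] = s / U[j * n + j];  /* strictly lower part */
            else
                U[i * n + j] = s;                 /* diagonal and upper part */
        }
    }
}
```

With a full pattern and this sequential row-major order, a single sweep already reproduces the exact LU factors; when entries are updated concurrently from older values, several sweeps are generally needed.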
magma_int_t magma_sparilu_sweep_sync | ( | magma_s_matrix | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_s_matrix System matrix in COO. |
[in] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparic_sweep | ( | magma_s_matrix | A, |
magma_s_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep (symmetric case).
Input and output arrays are identical.
[in] | A | magma_s_matrix System matrix in COO. |
[in] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparic_sweep_sync | ( | magma_s_matrix | A, |
magma_s_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep (symmetric case).
Input and output are different arrays.
[in] | A | magma_s_matrix System matrix in COO. |
[in] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparict_sweep_sync | ( | magma_s_matrix * | A, |
magma_s_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_s_matrix* System matrix. |
[in] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[out] | L_new | magma_s_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR. |
[out] | U_new | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_sweep_sync | ( | magma_s_matrix * | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference from the ParILU sweep is that the nonzero patterns of A and of the incomplete factors L and U can differ. The pattern determining which elements are iterated is hence the pattern of L and U, not that of A.
This is the CPU version of the synchronous ParILUT sweep.
[in] | A | magma_s_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO. This is sorted CSR plus the row indices being stored. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO. This is sorted CSR plus the row indices being stored. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_sweep_gpu | ( | magma_s_matrix * | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference from the ParILU sweep is that the nonzero patterns of A and of the incomplete factors L and U can differ. The pattern determining which elements are iterated is hence the pattern of L and U, not that of A. L has a unit diagonal.
This is the GPU version of the asynchronous ParILUT sweep.
[in] | A | magma_s_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO. This is sorted CSR plus the row indices being stored. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO. This is sorted CSR plus the row indices being stored. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_residuals_gpu | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_s_matrix System matrix. The format is sorted CSR. |
[in] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO. This is sorted CSR plus the row indices being stored. |
[in] | U | magma_s_matrix Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO. This is sorted CSR plus the row indices being stored. |
[in,out] | R | magma_s_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
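To illustrate evaluating the residual only at prescribed locations, the sketch below computes r_ij = a_ij - sum_k l_ik u_kj for each position listed in R and stores it as that entry's value. It assumes dense row-major A, L and U for brevity; `coo_entry` and `residual_on_pattern` are hypothetical names, not MAGMA types or routines.

```c
#include <assert.h>

/* Hypothetical COO entry used only in this sketch. */
typedef struct { int row; int col; float val; } coo_entry;

/* Overwrite R[e].val with (A - L*U) at position (R[e].row, R[e].col). */
void residual_on_pattern(int n, const float *A, const float *L,
                         const float *U, coo_entry *R, int nnzR)
{
    assert(n > 0 && nnzR >= 0);
    for (int e = 0; e < nnzR; ++e) {
        int i = R[e].row;
        int j = R[e].col;
        float s = A[i * n + j];
        for (int k = 0; k < n; ++k)
            s -= L[i * n + k] * U[k * n + j];
        R[e].val = s;
    }
}
```

Each entry of R is independent of the others, which is what makes the computation suitable for one thread per entry.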
magma_int_t magma_sthrsholdrm_gpu | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Purpose
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | order | magma_int_t dummy variable for now. |
[in,out] | A | magma_s_matrix* input/output matrix where elements are removed |
[out] | thrs | float* computed threshold |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sget_row_ptr | ( | const magma_int_t | num_rows, |
magma_int_t * | nnz, | ||
const magma_index_t * | rowidx, | ||
magma_index_t * | rowptr, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_align_residuals | ( | magma_s_matrix | L, |
magma_s_matrix | U, | ||
magma_s_matrix * | Lnew, | ||
magma_s_matrix * | Unew, | ||
magma_queue_t | queue ) |
This function scales the residuals of a lower triangular factor L with the diagonal of U.
The intention is to generate a good initial guess for inserting the elements.
[in] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_s_matrix Current approximation for the upper triangular factor. The format is sorted CSC. |
[in] | hL | magma_s_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | hU | magma_s_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_preselect_scale | ( | magma_s_matrix * | L, |
magma_s_matrix * | oneL, | ||
magma_s_matrix * | U, | ||
magma_s_matrix * | oneU, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_s_matrix* Matrix where elements are removed. |
[in] | U | magma_s_matrix* Matrix where elements are removed. |
[out] | oneL | magma_s_matrix* Matrix where elements are removed. |
[out] | oneU | magma_s_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_thrsrm_U | ( | magma_int_t | order, |
magma_s_matrix | L, | ||
magma_s_matrix * | A, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
Removes from the matrix every element whose absolute value is smaller than or equal to thrs (or larger than or equal to thrs, depending on order) and compacts the remaining entries.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded. |
[in,out] | A | magma_s_matrix* Matrix where elements are removed. |
[in] | thrs | float* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
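The removal and compaction step can be illustrated on plain CSR arrays. The following is a minimal CPU sketch, not the MAGMA implementation: the function name `csr_threshold_remove` and the raw-array interface are hypothetical, and treating the boundary case (magnitude equal to thrs) as removed is an assumption taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without depending on libm. */
float abs_float(float x) { return x < 0.0f ? -x : x; }

/* Hypothetical helper (not a MAGMA routine): drops entries of a CSR matrix
 * by magnitude and compacts val/col/rowptr in place.
 *   order == 1: entries with |v| <= thrs are discarded (small ones go)
 *   order == 0: entries with |v| >= thrs are discarded (large ones go)
 * Returns the number of entries kept. */
int csr_threshold_remove(int order, int num_rows, int *rowptr,
                         int *col, float *val, float thrs)
{
    assert(num_rows >= 0 && rowptr != NULL);
    int out = 0;            /* write position in the compacted arrays */
    int start = rowptr[0];  /* read position where the current row begins */
    for (int i = 0; i < num_rows; ++i) {
        int end = rowptr[i + 1]; /* read before this slot is overwritten */
        rowptr[i] = out;
        for (int k = start; k < end; ++k) {
            float mag = abs_float(val[k]);
            int drop = (order == 1) ? (mag <= thrs) : (mag >= thrs);
            if (!drop) {
                /* out <= k always holds, so in-place copying is safe */
                val[out] = val[k];
                col[out] = col[k];
                ++out;
            }
        }
        start = end;
    }
    rowptr[num_rows] = out;
    return out;
}
```

Because the write position never overtakes the read position, no temporary storage is needed; a parallel GPU version would typically need a prefix sum over per-row counts instead.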
magma_int_t magma_sparilut_residuals | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_s_matrix System matrix A. |
[in] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_s_matrix Current approximation for the upper triangular factor. The format is sorted CSR. |
[in,out] | R | magma_s_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
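For reference, the quantity computed at each location (i, j) of R is R_ij = A_ij - sum_k L_ik * U_kj. The sketch below evaluates this on plain CSR and COO arrays; it is a sequential illustration only, and the names `csr_get` and `ilu_residual_on_pattern` as well as the raw-array interface are hypothetical, not part of MAGMA.

```c
#include <assert.h>
#include <stddef.h>

/* Returns entry (i, j) of a CSR matrix, or 0 if it is not stored. */
float csr_get(const int *rowptr, const int *col, const float *val,
              int i, int j)
{
    for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
        if (col[k] == j) return val[k];
    return 0.0f;
}

/* Hypothetical reference helper: for every COO location (i, j) in R,
 * overwrite the value with A_ij - sum_k L_ik * U_kj.
 * A, L and U are given in CSR (row pointer, column index, value). */
void ilu_residual_on_pattern(
    const int *Arow, const int *Acol, const float *Aval,
    const int *Lrow, const int *Lcol, const float *Lval,
    const int *Urow, const int *Ucol, const float *Uval,
    int Rnnz, const int *Rrowidx, const int *Rcolidx, float *Rval)
{
    assert(Rnnz >= 0);
    for (int e = 0; e < Rnnz; ++e) {
        int i = Rrowidx[e];
        int j = Rcolidx[e];
        float sum = 0.0f;
        /* walk row i of L; for each L_ik look up U_kj in row k of U */
        for (int k = Lrow[i]; k < Lrow[i + 1]; ++k)
            sum += Lval[k] * csr_get(Urow, Ucol, Uval, Lcol[k], j);
        Rval[e] = csr_get(Arow, Acol, Aval, i, j) - sum;
    }
}
```

Each location of R is independent of the others, which is what makes the computation suitable for fine-grained parallel execution.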
magma_int_t magma_sparilut_residuals_transpose | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_residuals_semilinked | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | US, | ||
magma_s_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_sweep_semilinked | ( | magma_s_matrix * | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | US, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_s_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_sweep_list | ( | magma_s_matrix * | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_s_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_residuals_list | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_sweep_linkedlist | ( | magma_s_matrix * | A, |
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_s_matrix* Current ILU approximation. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | L | magma_s_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_residuals_linkedlist | ( | magma_s_matrix | A, |
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_s_matrix Current approximation for the lower triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Current approximation for the upper triangular factor. The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_colmajor | ( | magma_s_matrix | A, |
magma_s_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_s_matrix The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | AC | magma_s_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. |
[in] | queue | magma_queue_t Queue to execute in. |
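The idea of a column pointer plus a linked list along the columns can be sketched on plain arrays as follows. This is an illustration under stated assumptions, not the MAGMA data layout: the function name `csr_build_col_list`, the use of -1 as the end-of-list marker, and the separate `colhead`/`next` arrays are all hypothetical choices.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch: for a row-major CSR matrix, build
 *   colhead[j] = index of the first stored entry in column j (or -1)
 *   next[e]    = index of the next stored entry in the same column (or -1)
 * The values and indices of the matrix itself are left untouched.
 * Returns 0 on success, -1 if the temporary array cannot be allocated. */
int csr_build_col_list(int num_rows, int num_cols, const int *rowptr,
                       const int *col, int *colhead, int *next)
{
    assert(num_rows >= 0 && num_cols >= 0);
    /* tail[j] remembers the last entry appended to column j so far */
    int *tail = malloc((size_t)num_cols * sizeof *tail);
    if (num_cols > 0 && tail == NULL) return -1;
    for (int j = 0; j < num_cols; ++j) {
        colhead[j] = -1;
        tail[j] = -1;
    }
    /* rows are visited in increasing order, so each column list ends up
     * ordered by row index */
    for (int i = 0; i < num_rows; ++i) {
        for (int e = rowptr[i]; e < rowptr[i + 1]; ++e) {
            int j = col[e];
            next[e] = -1;
            if (tail[j] < 0)
                colhead[j] = e;      /* first entry seen in column j */
            else
                next[tail[j]] = e;   /* append behind the previous entry */
            tail[j] = e;
        }
    }
    free(tail);
    return 0;
}
```

Walking column j then amounts to starting at `colhead[j]` and following `next` until the end marker is reached, without ever forming an explicit transpose.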
magma_int_t magma_sparilut_colmajorup | ( | magma_s_matrix | A, |
magma_s_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_s_matrix The format is unsorted CSR; the list array is used as a linked list pointing to the respective next entry. |
[in,out] | AC | magma_s_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. Already allocated. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparict | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete Cholesky preconditioner.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and it can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2016.
This function requires OpenMP, and is only available if OpenMP is activated.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparict_cpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern. It is the variant for SPD systems.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete LU preconditioner.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and it can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz constant)
precond.rtol : how many candidates are added to the sparsity pattern (1.0: one per row; < 1.0: a fraction of those; > 1.0: all candidates)
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_cpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_gpu | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_gpu_nodp | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
This routine is the same as magma_sparilut_gpu(), except that it uses no dynamic parallelism.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_insert | ( | magma_int_t * | num_rmL, |
magma_int_t * | num_rmU, | ||
magma_index_t * | rm_locL, | ||
magma_index_t * | rm_locU, | ||
magma_s_matrix * | L_new, | ||
magma_s_matrix * | U_new, | ||
magma_s_matrix * | L, | ||
magma_s_matrix * | U, | ||
magma_s_matrix * | UR, | ||
magma_queue_t | queue ) |
Inserts a new element into the (empty) place for the iterative dynamic ILU.
[in] | num_rmL | magma_int_t Number of Elements that are replaced in L. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced in U. |
[in] | rm_locL | magma_index_t* List containing the locations of the deleted elements. |
[in] | rm_locU | magma_index_t* List containing the locations of the deleted elements. |
[in] | L_new | magma_s_matrix Elements that will be inserted in L stored in COO format (unsorted). |
[in] | U_new | magma_s_matrix Elements that will be inserted in U stored in COO format (unsorted). |
[in,out] | L | magma_s_matrix* Matrix where new elements are inserted. The format is unsorted CSR; list is used as a linked list pointing to the respective next entry. |
[in,out] | U | magma_s_matrix* Matrix where new elements are inserted. Row-major. The format is unsorted CSR; list is used as a linked list pointing to the respective next entry. |
[in,out] | UR | magma_s_matrix* Same matrix as U, but column-major. The format is unsorted CSR; list is used as a linked list pointing to the respective next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_create_collinkedlist | ( | magma_s_matrix | A, |
magma_s_matrix * | B, | ||
magma_queue_t | queue ) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements.
[in] | A | magma_s_matrix Matrix to transpose. |
[out] | B | magma_s_matrix* Transposed matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_candidates | ( | magma_s_matrix | L0, |
magma_s_matrix | U0, | ||
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | L_new, | ||
magma_s_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_s_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_s_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_s_matrix Current lower triangular factor. |
[in] | U | magma_s_matrix Current upper triangular factor. |
[in,out] | L_new | magma_s_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_s_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_candidates_gpu | ( | magma_s_matrix | L0, |
magma_s_matrix | U0, | ||
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix * | L_new, | ||
magma_s_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors.
Nonzero ILU residuals are possible
1. where A is nonzero but L and U have no nonzero entry,
2. where the product L*U has fill-in but the location is not included in L or U.
We assume that the incomplete factors are exact for the elements included in the current pattern.
This is the GPU implementation of the candidate search.
Four GPU kernels are used: the first is a dry run assessing the memory need, the second computes the candidate locations, and the third eliminates duplicate entries. The fourth kernel ensures the elements in a row are sorted for increasing column index.
[in] | L0 | magma_s_matrix tril(ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_s_matrix triu(ILU(0) ) pattern of original system matrix. |
[in] | L | magma_s_matrix Current lower triangular factor. |
[in] | U | magma_s_matrix Current upper triangular factor. |
[in,out] | L_new | magma_s_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_s_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
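The two cases in which a nonzero residual is possible can be made concrete with a sequential sketch on sparsity patterns. This is not the GPU implementation: the function name `ilu_candidates`, the single combined output list, and the marker-array technique are illustrative assumptions. Calling it with `capacity == 0` only counts the candidates, which mirrors the idea of a dry run that assesses the memory need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch: lists locations (i, j) that are
 *   case 1: in the pattern of A but in neither L nor U, or
 *   case 2: in the pattern of the product L*U but in neither L nor U.
 * All matrices are n x n CSR patterns (row pointer and column index).
 * At most `capacity` locations are written to cand_row/cand_col; the return
 * value is the total number found, or -1 on allocation failure. */
int ilu_candidates(int n,
                   const int *Arow, const int *Acol,
                   const int *Lrow, const int *Lcol,
                   const int *Urow, const int *Ucol,
                   int capacity, int *cand_row, int *cand_col)
{
    assert(n >= 0 && capacity >= 0);
    /* mark[j] == i means location (i, j) is already known for row i,
     * so the array never has to be reset between rows */
    int *mark = malloc((size_t)n * sizeof *mark);
    if (n > 0 && mark == NULL) return -1;
    for (int j = 0; j < n; ++j) mark[j] = -1;

    int count = 0;
    for (int i = 0; i < n; ++i) {
        /* locations already present in the current pattern of row i */
        for (int k = Lrow[i]; k < Lrow[i + 1]; ++k) mark[Lcol[k]] = i;
        for (int k = Urow[i]; k < Urow[i + 1]; ++k) mark[Ucol[k]] = i;

        /* case 1: nonzeros of A missing from L and U */
        for (int k = Arow[i]; k < Arow[i + 1]; ++k) {
            int j = Acol[k];
            if (mark[j] != i) {
                mark[j] = i; /* also prevents duplicate candidates */
                if (count < capacity) { cand_row[count] = i; cand_col[count] = j; }
                ++count;
            }
        }
        /* case 2: fill-in of L*U, i.e. row i of L times the rows of U */
        for (int k = Lrow[i]; k < Lrow[i + 1]; ++k) {
            int r = Lcol[k];
            for (int m = Urow[r]; m < Urow[r + 1]; ++m) {
                int j = Ucol[m];
                if (mark[j] != i) {
                    mark[j] = i;
                    if (count < capacity) { cand_row[count] = i; cand_col[count] = j; }
                    ++count;
                }
            }
        }
    }
    free(mark);
    return count;
}
```

A caller would typically run it once with `capacity == 0`, allocate the two output arrays with the returned size, and run it again; splitting the result into candidates for L (j < i) and for U (j >= i) is a simple filter on the output.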
magma_int_t magma_sparict_candidates | ( | magma_s_matrix | L0, |
magma_s_matrix | L, | ||
magma_s_matrix | LT, | ||
magma_s_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_s_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_s_matrix Current lower triangular factor. |
[in] | LT | magma_s_matrix Transpose of the lower triangular factor. |
[in,out] | L_new | magma_s_matrix* List of candidates for L in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_candidates_semilinked | ( | magma_s_matrix | L0, |
magma_s_matrix | U0, | ||
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix | UT, | ||
magma_s_matrix * | L_new, | ||
magma_s_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates as they appear as ILU(1) fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_s_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_s_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_s_matrix Current lower triangular factor. |
[in] | U | magma_s_matrix Current upper triangular factor transposed. |
[in] | UR | magma_s_matrix Current upper triangular factor - col-pointer and col-list. |
[in,out] | L_new | magma_s_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_s_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_candidates_linkedlist | ( | magma_s_matrix | L0, |
magma_s_matrix | U0, | ||
magma_s_matrix | L, | ||
magma_s_matrix | U, | ||
magma_s_matrix | UR, | ||
magma_s_matrix * | L_new, | ||
magma_s_matrix * | U_new, | ||
magma_queue_t | queue ) |
magma_int_t magma_sparilut_rm_thrs | ( | float * | thrs, |
magma_int_t * | num_rm, | ||
magma_s_matrix * | LU, | ||
magma_s_matrix * | LU_new, | ||
magma_index_t * | rm_loc, | ||
magma_queue_t | queue ) |
This routine removes matrix entries from the structure that are smaller than the threshold.
It only counts the elements deleted, does not save the locations.
[out] | thrs | float* Threshold for removing elements. |
[out] | num_rm | magma_int_t* Number of elements that have been removed. |
[in,out] | LU | magma_s_matrix* Current ILU approximation where the identified smallest components are deleted. |
[in,out] | LUC | magma_s_matrix* Corresponding col-list. |
[in,out] | LU_new | magma_s_matrix* List of candidates in COO format. |
[out] | rm_loc | magma_index_t* List containing the locations of the elements deleted. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_count | ( | magma_s_matrix | L, |
magma_int_t * | num, | ||
magma_queue_t | queue ) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format.
[in] | L | magma_s_matrix Matrix in Magma_CSRLIST format. |
[out] | num | magma_int_t* Number of elements counted. |
[in] | queue | magma_queue_t Queue to execute in. |
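Counting in a linked-list format means following the list of every row until it ends. A minimal sketch of that traversal is given below; the function name `csrlist_count`, the `rowhead`/`next` arrays, and the use of -1 as the end-of-list marker are assumptions for illustration and may differ from the conventions MAGMA uses internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: counts the stored entries of a matrix whose rows are
 * linked lists.
 *   rowhead[i] = index of the first entry of row i, or -1 if the row is empty
 *   next[e]    = index of the entry following e in its row, or -1 at the end */
int csrlist_count(int num_rows, const int *rowhead, const int *next)
{
    assert(num_rows >= 0);
    int count = 0;
    for (int i = 0; i < num_rows; ++i)
        for (int e = rowhead[i]; e >= 0; e = next[e])
            ++count;
    return count;
}
```

The entries of one row do not need to be contiguous or ordered in memory, which is why the count cannot simply be read off a row pointer.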
magma_int_t magma_sparilut_randlist | ( | magma_s_matrix * | LU, |
magma_queue_t | queue ) |
magma_int_t magma_sparilut_select_candidates_L | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_s_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_s_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_s_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_select_candidates_U | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_s_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_s_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_s_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparilut_preselect | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==0: lower triangular; order==1: upper triangular. |
[in] | A | magma_s_matrix* Matrix where elements are removed. |
[out] | oneA | magma_s_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
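The per-row selection can be sketched on plain CSR arrays as follows. This is an illustration only: the function name `select_largest_per_row` is hypothetical, "largest" is interpreted as largest in magnitude, ties are resolved in favor of the first entry, and placing the zero element of an empty row on the diagonal is an assumption. The `order` parameter is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: for each row of a CSR candidate list, keep only the
 * entry of largest magnitude. The output has exactly one entry per row:
 *   onecol[i], oneval[i] = column and value of the selected entry, or
 *   (i, 0.0f) if row i has no candidate (assumed placement). */
void select_largest_per_row(int num_rows, const int *rowptr, const int *col,
                            const float *val, int *onecol, float *oneval)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; ++i) {
        float best = -1.0f;  /* below any magnitude, so any entry wins */
        onecol[i] = i;
        oneval[i] = 0.0f;
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k) {
            float mag = val[k] < 0.0f ? -val[k] : val[k];
            if (mag > best) { /* strict: the first of equal entries is kept */
                best = mag;
                onecol[i] = col[k];
                oneval[i] = val[k];
            }
        }
    }
}
```

Since every row produces exactly one output entry, the output row pointer is simply 0, 1, 2, ..., which keeps later insertion steps free of any size computation.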
magma_int_t magma_spreselect_gpu | ( | magma_int_t | order, |
magma_s_matrix * | A, | ||
magma_s_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==0: lower triangular; order==1: upper triangular. |
[in] | A | magma_s_matrix* Matrix where elements are removed. |
[out] | oneA | magma_s_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ssampleselect | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
float * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | float array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
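What "a threshold separating the subset_size smallest magnitude elements from the rest" means can be pinned down with a simple sort-based reference. The sketch below is not the sample-select algorithm used on the GPU; it swaps in a full sort for clarity. The function name `select_threshold_by_sort` is hypothetical, and defining the threshold as the smallest magnitude that does not belong to the subset is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Ascending comparison of two floats for qsort. */
static int cmp_float(const void *a, const void *b)
{
    float x = *(const float *)a;
    float y = *(const float *)b;
    return (x > y) - (x < y);
}

/* Hypothetical reference: computes thrs such that, for distinct magnitudes,
 * exactly subset_size entries of val satisfy |v| < thrs.
 * Returns 0 on success, -1 on allocation failure. */
int select_threshold_by_sort(int total_size, int subset_size,
                             const float *val, float *thrs)
{
    assert(total_size > 0 && subset_size >= 0 && subset_size < total_size);
    float *mag = malloc((size_t)total_size * sizeof *mag);
    if (mag == NULL) return -1;
    for (int i = 0; i < total_size; ++i)
        mag[i] = val[i] < 0.0f ? -val[i] : val[i];
    qsort(mag, (size_t)total_size, sizeof *mag, cmp_float);
    /* the subset_size smallest magnitudes occupy mag[0 .. subset_size-1] */
    *thrs = mag[subset_size];
    free(mag);
    return 0;
}
```

The sort costs O(n log n) and needs a full copy of the data; selection algorithms avoid most of that work because only one order statistic is required, not the complete ordering.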
magma_int_t magma_ssampleselect_approx | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
float * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | float array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ssampleselect_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
float * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP)
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | float array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ssampleselect_approx_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
float * | val, | ||
float * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP)
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | float array containing the values |
[out] | thrs | float* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smprepare_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix | L, | ||
magma_s_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
float * | trisystems, | ||
float * | rhs, | ||
magma_queue_t | queue ) |
Takes a sparse matrix and generates
an array containing the sizes of the different systems,
an array containing the indices with the locations in the sparse matrix where the data comes from and goes back to,
an array containing all the sparse triangular systems.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_s_matrix Matrix in CSR format |
[in] | LC | magma_s_matrix same matrix, also CSR, but col-major |
[in,out] | sizes | magma_int_t* Number of Elements that are replaced. |
[in,out] | locations | magma_int_t* Array indicating the locations. |
[in,out] | trisystems | float* trisystems |
[in,out] | rhs | float* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smtrisolve_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix | L, | ||
magma_s_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
float * | trisystems, | ||
float * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_s_matrix Matrix in CSR format |
[in] | LC | magma_s_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | float* trisystems |
[out] | rhs | float* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smbackinsert_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
float * | trisystems, | ||
float * | rhs, | ||
magma_queue_t | queue ) |
Inserts the values into the preconditioner matrix.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in,out] | M | magma_s_matrix* SPAI preconditioner CSR col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | float* trisystems |
[out] | rhs | float* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smprepare_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix | L, | ||
magma_s_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
float * | trisystems, | ||
float * | rhs, | ||
magma_queue_t | queue ) |
magma_int_t magma_smtrisolve_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix | L, | ||
magma_s_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
float * | trisystems, | ||
float * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_s_matrix Matrix in CSR format |
[in] | LC | magma_s_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | float* trisystems |
[out] | rhs | float* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smbackinsert_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
float * | trisystems, | ||
float * | rhs, | ||
magma_queue_t | queue ) |
magma_int_t magma_siluisaisetup_lower | ( | magma_s_matrix | L, |
magma_s_matrix | S, | ||
magma_s_matrix * | ISAIL, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the lower triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | L | magma_s_matrix lower triangular factor |
[in] | S | magma_s_matrix pattern for the ISAI preconditioner for L |
[out] | ISAIL | magma_s_matrix* ISAI preconditioner for L |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_siluisaisetup_upper | ( | magma_s_matrix | U, |
magma_s_matrix | S, | ||
magma_s_matrix * | ISAIU, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the upper triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | U | magma_s_matrix upper triangular factor |
[in] | S | magma_s_matrix pattern for the ISAI preconditioner for U |
[out] | ISAIU | magma_s_matrix* ISAI preconditioner for U |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sicisaisetup | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_sisai_l | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
[in] | b | magma_s_matrix input RHS b |
[in,out] | x | magma_s_matrix solution x |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sisai_r | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
[in] | b | magma_s_matrix input RHS b |
[in,out] | x | magma_s_matrix solution x |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sisai_l_t | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_s_matrix input RHS b |
[in,out] | x | magma_s_matrix solution x |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sisai_r_t | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_s_matrix input RHS b |
[in,out] | x | magma_s_matrix solution x |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smiluisai_sizecheck | ( | magma_s_matrix | A, |
magma_index_t | batchsize, | ||
magma_index_t * | maxsize, | ||
magma_queue_t | queue ) |
magma_int_t magma_sgeisai_maxblock | ( | magma_s_matrix | L, |
magma_s_matrix * | MT, | ||
magma_queue_t | queue ) |
This routine maximizes the pattern for the ISAI preconditioner.
Precisely, it computes L, L^2, L^3, L^4, L^5 and then selects the columns of M_L such that the number of nonzeros per column is as large as possible while staying within the implementation-specific limit (32).
The input is the original matrix (row-major). The output is already col-major.
[in,out] | L | magma_s_matrix Incomplete factor. |
[in,out] | MT | magma_s_matrix* SPAI preconditioner structure, CSR col-major. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sisai_generator_regs | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_s_matrix | L, | ||
magma_s_matrix * | M, | ||
magma_queue_t | queue ) |
This routine is designed to combine all kernels into one.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_s_matrix triangular factor for which the ISAI matrix is computed. Col-Major CSR storage. |
[in,out] | M | magma_s_matrix* SPAI preconditioner CSR col-major |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smsupernodal | ( | magma_int_t * | max_bs, |
magma_s_matrix | A, | ||
magma_s_matrix * | S, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with block-size bs.
[in,out] | max_bs | magma_int_t* Size of the largest diagonal block. |
[in] | A | magma_s_matrix System matrix. |
[in,out] | S | magma_s_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smvarsizeblockstruct | ( | magma_int_t | n, |
magma_int_t * | bs, | ||
magma_int_t | bsl, | ||
magma_uplo_t | uplotype, | ||
magma_s_matrix * | A, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with variable block-size.
[in] | n | magma_int_t Size of the matrix. |
[in] | bs | magma_int_t* Vector containing the size of the diagonal blocks. |
[in] | bsl | magma_int_t Size of the vector containing the block sizes. |
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in,out] | A | magma_s_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
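To make the idea of a variable-size block-diagonal pattern concrete, here is a minimal CPU sketch in plain C. It is not MAGMA's implementation: the `csr_pattern` struct and `block_diag_pattern` function are hypothetical names, and the way the `lower` flag restricts each block to one triangle is an assumption about what `uplotype` is meant to select.

```c
#include <stdlib.h>

/* Hypothetical plain-C CSR pattern; this is not MAGMA's magma_s_matrix. */
typedef struct {
    int n;        /* number of rows and columns */
    int nnz;      /* number of stored entries */
    int *rowptr;  /* n + 1 row offsets */
    int *colidx;  /* nnz column indices */
} csr_pattern;

/* Build a block-diagonal pattern from bsl block sizes that must sum to n.
 * lower != 0 keeps the lower triangle of each block (diagonal included),
 * lower == 0 keeps the upper triangle (assumed meaning of uplotype).
 * Returns 0 on success, -1 on bad input or allocation failure. */
int block_diag_pattern(int n, const int *bs, int bsl, int lower, csr_pattern *A)
{
    if (n <= 0 || bsl <= 0) return -1;

    int total = 0, nnz = 0;
    for (int b = 0; b < bsl; b++) {
        if (bs[b] <= 0) return -1;
        total += bs[b];
        nnz += bs[b] * (bs[b] + 1) / 2;   /* one triangle per block */
    }
    if (total != n) return -1;

    A->n = n;
    A->nnz = nnz;
    A->rowptr = malloc((size_t)(n + 1) * sizeof(int));
    A->colidx = malloc((size_t)nnz * sizeof(int));
    if (!A->rowptr || !A->colidx) {
        free(A->rowptr);
        free(A->colidx);
        return -1;
    }

    int pos = 0, start = 0;
    for (int b = 0; b < bsl; b++) {
        int end = start + bs[b];
        for (int i = start; i < end; i++) {
            A->rowptr[i] = pos;
            int c0 = lower ? start : i;     /* first column of this row */
            int c1 = lower ? i + 1 : end;   /* one past the last column */
            for (int j = c0; j < c1; j++) A->colidx[pos++] = j;
        }
        start = end;
    }
    A->rowptr[n] = pos;
    return 0;
}
```

For block sizes {2, 1} and the lower triangle, the sketch yields the row offsets 0, 1, 3, 4 and the column indices 0, 0, 1, 2.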
magma_int_t magma_stfqmr_unrolled | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a real matrix.
This is a GPU implementation of the transpose-free Quasi-Minimal Residual method (TFQMR).
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbicgstab_merge2 | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. The difference to magma_sbicgstab is that we use specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbicgstab_merge3 | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. The difference to magma_sbicgstab is that we use specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sjacobidomainoverlap | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a real symmetric N-by-N positive definite matrix.
This is a GPU implementation of the Jacobi method allowing for domain overlap.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbaiter | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_s_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | precond_par | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbaiter_overlap | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_s_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
It uses restricted additive Schwarz overlap in top-down direction.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | precond_par | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sftjacobicontractions | ( | magma_s_matrix | xkm2, |
magma_s_matrix | xkm1, | ||
magma_s_matrix | xk, | ||
magma_s_matrix * | z, | ||
magma_s_matrix * | c, | ||
magma_queue_t | queue ) |
Computes the contraction coefficients c_i:
c_i = z_i^{k-1} / z_i^{k}
= | x_i^{k-1} - x_i^{k-2} | / | x_i^{k} - x_i^{k-1} |
[in] | xkm2 | magma_s_matrix vector x^{k-2} |
[in] | xkm1 | magma_s_matrix vector x^{k-1} |
[in] | xk | magma_s_matrix vector x^{k} |
[out] | z | magma_s_matrix* ratio |
[out] | c | magma_s_matrix* contraction coefficients |
[in] | queue | magma_queue_t Queue to execute in. |
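The contraction formula above can be written out as a short CPU loop. This is a sketch under assumptions, not MAGMA's kernel: the function name is hypothetical, `z` is taken to hold the current difference |x_i^{k} - x_i^{k-1}|, and returning 0 when that difference vanishes is a guard chosen only for this example.

```c
/* Absolute value without pulling in the math library. */
static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* z[i] = |xk[i] - xkm1[i]|
 * c[i] = |xkm1[i] - xkm2[i]| / z[i]
 * When z[i] == 0 the ratio is undefined; this sketch stores 0 instead. */
void contraction_coefficients(int n, const float *xkm2, const float *xkm1,
                              const float *xk, float *z, float *c)
{
    for (int i = 0; i < n; i++) {
        float prev = abs_f(xkm1[i] - xkm2[i]);  /* z_i^{k-1} */
        float curr = abs_f(xk[i] - xkm1[i]);    /* z_i^{k}   */
        z[i] = curr;
        c[i] = (curr > 0.0f) ? prev / curr : 0.0f;
    }
}
```

With the definition used here, a component whose update shrinks from 1 to 0.5 between two iterations gets c_i = 2.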
magma_int_t magma_sftjacobiupdatecheck | ( | float | delta, |
magma_s_matrix * | xold, | ||
magma_s_matrix * | xnew, | ||
magma_s_matrix * | zprev, | ||
magma_s_matrix | c, | ||
magma_int_t * | flag_t, | ||
magma_int_t * | flag_fp, | ||
magma_queue_t | queue ) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper.
[in] | delta | float threshold |
[in,out] | xold | magma_s_matrix* vector xold |
[in,out] | xnew | magma_s_matrix* vector xnew |
[in,out] | zprev | magma_s_matrix* vector z = | x_k-1 - x_k | |
[in] | c | magma_s_matrix contraction coefficients |
[in,out] | flag_t | magma_int_t* threshold condition |
[in,out] | flag_fp | magma_int_t* false positive condition |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_siterref | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_s_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a real symmetric N-by-N positive definite matrix.
This is a GPU implementation of the Iterative Refinement method. The inner solver is passed via the preconditioner argument.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix RHS b |
[in,out] | x | magma_s_matrix* solution approximation |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in,out] | precond_par | magma_s_preconditioner* inner solver |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sjacobiiter_sys | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix | d, | ||
magma_s_matrix | t, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
This routine takes the system matrix A and the RHS b as input.
[in] | A | magma_s_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_s_matrix input RHS b |
[in] | d | magma_s_matrix input matrix diagonal elements diag(A) |
[in] | t | magma_s_matrix temporary vector |
[in,out] | x | magma_s_matrix* iteration vector x |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
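As an illustration of the update x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, the following plain-C sketch performs one Jacobi sweep on a CSR matrix. It is a CPU reference under assumptions, not the MAGMA GPU routine: the function name and the raw-array CSR interface are hypothetical, and the diagonal is read from the matrix rather than passed as a separate vector.

```c
/* One Jacobi sweep: xnew[i] = (b[i] - sum_{j != i} a_ij * x[j]) / a_ii.
 * The matrix is given as CSR arrays (rowptr, colidx, val).
 * Returns 0 on success, -1 if a row has a zero or missing diagonal entry. */
int jacobi_sweep(int n, const int *rowptr, const int *colidx, const float *val,
                 const float *b, const float *x, float *xnew)
{
    for (int i = 0; i < n; i++) {
        float diag = 0.0f;
        float sum = b[i];
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++) {
            if (colidx[k] == i)
                diag = val[k];                  /* D part */
            else
                sum -= val[k] * x[colidx[k]];   /* (L+U) part */
        }
        if (diag == 0.0f) return -1;
        xnew[i] = sum / diag;
    }
    return 0;
}
```

Calling the sweep repeatedly, swapping `x` and `xnew` between calls, gives the plain Jacobi iteration. The exact solution is a fixed point of the sweep.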
magma_int_t magma_sftjacobi | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_s_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
This routine takes the system matrix A and the RHS b as input. This is the fault-tolerant version of Jacobi according to ScaLA'15.
[in] | A | magma_s_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_s_matrix input RHS b |
[in,out] | x | magma_s_matrix* iteration vector x |
[in,out] | solver_par | magma_s_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_silut_saad | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_silut_saad_apply | ( | magma_s_matrix | b, |
magma_s_matrix * | x, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_scustomilusetup | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete LU preconditioner.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_scustomicsetup | ( | magma_s_matrix | A, |
magma_s_matrix | b, | ||
magma_s_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete Cholesky preconditioner.
[in] | A | magma_s_matrix input matrix A |
[in] | b | magma_s_matrix input RHS b |
[in,out] | precond | magma_s_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbajac_csr | ( | magma_int_t | localiters, |
magma_s_matrix | D, | ||
magma_s_matrix | R, | ||
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration performing localiters local Jacobi updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | D | magma_s_matrix input matrix with diagonal blocks |
[in] | R | magma_s_matrix input matrix with non-diagonal parts |
[in] | b | magma_s_matrix RHS |
[in,out] | x | magma_s_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbajac_csr_overlap | ( | magma_int_t | localiters, |
magma_int_t | matrices, | ||
magma_int_t | overlap, | ||
magma_s_matrix * | D, | ||
magma_s_matrix * | R, | ||
magma_s_matrix | b, | ||
magma_s_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing localiters local Jacobi updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | matrices | magma_int_t number of sub-matrices |
[in] | overlap | magma_int_t size of the overlap |
[in] | D | magma_s_matrix* set of matrices with diagonal blocks |
[in] | R | magma_s_matrix* set of matrices with non-diagonal parts |
[in] | b | magma_s_matrix RHS |
[in,out] | x | magma_s_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_smlumerge | ( | magma_s_matrix | L, |
magma_s_matrix | U, | ||
magma_s_matrix * | A, | ||
magma_queue_t | queue ) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts.
[in] | L | magma_s_matrix input strictly lower triangular matrix L |
[in] | U | magma_s_matrix input upper triangular matrix U |
[out] | A | magma_s_matrix* output matrix |
[in] | queue | magma_queue_t Queue to execute in. |
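The merge can be pictured row by row: each row of A consists of the entries of the corresponding row of L followed by those of U. The sketch below does this on raw CSR arrays on the CPU. The function name and interface are hypothetical, and the caller is assumed to have allocated the output arrays for nnz(L) + nnz(U) entries; how MAGMA allocates the output is not shown here.

```c
/* Merge a strictly lower triangular CSR matrix L and an upper triangular
 * CSR matrix U (both n-by-n) into A. Since every column index in row i of L
 * is smaller than i and every column index in row i of U is at least i,
 * appending U's row after L's row keeps the column indices sorted
 * (given sorted inputs).
 * Arow needs n + 1 entries; Acol and Aval need nnz(L) + nnz(U) entries. */
void csr_lu_merge(int n,
                  const int *Lrow, const int *Lcol, const float *Lval,
                  const int *Urow, const int *Ucol, const float *Uval,
                  int *Arow, int *Acol, float *Aval)
{
    int pos = 0;
    for (int i = 0; i < n; i++) {
        Arow[i] = pos;
        for (int k = Lrow[i]; k < Lrow[i + 1]; k++) {
            Acol[pos] = Lcol[k];
            Aval[pos++] = Lval[k];
        }
        for (int k = Urow[i]; k < Urow[i + 1]; k++) {
            Acol[pos] = Ucol[k];
            Aval[pos++] = Uval[k];
        }
    }
    Arow[n] = pos;
}
```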
magma_int_t magma_sgeaxpy | ( | float | alpha, |
magma_s_matrix | X, | ||
float | beta, | ||
magma_s_matrix * | Y, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * X + beta * Y on the GPU.
The input format is magma_s_matrix. It can handle both dense matrices (vector blocks) and CSR matrices. For the latter, it interfaces with the cuSPARSE library.
[in] | alpha | float scalar multiplier. |
[in] | X | magma_s_matrix input matrix X. |
[in] | beta | float scalar multiplier. |
[in,out] | Y | magma_s_matrix* input/output matrix Y. |
[in] | queue | magma_queue_t Queue to execute in. |
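For the dense case, Y = alpha * X + beta * Y is an element-wise update. A minimal CPU sketch, with a hypothetical function name and flat arrays standing in for the value array of a dense magma_s_matrix:

```c
/* y[i] = alpha * x[i] + beta * y[i] for n stored values.
 * For a dense m-by-k block, n would be m * k. */
void axpby(int n, float alpha, const float *x, float beta, float *y)
{
    for (int i = 0; i < n; i++)
        y[i] = alpha * x[i] + beta * y[i];
}
```

The CSR case is more involved because X and Y may have different sparsity patterns, which is presumably why the routine delegates it to cuSPARSE.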
magma_int_t magma_sgecsrreimsplit | ( | magma_s_matrix | A, |
magma_s_matrix * | ReA, | ||
magma_s_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in CSR format and located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_s_matrix input matrix A. |
[out] | ReA | magma_s_matrix* output matrix containing real contributions. |
[out] | ImA | magma_s_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
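The split itself only touches the value array; the sparsity pattern is shared by both outputs. The sketch below shows this on a C99 `float complex` array. It is an illustration under assumptions: the function name is hypothetical, and a complex value type is used so that the imaginary part is visible, whereas for the real single-precision type documented here the imaginary contributions are all zero.

```c
#include <complex.h>

/* Split nnz complex values into separate real and imaginary arrays.
 * Row pointers and column indices would simply be copied to both outputs. */
void split_re_im(int nnz, const float complex *val, float *re, float *im)
{
    for (int k = 0; k < nnz; k++) {
        re[k] = crealf(val[k]);
        im[k] = cimagf(val[k]);
    }
}
```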
magma_int_t magma_sgedensereimsplit | ( | magma_s_matrix | A, |
magma_s_matrix * | ReA, | ||
magma_s_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in DENSE format and located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_s_matrix input matrix A. |
[out] | ReA | magma_s_matrix* output matrix containing real contributions. |
[out] | ImA | magma_s_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sgecsr5mv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | p, | ||
float | alpha, | ||
magma_int_t | sigma, | ||
magma_int_t | bit_y_offset, | ||
magma_int_t | bit_scansum_offset, | ||
magma_int_t | num_packet, | ||
magmaUIndex_ptr | dtile_ptr, | ||
magmaUIndex_ptr | dtile_desc, | ||
magmaIndex_ptr | dtile_desc_offset_ptr, | ||
magmaIndex_ptr | dtile_desc_offset, | ||
magmaFloat_ptr | dcalibrator, | ||
magma_int_t | tail_tile_start, | ||
magmaFloat_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaFloat_ptr | dx, | ||
float | beta, | ||
magmaFloat_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR5 (val (tile-wise column-major), row_pointer, col (tile-wise column-major), tile_pointer, tile_desc).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | p | magma_int_t number of tiles in A |
[in] | alpha | float scalar multiplier |
[in] | sigma | magma_int_t sigma in A in CSR5 |
[in] | bit_y_offset | magma_int_t bit_y_offset in A in CSR5 |
[in] | bit_scansum_offset | magma_int_t bit_scansum_offset in A in CSR5 |
[in] | num_packet | magma_int_t num_packet in A in CSR5 |
[in] | dtile_ptr | magmaUIndex_ptr tilepointer of A in CSR5 |
[in] | dtile_desc | magmaUIndex_ptr tiledescriptor of A in CSR5 |
[in] | dtile_desc_offset_ptr | magmaIndex_ptr tiledescriptor_offsetpointer of A in CSR5 |
[in] | dtile_desc_offset | magmaIndex_ptr tiledescriptor_offset of A in CSR5 |
[in] | dcalibrator | magmaFloat_ptr calibrator of A in CSR5 |
[in] | tail_tile_start | magma_int_t start of the last tile in A |
[in] | dval | magmaFloat_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaFloat_ptr input vector x |
[in] | beta | float scalar multiplier |
[in,out] | dy | magmaFloat_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
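The operation y = alpha * A * x + beta * y is the same regardless of the storage format. As a reference for checking results, here is a CPU sketch on plain CSR arrays. It deliberately does not reproduce the CSR5 tile structure (tile pointers, descriptors, calibrator), and the function name is hypothetical.

```c
/* Reference SpMV on plain CSR: y = alpha * A * x + beta * y.
 * m is the number of rows; rowptr has m + 1 entries. */
void csr_spmv(int m, float alpha,
              const int *rowptr, const int *colind, const float *val,
              const float *x, float beta, float *y)
{
    for (int i = 0; i < m; i++) {
        float dot = 0.0f;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            dot += val[k] * x[colind[k]];
        y[i] = alpha * dot + beta * y[i];
    }
}
```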
magma_int_t magma_scopyscale | ( | magma_int_t | n, |
magma_int_t | k, | ||
magmaFloat_ptr | r, | ||
magmaFloat_ptr | v, | ||
magmaFloat_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the correction term of the pipelined GMRES according to P. Ghysels, then scales and copies the new search direction.
Returns the vector v = r / ( skp[k] - ( sum_{i=1}^{k} skp[i]^2 ) ).
[in] | n | int length of v_i |
[in] | k | int |
[in] | r | magmaFloat_ptr vector of length n |
[in,out] | v | magmaFloat_ptr vector of length n |
[in] | skp | magmaFloat_ptr array of parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_snrm2scale | ( | magma_int_t | m, |
magmaFloat_ptr | dr, | ||
magma_int_t | lddr, | ||
float * | drnorm, | ||
magma_queue_t | queue ) |
magma_int_t magma_sjacobispmvupdate_bw | ( | magma_int_t | maxiter, |
magma_s_matrix | A, | ||
magma_s_matrix | t, | ||
magma_s_matrix | b, | ||
magma_s_matrix | d, | ||
magma_s_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - A x).
This kernel processes the thread blocks in reversed order.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | A | magma_s_matrix system matrix |
[in] | t | magma_s_matrix workspace |
[in] | b | magma_s_matrix RHS b |
[in] | d | magma_s_matrix vector with diagonal entries |
[out] | x | magma_s_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sjacobispmvupdateselect | ( | magma_int_t | maxiter, |
magma_int_t | num_updates, | ||
magma_index_t * | indices, | ||
magma_s_matrix | A, | ||
magma_s_matrix | t, | ||
magma_s_matrix | b, | ||
magma_s_matrix | d, | ||
magma_s_matrix | tmp, | ||
magma_s_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - A x).
This kernel allows for overlapping domains: the indices-array contains the locations that are updated. Locations may be repeated to simulate overlapping domains.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | num_updates | magma_int_t number of updates - length of the indices array |
[in] | indices | magma_index_t* indices, which entries of x to update |
[in] | A | magma_s_matrix system matrix |
[in] | t | magma_s_matrix workspace |
[in] | b | magma_s_matrix RHS b |
[in] | d | magma_s_matrix vector with diagonal entries |
[in] | tmp | magma_s_matrix workspace |
[out] | x | magma_s_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
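A CPU sketch of the selective update may help to see what the indices array does. Several points are assumptions made for this example rather than statements about the MAGMA kernel: the function name and raw CSR interface are hypothetical, `d` is taken to hold the per-row scaling factors exactly as they appear in the formula, all new values are computed from the old x before any is written, and new values are assigned rather than accumulated so that a repeated index produces the same result as a single one.

```c
/* For every u in [0, num_updates):
 *   i = indices[u]
 *   x[i] <- x[i] + d[i] * (b[i] - (A x)[i])
 * tmp is workspace of length num_updates. */
void jacobi_update_select(int num_updates, const int *indices,
                          const int *rowptr, const int *colidx,
                          const float *val, const float *b, const float *d,
                          float *tmp, float *x)
{
    /* Pass 1: compute all updated values from the current x. */
    for (int u = 0; u < num_updates; u++) {
        int i = indices[u];
        float ax = 0.0f;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            ax += val[k] * x[colidx[k]];
        tmp[u] = x[i] + d[i] * (b[i] - ax);
    }
    /* Pass 2: write them back; duplicates write identical values. */
    for (int u = 0; u < num_updates; u++)
        x[indices[u]] = tmp[u];
}
```

Entries of x whose index does not appear in `indices` are left untouched.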
magma_int_t magma_smergeblockkrylov | ( | magma_int_t | num_rows, |
magma_int_t | num_cols, | ||
magmaFloat_ptr | alpha, | ||
magmaFloat_ptr | p, | ||
magmaFloat_ptr | x, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
v = y / rho, y = y / rho, w = wt / psi, z = z / psi
[in] | num_rows | magma_int_t dimension m |
[in] | num_cols | magma_int_t dimension n |
[in] | alpha | magmaFloat_ptr matrix containing all SKP |
[in] | p | magmaFloat_ptr search directions |
[in,out] | x | magmaFloat_ptr approximation vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbicgmerge1 | ( | magma_int_t | n, |
magmaFloat_ptr | skp, | ||
magmaFloat_ptr | v, | ||
magmaFloat_ptr | r, | ||
magmaFloat_ptr | p, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
p = beta*p, p = p - omega*beta*v, p = p + r
-> p = r + beta * ( p - omega * v )
[in] | n | int dimension n |
[in] | skp | magmaFloat_ptr set of scalar parameters |
[in] | v | magmaFloat_ptr input vector v |
[in] | r | magmaFloat_ptr input vector r |
[in,out] | p | magmaFloat_ptr input/output vector p |
[in] | queue | magma_queue_t queue to execute in. |
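The point of the merge is that three passes over the vectors collapse into one. The CPU sketch below shows both forms side by side so the equivalence can be checked. The function names are hypothetical, and `beta` and `omega` are passed as plain scalars here; where they live inside the `skp` array is not specified above, so that mapping is left out.

```c
/* Unmerged form: three separate sweeps over the vectors. */
void bicg_p_update_steps(int n, float beta, float omega,
                         const float *v, const float *r, float *p)
{
    for (int i = 0; i < n; i++) p[i] = beta * p[i];
    for (int i = 0; i < n; i++) p[i] = p[i] - omega * beta * v[i];
    for (int i = 0; i < n; i++) p[i] = p[i] + r[i];
}

/* Merged form: one sweep, p = r + beta * (p - omega * v). */
void bicg_p_update_merged(int n, float beta, float omega,
                          const float *v, const float *r, float *p)
{
    for (int i = 0; i < n; i++)
        p[i] = r[i] + beta * (p[i] - omega * v[i]);
}
```

On a GPU the merged form reads and writes each vector once instead of up to three times, which matters because these updates are memory-bound.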
magma_int_t magma_sbicgmerge2 | ( | magma_int_t | n, |
magmaFloat_ptr | skp, | ||
magmaFloat_ptr | r, | ||
magmaFloat_ptr | v, | ||
magmaFloat_ptr | s, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
s = r, s = s - alpha*v
-> s = r - alpha * v
[in] | n | int dimension n |
[in] | skp | magmaFloat_ptr set of scalar parameters |
[in] | r | magmaFloat_ptr input vector r |
[in] | v | magmaFloat_ptr input vector v |
[out] | s | magmaFloat_ptr output vector s |
[in] | queue | magma_queue_t queue to execute in. |
magma_int_t magma_sbicgmerge3 | ( | magma_int_t | n, |
magmaFloat_ptr | skp, | ||
magmaFloat_ptr | p, | ||
magmaFloat_ptr | s, | ||
magmaFloat_ptr | t, | ||
magmaFloat_ptr | x, | ||
magmaFloat_ptr | r, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
x = x + alpha*p, x = x + omega*s, r = s, r = r - omega*t
-> x = x + alpha * p + omega * s
-> r = s - omega * t
[in] | n | int dimension n |
[in] | skp | magmaFloat_ptr set of scalar parameters |
[in] | p | magmaFloat_ptr input p |
[in] | s | magmaFloat_ptr input s |
[in] | t | magmaFloat_ptr input t |
[in,out] | x | magmaFloat_ptr input/output x |
[in,out] | r | magmaFloat_ptr input/output r |
[in] | queue | magma_queue_t Queue to execute in. |
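Written as a single loop, the two merged updates look as follows. This is a CPU sketch with a hypothetical function name; `alpha` and `omega` are passed directly as scalars because their positions inside `skp` are not given above.

```c
/* One sweep performing both merged BiCGSTAB updates:
 *   x = x + alpha * p + omega * s
 *   r = s - omega * t */
void bicg_xr_update(int n, float alpha, float omega,
                    const float *p, const float *s, const float *t,
                    float *x, float *r)
{
    for (int i = 0; i < n; i++) {
        x[i] = x[i] + alpha * p[i] + omega * s[i];
        r[i] = s[i] - omega * t[i];
    }
}
```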
magma_int_t magma_sbicgmerge4 | ( | magma_int_t | type, |
magmaFloat_ptr | skp, | ||
magma_queue_t | queue ) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU.
[in] | type | int kernel type |
[in,out] | skp | magmaFloat_ptr vector with parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbicgmerge_spmv1 | ( | magma_s_matrix | A, |
magmaFloat_ptr | d1, | ||
magmaFloat_ptr | d2, | ||
magmaFloat_ptr | dp, | ||
magmaFloat_ptr | dr, | ||
magmaFloat_ptr | dv, | ||
magmaFloat_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the first SpmV using CSR with the dot product and the computation of alpha.
[in] | A | magma_s_matrix system matrix |
[in] | d1 | magmaFloat_ptr temporary vector |
[in] | d2 | magmaFloat_ptr temporary vector |
[in] | dp | magmaFloat_ptr input vector p |
[in] | dr | magmaFloat_ptr input vector r |
[out] | dv | magmaFloat_ptr output vector v |
[in,out] | skp | magmaFloat_ptr array for parameters ( skp[0]=alpha ) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbicgmerge_spmv2 | ( | magma_s_matrix | A, |
magmaFloat_ptr | d1, | ||
magmaFloat_ptr | d2, | ||
magmaFloat_ptr | ds, | ||
magmaFloat_ptr | dt, | ||
magmaFloat_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | A | magma_s_matrix input matrix |
[in] | d1 | magmaFloat_ptr temporary vector |
[in] | d2 | magmaFloat_ptr temporary vector |
[in] | ds | magmaFloat_ptr input vector s |
[out] | dt | magmaFloat_ptr output vector t |
[in,out] | skp | magmaFloat_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbicgmerge_xrbeta | ( | magma_int_t | n, |
magmaFloat_ptr | d1, | ||
magmaFloat_ptr | d2, | ||
magmaFloat_ptr | rr, | ||
magmaFloat_ptr | r, | ||
magmaFloat_ptr | p, | ||
magmaFloat_ptr | s, | ||
magmaFloat_ptr | t, | ||
magmaFloat_ptr | x, | ||
magmaFloat_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | n | int dimension n |
[in] | d1 | magmaFloat_ptr temporary vector |
[in] | d2 | magmaFloat_ptr temporary vector |
[in] | rr | magmaFloat_ptr input vector rr |
[in,out] | r | magmaFloat_ptr input/output vector r |
[in] | p | magmaFloat_ptr input vector p |
[in] | s | magmaFloat_ptr input vector s |
[in] | t | magmaFloat_ptr input vector t |
[out] | x | magmaFloat_ptr output vector x |
[in] | skp | magmaFloat_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sbcsrswp | ( | magma_int_t | n, |
magma_int_t | size_b, | ||
magma_int_t * | ipiv, | ||
magmaFloat_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_sbcsrtrsv | ( | magma_uplo_t | uplo, |
magma_int_t | r_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magmaFloat_ptr | dA, | ||
magma_index_t * | blockinfo, | ||
magmaFloat_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_sbcsrvalcpy | ( | magma_int_t | size_b, |
magma_int_t | num_blocks, | ||
magma_int_t | num_zero_blocks, | ||
magmaFloat_ptr * | dAval, | ||
magmaFloat_ptr * | dBval, | ||
magmaFloat_ptr * | dBval2, | ||
magma_queue_t | queue ) |
magma_int_t magma_sbcsrluegemm | ( | magma_int_t | size_b, |
magma_int_t | num_block_rows, | ||
magma_int_t | kblocks, | ||
magmaFloat_ptr * | dA, | ||
magmaFloat_ptr * | dB, | ||
magmaFloat_ptr * | dC, | ||
magma_queue_t | queue ) |
magma_int_t magma_sbcsrlupivloc | ( | magma_int_t | size_b, |
magma_int_t | kblocks, | ||
magmaFloat_ptr * | dA, | ||
magma_int_t * | ipiv, | ||
magma_queue_t | queue ) |
magma_int_t magma_sbcsrblockinfo5 | ( | magma_int_t | lustep, |
magma_int_t | num_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magma_index_t * | blockinfo, | ||
magmaFloat_ptr | dval, | ||
magmaFloat_ptr * | AII, | ||
magma_queue_t | queue ) |
magma_int_t magma_sthrsholdselect | ( | magma_int_t | sampling, |
magma_int_t | total_size, | ||
magma_int_t | subset_size, | ||
float * | val, | ||
float * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_zorderstatistics | ( | magmaDoubleComplex * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
magmaDoubleComplex * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array.
[in,out] | val | magmaDoubleComplex* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | magmaDoubleComplex* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
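To make the idea of an order statistic concrete, the following sketch picks the k-th smallest or largest value from an array with a plain quickselect. It is not MAGMA's implementation: the name `kth_element`, the zero-based `k`, and the use of real `double` values instead of `magmaDoubleComplex` are assumptions made only for this example.

```c
#include <assert.h>

/* Hypothetical helper, not part of MAGMA: returns the k-th (0-based)
 * smallest (r == 0) or largest (r == 1) entry of val[0..length-1].
 * Like the routine documented above, it reorders val while it works. */
double kth_element(double *val, int length, int k, int r)
{
    assert(length > 0 && k >= 0 && k < length);
    /* The k-th largest is the (length-1-k)-th smallest. */
    int target = r ? length - 1 - k : k;
    int lo = 0, hi = length - 1;
    while (lo < hi) {
        /* Lomuto partition with the last entry as pivot. */
        double pivot = val[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (val[i] < pivot) {
                double t = val[i]; val[i] = val[store]; val[store] = t;
                store++;
            }
        }
        double t = val[store]; val[store] = val[hi]; val[hi] = t;
        /* The pivot now sits at its final sorted position. */
        if (store == target) break;
        if (store < target) lo = store + 1;
        else hi = store - 1;
    }
    return val[target];
}
```

Only the side of the partition that contains the target index is processed further, which is what keeps selection cheaper than a full sort on average.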
magma_int_t magma_zorderstatistics_inc | ( | magmaDoubleComplex * | val, |
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | inc, | ||
magma_int_t | r, | ||
magmaDoubleComplex * | element, | ||
magma_queue_t | queue ) |
Approximates the k-th smallest element in an array by using order-statistics with step-size inc.
[in,out] | val | magmaDoubleComplex* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | inc | magma_int_t Stepsize in the approximation. |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | magmaDoubleComplex* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmorderstatistics | ( | magmaDoubleComplex * | val, |
magma_index_t * | col, | ||
magma_index_t * | row, | ||
magma_int_t | length, | ||
magma_int_t | k, | ||
magma_int_t | r, | ||
magmaDoubleComplex * | element, | ||
magma_queue_t | queue ) |
Identifies the kth smallest/largest element in an array and reorders such that these elements come to the front.
The related arrays col and row are also reordered.
[in,out] | val | magmaDoubleComplex* Target array, will be modified during operation. |
[in,out] | col | magma_index_t* Target array, will be modified during operation. |
[in,out] | row | magma_index_t* Target array, will be modified during operation. |
[in] | length | magma_int_t Length of the target array. |
[in] | k | magma_int_t Element to be identified (largest/smallest). |
[in] | r | magma_int_t rule how to sort: '1' -> largest, '0' -> smallest |
[out] | element | magmaDoubleComplex* location of the respective element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zpartition | ( | magmaDoubleComplex * | a, |
magma_int_t | size, | ||
magma_int_t | pivot, | ||
magma_queue_t | queue ) |
magma_int_t magma_zmedian5 | ( | magmaDoubleComplex * | a, |
magma_queue_t | queue ) |
magma_int_t magma_zselect | ( | magmaDoubleComplex * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com, January 22, 2004, http://moonflare.com/code/select/select.pdf
[in,out] | a | magmaDoubleComplex* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zselectrandom | ( | magmaDoubleComplex * | a, |
magma_int_t | size, | ||
magma_int_t | k, | ||
magma_queue_t | queue ) |
An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan's worst-case linear selection algorithm.
Derrick Coetzee, webmaster@moonflare.com, January 22, 2004, http://moonflare.com/code/select/select.pdf
[in,out] | a | magmaDoubleComplex* array to select from |
[in] | size | magma_int_t size of array |
[in] | k | magma_int_t k-th smallest element |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zdomainoverlap | ( | magma_index_t | num_rows, |
magma_int_t * | num_indices, | ||
magma_index_t * | rowptr, | ||
magma_index_t * | colidx, | ||
magma_index_t * | x, | ||
magma_queue_t | queue ) |
Generates the update list.
[in] | num_rows | magma_index_t number of rows in matrix |
[out] | num_indices | magma_int_t* number of indices in array |
[in] | rowptr | magma_index_t* rowpointer of matrix |
[in] | colidx | magma_index_t* colindices of matrix |
[in] | x | magma_index_t* array containing indices for domain overlap |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zvspread | ( | magma_z_matrix * | x, |
const char * | filename, | ||
magma_queue_t | queue ) |
Reads in a sparse vector-block stored in COO format.
[out] | x | magma_z_matrix * vector to read in |
[in] | filename | char* file where vector is stored |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zdiameter | ( | magma_z_matrix * | A, |
magma_queue_t | queue ) |
Computes the diameter of a sparse matrix and stores the value in diameter.
[in,out] | A | magma_z_matrix* sparse matrix |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilusetup | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the ILU preconditioner via the iterative ILU iteration.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilu_gpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of ParILU.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilu_cpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an ILU(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of ParILU.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparic_gpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the GPU implementation of ParIC.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparic_cpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an IC(0) preconditioner via fixed-point iterations.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015).
This is the CPU implementation of ParIC.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparicsetup | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the IC preconditioner via the iterative IC iteration.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparicupdate | ( | magma_z_matrix | A, |
magma_z_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative IC sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_z_matrix input matrix A, current target system |
[in] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zapplyiteric_l | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the left triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_z_matrix RHS |
[out] | x | magma_z_matrix* vector to precondition |
[in] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zapplyiteric_r | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Performs the right triangular solves using the IC preconditioner via Jacobi.
[in] | b | magma_z_matrix RHS |
[out] | x | magma_z_matrix* vector to precondition |
[in] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
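A triangular solve "via Jacobi" replaces the sequential forward substitution with a fixed number of fully parallel sweeps. The sketch below shows the idea on a small dense lower triangular system. It is only an illustration under stated assumptions: the name `jacobi_lower_solve`, the dense row-major storage, and real `double` arithmetic are not taken from MAGMA.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration, not MAGMA code: approximates the solution of
 * L x = b for a dense, row-major, lower triangular n x n matrix L by
 * running `sweeps` Jacobi iterations
 *     x_new[i] = (b[i] - sum_{j<i} L[i][j] * x_old[j]) / L[i][i].
 * x holds the initial guess on entry and the iterate on exit.
 * Returns 0 on success, -1 if the work buffer cannot be allocated. */
int jacobi_lower_solve(int n, const double *L, const double *b,
                       double *x, int sweeps)
{
    assert(n > 0 && sweeps >= 0);
    double *x_old = malloc((size_t)n * sizeof *x_old);
    if (x_old == NULL) return -1;
    for (int s = 0; s < sweeps; s++) {
        /* Every row reads only the previous iterate, so rows are independent. */
        memcpy(x_old, x, (size_t)n * sizeof *x_old);
        for (int i = 0; i < n; i++) {
            double acc = b[i];
            for (int j = 0; j < i; j++)
                acc -= L[i * n + j] * x_old[j];
            x[i] = acc / L[i * n + i];
        }
    }
    free(x_old);
    return 0;
}
```

For a triangular matrix the iteration reaches the exact solution after at most n sweeps; fewer sweeps give an approximation, which is often acceptable inside a preconditioner.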
magma_int_t magma_zparilu_csr | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete LU factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the system matrix is COO, the lower triangular factor L is stored in CSR, the upper triangular factor U is transposed, then also stored in CSR (equivalent to CSC format for the non-transposed U). Every component of L and U is handled by one thread.
[in] | A | magma_z_matrix input matrix A determining initial guess & processing order |
[in,out] | L | magma_z_matrix input/output matrix L containing the lower triangular factor |
[in,out] | U | magma_z_matrix input/output matrix U containing the upper triangular factor |
[in] | queue | magma_queue_t Queue to execute in. |
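The remark that a transposed U stored in CSR is equivalent to the CSC format of U can be checked with a small transpose routine. The following is a sketch with hypothetical names (`csr_transpose` and its arguments are not MAGMA symbols); it assumes real values, zero-based indices, and `rowptr[0] == 0`.

```c
#include <assert.h>

/* Hypothetical helper, not MAGMA code: writes the CSR form of the
 * transpose of an n_rows x n_cols CSR matrix into
 * (t_rowptr, t_colidx, t_val). t_rowptr needs n_cols + 1 entries.
 * Read column-wise, the output arrays are the CSC form of the input. */
void csr_transpose(int n_rows, int n_cols, const int *rowptr,
                   const int *colidx, const double *val,
                   int *t_rowptr, int *t_colidx, double *t_val)
{
    assert(n_rows >= 0 && n_cols >= 0);
    int nnz = rowptr[n_rows];
    for (int c = 0; c <= n_cols; c++) t_rowptr[c] = 0;
    /* Count entries per column, shifted by one slot. */
    for (int p = 0; p < nnz; p++) t_rowptr[colidx[p] + 1]++;
    /* Prefix sum turns counts into start offsets. */
    for (int c = 0; c < n_cols; c++) t_rowptr[c + 1] += t_rowptr[c];
    /* Scatter; t_rowptr[c] temporarily tracks the next free slot. */
    for (int i = 0; i < n_rows; i++) {
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++) {
            int dst = t_rowptr[colidx[p]]++;
            t_colidx[dst] = i;
            t_val[dst] = val[p];
        }
    }
    /* Shift the offsets back to restore the row pointer. */
    for (int c = n_cols; c > 0; c--) t_rowptr[c] = t_rowptr[c - 1];
    t_rowptr[0] = 0;
}
```

Because rows are visited in increasing order, the entries of each output row come out sorted by column index.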
magma_int_t magma_zpariluupdate | ( | magma_z_matrix | A, |
magma_z_preconditioner * | precond, | ||
magma_int_t | updates, | ||
magma_queue_t | queue ) |
Updates an existing preconditioner via additional iterative ILU sweeps for previous factorization initial guess (PFIG).
See Anzt et al., Parallel Computing, 2015.
[in] | A | magma_z_matrix input matrix A, current target system |
[in] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | updates | magma_int_t number of updates |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparic_csr | ( | magma_z_matrix | A, |
magma_z_matrix | A_CSR, | ||
magma_queue_t | queue ) |
This routine iteratively computes an incomplete Cholesky (IC) factorization.
For reference, see: E. Chow and A. Patel: "Fine-grained Parallel Incomplete LU Factorization", SIAM Journal on Scientific Computing, 37, C169-C193 (2015). This routine was used in the ISC 2015 paper: E. Chow et al.: "Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs", ISC High Performance 2015, LNCS 9137, pp. 1-16, 2015.
The input format of the initial guess matrix A is Magma_CSRCOO, A_CSR is CSR or CSRCOO format.
[in] | A | magma_z_matrix input matrix A - initial guess (lower triangular) |
[in,out] | A_CSR | magma_z_matrix input/output matrix containing the IC approximation |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_znonlinres | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | LU, | ||
real_Double_t * | res, | ||
magma_queue_t | queue ) |
Computes the nonlinear residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_z_matrix input sparse matrix in CSR |
[in] | L | magma_z_matrix input sparse matrix in CSR |
[in] | U | magma_z_matrix input sparse matrix in CSR |
[out] | LU | magma_z_matrix* output sparse matrix A-LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[in] | queue | magma_queue_t Queue to execute in. |
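The residual computation can be pictured on small dense matrices. The sketch below forms R = A - LU and accumulates the sum of squared entries. The name `residual_frobenius_sq` and the dense row-major storage are assumptions for illustration, not MAGMA's sparse implementation. The function returns the squared norm, so the Frobenius norm is its square root.

```c
#include <assert.h>

/* Hypothetical dense illustration, not MAGMA code: forms R = A - L*U for
 * row-major n x n matrices and returns the squared Frobenius norm of R,
 * i.e. the sum of the squares of its entries. Taking the square root of
 * the return value gives the Frobenius norm itself. */
double residual_frobenius_sq(int n, const double *A, const double *L,
                             const double *U, double *R)
{
    assert(n > 0);
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            /* Entry (i, j) of the product L*U. */
            double lu = 0.0;
            for (int k = 0; k < n; k++)
                lu += L[i * n + k] * U[k * n + j];
            R[i * n + j] = A[i * n + j] - lu;
            sum += R[i * n + j] * R[i * n + j];
        }
    }
    return sum;
}
```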
magma_int_t magma_zilures | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the ILU residual A - LU and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_z_matrix input sparse matrix in CSR |
[in] | L | magma_z_matrix input sparse matrix in CSR |
[in] | U | magma_z_matrix input sparse matrix in CSR |
[out] | LU | magma_z_matrix* output sparse matrix A-LU in CSR |
[out] | res | real_Double_t* Frobenius norm of difference |
[out] | nonlinres | real_Double_t* nonlinear residual |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zicres | ( | magma_z_matrix | A, |
magma_z_matrix | C, | ||
magma_z_matrix | CT, | ||
magma_z_matrix * | LU, | ||
real_Double_t * | res, | ||
real_Double_t * | nonlinres, | ||
magma_queue_t | queue ) |
Computes the IC residual A - CC^T and returns the difference as well as the Frobenius norm of the difference.
[in] | A | magma_z_matrix input sparse matrix in CSR |
[in] | C | magma_z_matrix input sparse matrix in CSR |
[in] | CT | magma_z_matrix input sparse matrix in CSR |
[out] | LU | magma_z_matrix* output sparse matrix A-CC^T in CSR |
[out] | res | real_Double_t* IC residual |
[out] | nonlinres | real_Double_t* nonlinear residual |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zinitguess | ( | magma_z_matrix | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Computes an initial guess for the ParILU/ParIC.
[in] | A | magma_z_matrix sparse matrix in CSR |
[out] | L | magma_z_matrix* sparse matrix in CSR |
[out] | U | magma_z_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zinitrecursiveLU | ( | magma_z_matrix | A, |
magma_z_matrix * | B, | ||
magma_queue_t | queue ) |
Using the iterative approach of computing ILU factorizations with increasing fill-in, this routine takes the input matrix A, which contains the approximate factors (both L and U), computes a matrix with one higher level of fill-in, inserts the original approximation as the initial guess, and provides the factors L and U, also filled with the scaled initial guess.
[in] | A | magma_z_matrix sparse matrix in CSR |
[out] | B | magma_z_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmLdiagadd | ( | magma_z_matrix * | L, |
magma_queue_t | queue ) |
Checks for a lower triangular matrix whether it is strictly lower triangular and in the negative case adds a unit diagonal.
It does this in-place.
[in,out] | L | magma_z_matrix* sparse matrix in CSR |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmatrix_cup | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
[in] | A | magma_z_matrix Input matrix 1. |
[in] | B | magma_z_matrix Input matrix 2. |
[out] | U | magma_z_matrix* Not a real matrix, but the list of all matrix entries included in either A or B. No duplicates. |
[in] | queue | magma_queue_t Queue to execute in. |
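The union with "value of A wins" semantics can be sketched as a merge of two sorted entry lists. The type `entry_t` and the function `pattern_union` below are hypothetical and exist only for this example; real values and row-major sorted, duplicate-free inputs are assumed.

```c
#include <assert.h>

/* Hypothetical coordinate entry used only for this illustration. */
typedef struct { int row; int col; double val; } entry_t;

/* Orders entries row-major: by row first, then by column. */
static int entry_cmp(const entry_t *a, const entry_t *b)
{
    if (a->row != b->row) return a->row < b->row ? -1 : 1;
    if (a->col != b->col) return a->col < b->col ? -1 : 1;
    return 0;
}

/* Merges two row-major sorted, duplicate-free entry lists into out, which
 * must have room for na + nb entries. A location present in both lists is
 * emitted once, with the value taken from a. Returns the entry count. */
int pattern_union(const entry_t *a, int na, const entry_t *b, int nb,
                  entry_t *out)
{
    assert(na >= 0 && nb >= 0);
    int i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        int c = entry_cmp(&a[i], &b[j]);
        if (c < 0)       out[n++] = a[i++];
        else if (c > 0)  out[n++] = b[j++];
        else           { out[n++] = a[i++]; j++; }  /* same location: keep a */
    }
    while (i < na) out[n++] = a[i++];
    while (j < nb) out[n++] = b[j++];
    return n;
}
```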
magma_int_t magma_zmatrix_cup_gpu | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix \(U = A \cup B\).
If both matrices have a nonzero value in the same location, the value of A is used.
This is the GPU version of the operation.
[in] | A | magma_z_matrix Input matrix 1. |
[in] | B | magma_z_matrix Input matrix 2. |
[out] | U | magma_z_matrix* \(U = A \cup B\). If both matrices have a nonzero value in the same location, the value of A is used. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmatrix_cap | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being in both matrices: \(U = A \cap B\).
The values in U are all ones.
[in] | A | magma_z_matrix Input matrix 1. |
[in] | B | magma_z_matrix Input matrix 2. |
[out] | U | magma_z_matrix* Not a real matrix, but the list of all matrix entries included in both A and B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmatrix_negcap | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of A but not of B.
U = A \ B. The values of A are preserved.
[in] | A | magma_z_matrix Element part of this. |
[in,out] | B | magma_z_matrix Not part of this. |
[out] | U | magma_z_matrix* Not a real matrix, but the list of all matrix entries included in A not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmatrix_tril_negcap | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a list of matrix entries being part of tril(A) but not of B.
U = tril(A) \ B. The values of A are preserved.
[in] | A | magma_z_matrix Element part of this. |
[in,out] | B | magma_z_matrix Not part of this. |
[out] | U | magma_z_matrix* Not a real matrix, but the list of all matrix entries included in tril(A) but not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmatrix_triu_negcap | ( | magma_z_matrix | A, |
magma_z_matrix | B, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
Generates a matrix with entries being part of triu(A) but not of B.
U = triu(A) \ B. The values of A are preserved.
[in] | A | magma_z_matrix Element part of this. |
[in] | B | magma_z_matrix Not part of this. |
[out] | U | magma_z_matrix* Matrix containing the entries of triu(A) that are not in B. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmatrix_abssum | ( | magma_z_matrix | A, |
double * | sum, | ||
magma_queue_t | queue ) |
Computes the sum of the absolute values in a matrix.
[in] | A | magma_z_matrix Element list/matrix. |
[out] | sum | double* Sum of the absolute values. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_thrsrm | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Removes every element whose absolute value is smaller than or equal to thrs (or larger than or equal to thrs, depending on order) from the matrix and compacts the remaining entries.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded |
[in,out] | A | magma_z_matrix* Matrix where elements are removed. |
[in] | thrs | double* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
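Threshold removal with compaction amounts to a single pass over the CSR arrays that copies the surviving entries forward. The sketch below covers only the "discard small entries" case (order == 1 above) and keeps entries whose magnitude equals the threshold, which is a choice made for the example. The name `csr_drop_small`, real `double` values, and zero-based indices are likewise assumptions, not MAGMA's implementation.

```c
#include <assert.h>

/* Hypothetical CSR illustration, not MAGMA code: drops every entry whose
 * absolute value is below thrs and compacts rowptr, colidx and val in
 * place. rowptr has num_rows + 1 entries. Returns the new entry count. */
int csr_drop_small(int num_rows, int *rowptr, int *colidx, double *val,
                   double thrs)
{
    assert(num_rows >= 0);
    int out = 0;
    int begin = rowptr[0];
    for (int i = 0; i < num_rows; i++) {
        int end = rowptr[i + 1];   /* read before it is overwritten */
        rowptr[i] = out;
        for (int p = begin; p < end; p++) {
            double mag = val[p] < 0.0 ? -val[p] : val[p];
            if (mag >= thrs) {
                /* out never exceeds p, so copying forward is safe. */
                colidx[out] = colidx[p];
                val[out] = val[p];
                out++;
            }
        }
        begin = end;
    }
    rowptr[num_rows] = out;
    return out;
}
```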
magma_int_t magma_zparilut_thrsrm_semilinked | ( | magma_z_matrix * | U, |
magma_z_matrix * | US, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Removes any element with absolute value smaller than thrs from the matrix.
It only uses the linked list and skips the "removed" elements.
[in,out] | A | magma_z_matrix* Matrix where elements are removed. |
[in] | thrs | double* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_rmselected | ( | magma_z_matrix | R, |
magma_z_matrix * | A, | ||
magma_queue_t | queue ) |
Removes a selected list of elements from the matrix.
[in] | R | magma_z_matrix Matrix containing elements to be removed. |
[in,out] | A | magma_z_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_selectoneperrow | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==1 -> largest; order==0 -> smallest |
[in] | A | magma_z_matrix* Matrix where elements are removed. |
[out] | oneA | magma_z_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
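Selecting one candidate per row reduces to a per-row maximum over magnitudes. The following sketch illustrates the "largest" case on CSR data. The name `csr_largest_per_row`, the `-1` marker for empty rows, and real `double` values are assumptions for this example and not MAGMA's interface.

```c
#include <assert.h>

/* Hypothetical CSR illustration, not MAGMA code: for each row, records
 * the column and value of the entry with the largest absolute value.
 * Rows without any candidate get column -1 and value 0. */
void csr_largest_per_row(int num_rows, const int *rowptr, const int *colidx,
                         const double *val, int *best_col, double *best_val)
{
    assert(num_rows >= 0);
    for (int i = 0; i < num_rows; i++) {
        best_col[i] = -1;
        best_val[i] = 0.0;
        double best_mag = -1.0;
        for (int p = rowptr[i]; p < rowptr[i + 1]; p++) {
            double mag = val[p] < 0.0 ? -val[p] : val[p];
            /* Strict comparison: on ties the first candidate is kept. */
            if (mag > best_mag) {
                best_mag = mag;
                best_col[i] = colidx[p];
                best_val[i] = val[p];
            }
        }
    }
}
```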
magma_int_t magma_zparilut_selecttwoperrow | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==1 -> largest; order==0 -> smallest |
[in] | A | magma_z_matrix* Matrix where elements are removed. |
[out] | oneA | magma_z_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_selectoneperrowthrs_lower | ( | magma_z_matrix | L, |
magma_z_matrix | U, | ||
magma_z_matrix * | A, | ||
double | rtol, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_z_matrix Current lower triangular factor. |
[in] | U | magma_z_matrix Current upper triangular factor. |
[in] | A | magma_z_matrix* All residuals in L. |
[in] | rtol | double threshold rtol |
[out] | oneA | magma_z_matrix* At most one element per row, if larger than thrs. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_selectoneperrowthrs_upper | ( | magma_z_matrix | L, |
magma_z_matrix | U, | ||
magma_z_matrix * | A, | ||
double | rtol, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_z_matrix Current lower triangular factor. |
[in] | U | magma_z_matrix Current upper triangular factor. |
[in] | A | magma_z_matrix* All residuals in L. |
[in] | rtol | double threshold rtol |
[out] | oneA | magma_z_matrix* At most one element per row, if larger than thrs. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_selectonepercol | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_transpose_select_one | ( | magma_z_matrix | A, |
magma_z_matrix * | B, | ||
magma_queue_t | queue ) |
This is a special routine with very limited scope.
For a set of fill-in candidates in row-major format, it transposes a submatrix, i.e. the submatrix consisting of the largest element in every column. This function is only useful for delta<=1.
[in] | A | magma_z_matrix Matrix to transpose. |
[out] | B | magma_z_matrix* Transposed matrix containing only largest elements in each col. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_insert_LU | ( | magma_int_t | num_rm, |
magma_index_t * | rm_loc, | ||
magma_index_t * | rm_loc2, | ||
magma_z_matrix * | LU_new, | ||
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_set_thrs | ( | magma_int_t | num_rm, |
magma_z_matrix * | LU, | ||
magma_int_t | order, | ||
magmaDoubleComplex * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_set_approx_thrs | ( | magma_int_t | num_rm, |
magma_z_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_z_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_set_thrs_randomselect | ( | magma_int_t | num_rm, |
magma_z_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_z_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_set_thrs_randomselect_approx | ( | magma_int_t | num_rm, |
magma_z_matrix * | LU, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_z_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_set_thrs_randomselect_factors | ( | magma_int_t | num_rm, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_int_t | order, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine approximates the threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
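[in] | L | magma_z_matrix* Current ILU approximation, lower triangular factor. |
[in] | U | magma_z_matrix* Current ILU approximation, upper triangular factor. |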
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | double* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_set_exact_thrs | ( | magma_int_t | num_rm, |
magma_z_matrix * | LU, | ||
magma_int_t | order, | ||
magmaDoubleComplex * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_z_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | magmaDoubleComplex* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_set_approx_thrs_inc | ( | magma_int_t | num_rm, |
magma_z_matrix * | LU, | ||
magma_int_t | order, | ||
magmaDoubleComplex * | thrs, | ||
magma_queue_t | queue ) |
This routine provides the exact threshold for removing num_rm elements.
[in] | num_rm | magma_int_t Number of Elements that are replaced. |
[in] | LU | magma_z_matrix* Current ILU approximation. |
[in] | order | magma_int_t Sort goal function: 0 = smallest, 1 = largest. |
[out] | thrs | magmaDoubleComplex* Size of the num_rm-th smallest element. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_LU_approx_thrs | ( | magma_int_t | num_rm, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_int_t | order, | ||
magmaDoubleComplex * | thrs, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_reorder | ( | magma_z_matrix * | LU, |
magma_queue_t | queue ) |
This routine reorders the matrix (in place) for easier access.
[in] | LU | magma_z_matrix* Current ILU approximation. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparict_sweep | ( | magma_z_matrix * | A, |
magma_z_matrix * | LU, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_zero | ( | magma_z_matrix * | A, |
magma_queue_t | queue ) |
magma_int_t magma_zparilu_sweep | ( | magma_z_matrix | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep.
Input and output arrays are identical.
[in] | A | magma_z_matrix System matrix in COO. |
[in] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_z_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
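One sweep of the fixed-point iteration from the Chow and Patel paper cited earlier updates every location (i, j) in the pattern of A from the current values of L and U. The sketch below writes that update out for dense storage, in place, so that later updates already see earlier ones. The name `parilu_sweep_dense`, the dense row-major arrays, and real `double` values are assumptions for illustration; MAGMA's routine works on sparse formats and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical dense-storage illustration, not MAGMA code: one in-place
 * fixed-point sweep of the Chow-Patel ParILU iteration. The nnz entries
 * of A are given in coordinate form (row, col, val); L and U are
 * row-major n x n arrays holding the current iterate, with L having a
 * unit diagonal. Only locations in the pattern of A are updated. */
void parilu_sweep_dense(int n, int nnz, const int *row, const int *col,
                        const double *val, double *L, double *U)
{
    assert(n > 0 && nnz >= 0);
    for (int e = 0; e < nnz; e++) {
        int i = row[e], j = col[e];
        int m = i < j ? i : j;          /* m = min(i, j) */
        double s = val[e];
        /* s = a_ij - sum_{k < min(i,j)} l_ik * u_kj */
        for (int k = 0; k < m; k++)
            s -= L[i * n + k] * U[k * n + j];
        if (i > j)
            L[i * n + j] = s / U[j * n + j];   /* strictly lower part */
        else
            U[i * n + j] = s;                  /* upper part incl. diagonal */
    }
}
```

Executed serially in row-major order on a full pattern, a single sweep already yields the exact factors for a small example. When the updates run concurrently, some of them may read values from the previous iterate, so several sweeps are typically applied.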
magma_int_t magma_zparilu_sweep_sync | ( | magma_z_matrix | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_z_matrix System matrix in COO. |
[in] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_z_matrix* Current approximation for the upper triangular factor. The format is sorted CSC (U^T in CSR). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparic_sweep | ( | magma_z_matrix | A, |
magma_z_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one asynchronous ParILU sweep (symmetric case).
Input and output arrays are identical.
[in] | A | magma_z_matrix System matrix in COO. |
[in] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparic_sweep_sync | ( | magma_z_matrix | A, |
magma_z_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep (symmetric case).
Input and output are different arrays.
[in] | A | magma_z_matrix System matrix in COO. |
[in] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparict_sweep_sync | ( | magma_z_matrix * | A, |
magma_z_matrix * | L, | ||
magma_queue_t | queue ) |
This function does one synchronized ParILU sweep.
Input and output are different arrays.
[in] | A | magma_z_matrix* System matrix. |
[in] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_z_matrix* Current approximation for the upper triangular factor. The format is sorted CSC. |
[out] | L_new | magma_z_matrix* Current approximation for the lower triangular factor. The format is unsorted CSR. |
[out] | U_new | magma_z_matrix* Current approximation for the upper triangular factor. The format is unsorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_sweep_sync | ( | magma_z_matrix * | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference from the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can differ. The pattern that determines which elements are iterated is hence the pattern of L and U, not that of A.
This is the CPU version of the synchronous ParILUT sweep.
[in] | A | magma_z_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition. |
[in] | queue | magma_queue_t Queue to execute in. |
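The fixed-point update behind such a sweep can be illustrated with a minimal sketch. It is not the MAGMA implementation: it assumes real (double) values instead of magmaDoubleComplex, a hypothetical `csr_t` container instead of magma_z_matrix, U stored transposed (so that row j of `UT` is column j of U), sorted column indices, and a sequential in-place update. All names below are illustrative.

```c
#include <assert.h>

/* Hypothetical minimal CSR container for this sketch (not a MAGMA type). */
typedef struct {
    int n;              /* number of rows */
    const int *rowptr;  /* row pointers, length n + 1 */
    const int *col;     /* column indices, sorted within each row */
    double *val;        /* values */
} csr_t;

/* Return M(i,j), or 0.0 if the location is not part of the pattern. */
static double csr_get(const csr_t *M, int i, int j) {
    for (int p = M->rowptr[i]; p < M->rowptr[i + 1]; ++p)
        if (M->col[p] == j) return M->val[p];
    return 0.0;
}

/* Sum of L(i,k) * U(k,j) over all k < kmax present in both patterns.
   Column j of U is read as row j of UT. */
static double sparse_dot(const csr_t *L, int i, const csr_t *UT, int j, int kmax) {
    int p = L->rowptr[i], pe = L->rowptr[i + 1];
    int q = UT->rowptr[j], qe = UT->rowptr[j + 1];
    double s = 0.0;
    while (p < pe && q < qe) {
        int kl = L->col[p], ku = UT->col[q];
        if (kl >= kmax || ku >= kmax) break;  /* sorted: no further match below kmax */
        if (kl == ku) { s += L->val[p] * UT->val[q]; ++p; ++q; }
        else if (kl < ku) ++p;
        else ++q;
    }
    return s;
}

/* One sequential sweep over the patterns of L and U (not over the pattern of A). */
void parilut_sweep_sketch(const csr_t *A, csr_t *L, csr_t *UT) {
    for (int i = 0; i < L->n; ++i) {
        for (int p = L->rowptr[i]; p < L->rowptr[i + 1]; ++p) {
            int j = L->col[p];
            if (j == i) { L->val[p] = 1.0; continue; }  /* unit diagonal of L */
            double ujj = csr_get(UT, j, j);
            assert(ujj != 0.0);
            L->val[p] = (csr_get(A, i, j) - sparse_dot(L, i, UT, j, j)) / ujj;
        }
    }
    for (int j = 0; j < UT->n; ++j) {
        for (int q = UT->rowptr[j]; q < UT->rowptr[j + 1]; ++q) {
            int i = UT->col[q];  /* entry U(i,j) with i <= j */
            UT->val[q] = csr_get(A, i, j) - sparse_dot(L, i, UT, j, i);
        }
    }
}
```

Locations that are in the pattern of L or U but not in A simply see A(i,j) = 0, which is how differing patterns are handled in this sketch.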
magma_int_t magma_zparilut_sweep_gpu | ( | magma_z_matrix * | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILUT sweep.
The difference from the ParILU sweep is that the nonzero pattern of A and the incomplete factors L and U can differ. The pattern that determines which elements are iterated is hence the pattern of L and U, not that of A. L has a unit diagonal.
This is the GPU version of the asynchronous ParILUT sweep.
[in] | A | magma_z_matrix* System matrix. The format is sorted CSR. |
[in,out] | L | magma_z_matrix* Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_residuals_gpu | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_z_matrix System matrix. The format is sorted CSR. |
[in] | L | magma_z_matrix Current approximation for the lower triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition. |
[in] | U | magma_z_matrix Current approximation for the upper triangular factor. The format is MAGMA_CSRCOO, i.e. sorted CSR with the row indices stored in addition. |
[in,out] | R | magma_z_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zthrsholdrm_gpu | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Purpose
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | order | magma_int_t dummy variable for now. |
[in,out] | A | magma_z_matrix* input/output matrix where elements are removed |
[out] | thrs | double* computed threshold |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zget_row_ptr | ( | const magma_int_t | num_rows, |
magma_int_t * | nnz, | ||
const magma_index_t * | rowidx, | ||
magma_index_t * | rowptr, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_align_residuals | ( | magma_z_matrix | L, |
magma_z_matrix | U, | ||
magma_z_matrix * | Lnew, | ||
magma_z_matrix * | Unew, | ||
magma_queue_t | queue ) |
This function scales the residuals of a lower triangular factor L with the diagonal of U.
The intention is to generate a good initial guess for inserting the elements.
[in] | L | magma_z_matrix Current approximation for the lower triangular factor The format is sorted CSR. |
[in] | U | magma_z_matrix Current approximation for the upper triangular factor The format is sorted CSC. |
[in] | hL | magma_z_matrix* Current approximation for the lower triangular factor The format is sorted CSR. |
[in] | hU | magma_z_matrix* Current approximation for the upper triangular factor The format is sorted CSC. |
[in] | queue | magma_queue_t Queue to execute in. |
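As a rough illustration of scaling residuals with the diagonal of U, the sketch below divides each candidate value in column j by U(j,j). This mirrors the division by the diagonal in the ParILU update for L, but the exact scaling used by the library is an assumption here, as are the function name, the flat COO-style arrays, the dense `udiag` array and the use of real values.

```c
#include <assert.h>

/* Scale each residual located in column col[e] by the diagonal entry
   udiag[col[e]] of U. Hypothetical helper, not part of MAGMA. */
void scale_residuals_by_diag(int nnz, const int *col, double *val,
                             const double *udiag) {
    for (int e = 0; e < nnz; ++e) {
        double d = udiag[col[e]];
        assert(d != 0.0);  /* a zero pivot would make the guess meaningless */
        val[e] /= d;
    }
}
```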
magma_int_t magma_zparilut_preselect_scale | ( | magma_z_matrix * | L, |
magma_z_matrix * | oneL, | ||
magma_z_matrix * | U, | ||
magma_z_matrix * | oneU, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | L | magma_z_matrix* Matrix where elements are removed. |
[in] | U | magma_z_matrix* Matrix where elements are removed. |
[out] | oneL | magma_z_matrix* Matrix where elements are removed. |
[out] | oneU | magma_z_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_thrsrm_U | ( | magma_int_t | order, |
magma_z_matrix | L, | ||
magma_z_matrix * | A, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
Removes every element whose absolute value is less than or equal to thrs (order == 1) or greater than or equal to thrs (order == 0) from the matrix and compacts the remaining entries.
[in] | order | magma_int_t order == 1: all elements smaller are discarded; order == 0: all elements larger are discarded. |
[in,out] | A | magma_z_matrix* Matrix where elements are removed. |
[in] | thrs | double* Threshold: all elements smaller are discarded |
[in] | queue | magma_queue_t Queue to execute in. |
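A threshold removal with in-place compaction can be sketched on a plain CSR structure as follows. This is only an illustration under stated assumptions: real values instead of complex ones, raw arrays instead of magma_z_matrix, inclusive comparisons, and a hypothetical function name.

```c
#include <assert.h>

/* Magnitude of a real value (the library routine works on complex entries). */
static double magnitude(double v) { return v < 0.0 ? -v : v; }

/* Remove entries from a CSR matrix in place and compact the arrays.
   order == 1 discards |v| <= thrs, order == 0 discards |v| >= thrs.
   Returns the new number of stored entries. */
int csr_threshold_remove(int order, int n, int *rowptr, int *col,
                         double *val, double thrs) {
    assert(order == 0 || order == 1);
    int out = 0;            /* next free slot in the compacted arrays */
    int start = rowptr[0];  /* old start of the current row */
    for (int i = 0; i < n; ++i) {
        int end = rowptr[i + 1];  /* read the old row end before overwriting */
        rowptr[i] = out;
        for (int p = start; p < end; ++p) {
            double m = magnitude(val[p]);
            int discard = order ? (m <= thrs) : (m >= thrs);
            if (!discard) {
                col[out] = col[p];
                val[out] = val[p];
                ++out;
            }
        }
        start = end;
    }
    rowptr[n] = out;
    return out;
}
```

Because the write position never overtakes the read position, no second buffer is needed in this sequential version.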
magma_int_t magma_zparilut_residuals | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | R, | ||
magma_queue_t | queue ) |
This function computes the ILU residual in the locations included in the sparsity pattern of R.
[in] | A | magma_z_matrix System matrix A. |
[in] | L | magma_z_matrix Current approximation for the lower triangular factor. The format is sorted CSR. |
[in] | U | magma_z_matrix Current approximation for the upper triangular factor. The format is sorted CSR. |
[in,out] | R | magma_z_matrix* Sparsity pattern on which the ILU residual is computed. R is in COO format. On output, R contains the ILU residual. |
[in] | queue | magma_queue_t Queue to execute in. |
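The residual evaluation on a given pattern can be sketched as below: for every location (i,j) listed in R, the value A(i,j) minus the inner product of row i of L and column j of U is stored. The sketch assumes real values, a hypothetical `csr_t` container, and R passed as three COO arrays; none of these names come from MAGMA.

```c
/* Hypothetical minimal CSR container for this sketch (not a MAGMA type). */
typedef struct {
    int n;              /* number of rows */
    const int *rowptr;  /* row pointers, length n + 1 */
    const int *col;     /* column indices */
    const double *val;  /* values */
} csr_t;

/* Return M(i,j), or 0.0 if the location is not stored. */
static double csr_get(const csr_t *M, int i, int j) {
    for (int p = M->rowptr[i]; p < M->rowptr[i + 1]; ++p)
        if (M->col[p] == j) return M->val[p];
    return 0.0;
}

/* For each COO location (rrow[e], rcol[e]) compute
   rval[e] = A(i,j) - sum_k L(i,k) * U(k,j). */
void ilu_residuals_sketch(const csr_t *A, const csr_t *L, const csr_t *U,
                          int nnz, const int *rrow, const int *rcol,
                          double *rval) {
    for (int e = 0; e < nnz; ++e) {
        int i = rrow[e], j = rcol[e];
        double s = 0.0;
        /* walk row i of L and look up the matching entry in column j of U */
        for (int p = L->rowptr[i]; p < L->rowptr[i + 1]; ++p)
            s += L->val[p] * csr_get(U, L->col[p], j);
        rval[e] = csr_get(A, i, j) - s;
    }
}
```

Each location is independent of the others, which is what makes this step easy to parallelize.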
magma_int_t magma_zparilut_residuals_transpose | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_z_matrix Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_residuals_semilinked | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | US, | ||
magma_z_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_z_matrix Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_sweep_semilinked | ( | magma_z_matrix * | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | US, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_z_matrix* Current ILU approximation The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | L | magma_z_matrix* Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_sweep_list | ( | magma_z_matrix * | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_z_matrix* Current ILU approximation The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | L | magma_z_matrix* Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_residuals_list | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_z_matrix Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_sweep_linkedlist | ( | magma_z_matrix * | A, |
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_queue_t | queue ) |
This function does a ParILU sweep.
[in,out] | A | magma_z_matrix* Current ILU approximation The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | L | magma_z_matrix* Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_residuals_linkedlist | ( | magma_z_matrix | A, |
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function computes the residuals.
[in,out] | L | magma_z_matrix Current approximation for the lower triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* Current approximation for the upper triangular factor The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_colmajor | ( | magma_z_matrix | A, |
magma_z_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_z_matrix The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | AC | magma_z_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. |
[in] | queue | magma_queue_t Queue to execute in. |
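One way to build a column pointer and a linked list along the columns, while leaving the row-major values and indices untouched, is sketched below. The array names, the -1 end-of-list sentinel and the backward traversal are assumptions made for this illustration; the library's internal layout may differ.

```c
/* Build column-wise linked lists over the entries of a row-major CSR matrix.
   colhead[j] receives the index of the first stored entry in column j
   (or -1 if the column is empty); next[p] receives the index of the next
   entry in the same column as entry p (or -1 at the end of the list).
   Hypothetical helper, not part of MAGMA. */
void build_col_lists(int nnz, int n_cols, const int *col,
                     int *colhead, int *next) {
    for (int j = 0; j < n_cols; ++j)
        colhead[j] = -1;
    /* Going backwards and pushing to the front yields lists ordered by
       increasing entry index, hence by increasing row. */
    for (int p = nnz - 1; p >= 0; --p) {
        next[p] = colhead[col[p]];
        colhead[col[p]] = p;
    }
}
```

Traversing column j then amounts to starting at `colhead[j]` and following `next` until -1 is reached.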
magma_int_t magma_zparilut_colmajorup | ( | magma_z_matrix | A, |
magma_z_matrix * | AC, | ||
magma_queue_t | queue ) |
This function creates a col-pointer and a linked list along the columns for a row-major CSR matrix.
[in] | A | magma_z_matrix The format is unsorted CSR, the list array is used as linked list pointing to the respectively next entry. |
[in,out] | AC | magma_z_matrix* The matrix A but with row-pointer being for col-major, same with linked list. The values, col and row indices are unchanged. The respective pointers point to the entities of A. Already allocated. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparict | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete Cholesky preconditioner.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2016.
This function requires OpenMP, and is only available if OpenMP is activated.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparict_cpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold Cholesky preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern. It is the variant for SPD systems.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares the iterative threshold Incomplete LU preconditioner.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz constant)
precond.rtol : how many candidates are added to the sparsity pattern
  1.0 : one per row
  < 1.0 : a fraction of those
  > 1.0 : all candidates
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_cpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_gpu | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_gpu_nodp | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Generates an incomplete threshold LU preconditioner via the ParILUT algorithm.
The strategy is to interleave a parallel fixed-point iteration that approximates an incomplete factorization for a given nonzero pattern with a procedure that adaptively changes the pattern. Much of this algorithm has fine-grained parallelism, and can efficiently exploit the compute power of shared memory architectures.
This is the routine used in the publication by Anzt, Chow, Dongarra: ''ParILUT - A new parallel threshold ILU factorization'' submitted to SIAM SISC in 2017.
This version uses the default setting which adds all candidates to the sparsity pattern.
This function requires OpenMP, and is only available if OpenMP is activated.
The parameter list is:
precond.sweeps : number of ParILUT steps
precond.atol : absolute fill ratio (1.0 keeps nnz count constant)
This routine is the same as magma_zparilut_gpu(), except that it uses no dynamic parallelism.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_insert | ( | magma_int_t * | num_rmL, |
magma_int_t * | num_rmU, | ||
magma_index_t * | rm_locL, | ||
magma_index_t * | rm_locU, | ||
magma_z_matrix * | L_new, | ||
magma_z_matrix * | U_new, | ||
magma_z_matrix * | L, | ||
magma_z_matrix * | U, | ||
magma_z_matrix * | UR, | ||
magma_queue_t | queue ) |
Inserts a new element into the (empty) location for the iterative dynamic ILU.
[in] | num_rmL | magma_int_t Number of Elements that are replaced in L. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced in U. |
[in] | rm_locL | magma_index_t* List containing the locations of the deleted elements. |
[in] | rm_locU | magma_index_t* List containing the locations of the deleted elements. |
[in] | L_new | magma_z_matrix Elements that will be inserted in L stored in COO format (unsorted). |
[in] | U_new | magma_z_matrix Elements that will be inserted in U stored in COO format (unsorted). |
[in,out] | L | magma_z_matrix* matrix where new elements are inserted. The format is unsorted CSR, list is used as linked list pointing to the respectively next entry. |
[in,out] | U | magma_z_matrix* matrix where new elements are inserted. Row-major. The format is unsorted CSR, list is used as linked list pointing to the respectively next entry. |
[in,out] | UR | magma_z_matrix* Same matrix as U, but column-major. The format is unsorted CSR, list is used as linked list pointing to the respectively next entry. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_create_collinkedlist | ( | magma_z_matrix | A, |
magma_z_matrix * | B, | ||
magma_queue_t | queue ) |
For the matrix U in CSR (row-major) this creates B containing a row-ptr to the columns and a linked list for the elements.
[in] | A | magma_z_matrix Matrix to transpose. |
[out] | B | magma_z_matrix* Transposed matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_candidates | ( | magma_z_matrix | L0, |
magma_z_matrix | U0, | ||
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | L_new, | ||
magma_z_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates like they appear as ILU1 fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_z_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_z_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_z_matrix Current lower triangular factor. |
[in] | U | magma_z_matrix Current upper triangular factor. |
[in,out] | L_new | magma_z_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_z_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_candidates_gpu | ( | magma_z_matrix | L0, |
magma_z_matrix | U0, | ||
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix * | L_new, | ||
magma_z_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the locations with a potential nonzero ILU residual R = A - L*U where L and U are the current incomplete factors.
Nonzero ILU residuals are possible
1. where A is nonzero but L and U have no nonzero entry,
2. where the product L*U has fill-in but the location is not included in L or U.
We assume that the incomplete factors are exact for the elements included in the current pattern.
This is the GPU implementation of the candidate search.
Four GPU kernels are used: the first is a dry run assessing the memory need, the second computes the candidate locations, the third eliminates duplicate entries, and the fourth ensures the elements in a row are sorted by increasing column index.
[in] | L0 | magma_z_matrix tril(ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_z_matrix triu(ILU(0) ) pattern of original system matrix. |
[in] | L | magma_z_matrix Current lower triangular factor. |
[in] | U | magma_z_matrix Current upper triangular factor. |
[in,out] | L_new | magma_z_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_z_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
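The two sources of candidates can be made concrete with a small dense-pattern sketch. It marks every location where A is nonzero or where L*U produces fill-in, and which is not yet part of the corresponding factor. Dense 0/1 masks and the cubic loop are simplifications chosen for readability; they are assumptions of this sketch and unrelated to the sparse GPU kernels.

```c
/* Mark candidate locations in the n x n mask cand (row-major, 0 or 1).
   A, L and U are dense 0/1 pattern masks. Returns the number of candidates.
   Hypothetical helper, not part of MAGMA. */
int find_candidates_dense(int n, const unsigned char *A,
                          const unsigned char *L, const unsigned char *U,
                          unsigned char *cand) {
    int count = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            /* strictly lower locations belong to L, the rest to U */
            int in_factor = (i > j) ? L[i * n + j] : U[i * n + j];
            /* case 1: A is nonzero at (i,j) */
            int potential = A[i * n + j];
            /* case 2: the product L*U has fill-in at (i,j) */
            for (int k = 0; k < n && !potential; ++k)
                potential = L[i * n + k] && U[k * n + j];
            cand[i * n + j] = (unsigned char)(potential && !in_factor);
            count += cand[i * n + j];
        }
    }
    return count;
}
```

For an arrow-shaped 3 x 3 pattern with a full first row and column, the only candidates are the fill-in locations (1,2) and (2,1).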
magma_int_t magma_zparict_candidates | ( | magma_z_matrix | L0, |
magma_z_matrix | L, | ||
magma_z_matrix | LT, | ||
magma_z_matrix * | L_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates like they appear as ILU1 fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_z_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_z_matrix Current lower triangular factor. |
[in] | LT | magma_z_matrix Transpose of the lower triangular factor. |
[in,out] | L_new | magma_z_matrix* List of candidates for L in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_candidates_semilinked | ( | magma_z_matrix | L0, |
magma_z_matrix | U0, | ||
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix | UT, | ||
magma_z_matrix * | L_new, | ||
magma_z_matrix * | U_new, | ||
magma_queue_t | queue ) |
This function identifies the candidates like they appear as ILU1 fill-in.
In this version, the matrices are assumed unordered; the linked list is traversed to access the entries of a row.
[in] | L0 | magma_z_matrix tril( ILU(0) ) pattern of original system matrix. |
[in] | U0 | magma_z_matrix triu( ILU(0) ) pattern of original system matrix. |
[in] | L | magma_z_matrix Current lower triangular factor. |
[in] | U | magma_z_matrix Current upper triangular factor transposed. |
[in] | UR | magma_z_matrix Current upper triangular factor - col-pointer and col-list. |
[in,out] | L_new | magma_z_matrix* List of candidates for L in COO format. |
[in,out] | U_new | magma_z_matrix* List of candidates for U in COO format. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_candidates_linkedlist | ( | magma_z_matrix | L0, |
magma_z_matrix | U0, | ||
magma_z_matrix | L, | ||
magma_z_matrix | U, | ||
magma_z_matrix | UR, | ||
magma_z_matrix * | L_new, | ||
magma_z_matrix * | U_new, | ||
magma_queue_t | queue ) |
magma_int_t magma_zparilut_rm_thrs | ( | double * | thrs, |
magma_int_t * | num_rm, | ||
magma_z_matrix * | LU, | ||
magma_z_matrix * | LU_new, | ||
magma_index_t * | rm_loc, | ||
magma_queue_t | queue ) |
This routine removes matrix entries from the structure that are smaller than the threshold.
It only counts the elements deleted, does not save the locations.
[out] | thrs | double* Threshold for removing elements. |
[out] | num_rm | magma_int_t* Number of Elements that have been removed. |
[in,out] | LU | magma_z_matrix* Current ILU approximation where the identified smallest components are deleted. |
[in,out] | LUC | magma_z_matrix* Corresponding col-list. |
[in,out] | LU_new | magma_z_matrix* List of candidates in COO format. |
[out] | rm_loc | magma_index_t* List containing the locations of the elements deleted. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_count | ( | magma_z_matrix | L, |
magma_int_t * | num, | ||
magma_queue_t | queue ) |
This is a helper routine counting elements in a matrix in unordered Magma_CSRLIST format.
[in] | L | magma_z_matrix* Matrix in Magma_CSRLIST format |
[out] | num | magma_index_t* Number of elements counted. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_randlist | ( | magma_z_matrix * | LU, |
magma_queue_t | queue ) |
magma_int_t magma_zparilut_select_candidates_L | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_z_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_z_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_z_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_select_candidates_U | ( | magma_int_t * | num_rm, |
magma_index_t * | rm_loc, | ||
magma_z_matrix * | L_new, | ||
magma_queue_t | queue ) |
Screens the new candidates for multiple elements in the same row.
We allow for at most one new element per row. This changes the algorithm, but pays off in performance.
[in] | num_rmL | magma_int_t Number of Elements that are replaced. |
[in] | num_rmU | magma_int_t Number of Elements that are replaced. |
[in] | rm_loc | magma_int_t* Number of Elements that are replaced by distinct threads. |
[in] | L_new | magma_z_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | U_new | magma_z_matrix* Elements that will be inserted stored in COO format (unsorted). |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zparilut_preselect | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==0 lower triangular order==1 upper triangular |
[in] | A | magma_z_matrix* Matrix where elements are removed. |
[out] | oneA | magma_z_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
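Selecting the largest candidate in every row can be sketched as a single pass over a COO list. The sketch assumes real values, plain arrays instead of magma_z_matrix, and a column index of -1 to mark rows without a candidate (their value stays 0.0); these conventions are illustrative and not taken from the library.

```c
/* Magnitude of a real value (the library routine works on complex entries). */
static double magnitude(double v) { return v < 0.0 ? -v : v; }

/* For every row, keep the candidate with the largest magnitude.
   Rows without a candidate get out_col = -1 and out_val = 0.0.
   Hypothetical helper, not part of MAGMA. */
void preselect_largest_per_row(int n_rows, int nnz, const int *row,
                               const int *col, const double *val,
                               int *out_col, double *out_val) {
    for (int i = 0; i < n_rows; ++i) {
        out_col[i] = -1;
        out_val[i] = 0.0;
    }
    for (int e = 0; e < nnz; ++e) {
        int i = row[e];
        if (magnitude(val[e]) > magnitude(out_val[i])) {
            out_col[i] = col[e];
            out_val[i] = val[e];
        }
    }
}
```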
magma_int_t magma_zpreselect_gpu | ( | magma_int_t | order, |
magma_z_matrix * | A, | ||
magma_z_matrix * | oneA, | ||
magma_queue_t | queue ) |
This function takes a list of candidates with residuals, and selects the largest in every row.
The output matrix only contains these largest elements (respectively a zero element if there is no candidate for a certain row).
[in] | order | magma_int_t order==0 lower triangular order==1 upper triangular |
[in] | A | magma_z_matrix* Matrix where elements are removed. |
[out] | oneA | magma_z_matrix* Matrix where elements are removed. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zsampleselect | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaDoubleComplex * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaDoubleComplex array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
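What such a threshold means can be shown with a reference computation that sorts the magnitudes. Note that this swaps in a full sort via `qsort` for clarity; it is not the sample-select algorithm. The sketch also assumes real values and adopts the convention that the threshold is the magnitude at sorted position subset_size, so that entries strictly below it form the selected subset when magnitudes are distinct.

```c
#include <assert.h>
#include <stdlib.h>

/* Ascending comparison of two doubles for qsort. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Return a threshold such that, for distinct magnitudes, exactly
   subset_size entries of val have a magnitude strictly below it.
   Hypothetical reference helper based on a full sort, not part of MAGMA. */
double select_threshold_by_sort(const double *val, int total_size,
                                int subset_size) {
    assert(total_size > 0 && subset_size >= 0 && subset_size < total_size);
    double *mag = malloc((size_t)total_size * sizeof *mag);
    assert(mag != NULL);
    for (int i = 0; i < total_size; ++i)
        mag[i] = val[i] < 0.0 ? -val[i] : val[i];
    qsort(mag, (size_t)total_size, sizeof *mag, cmp_double);
    double thrs = mag[subset_size];
    free(mag);
    return thrs;
}
```

A result like this can serve as a baseline when checking an approximate selection routine on small inputs.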
magma_int_t magma_zsampleselect_approx | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaDoubleComplex * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaDoubleComplex array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zsampleselect_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaDoubleComplex * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP).
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaDoubleComplex array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zsampleselect_approx_nodp | ( | magma_int_t | total_size, |
magma_int_t | subset_size, | ||
magmaDoubleComplex * | val, | ||
double * | thrs, | ||
magma_ptr * | tmp_ptr, | ||
magma_int_t * | tmp_size, | ||
magma_queue_t | queue ) |
This routine selects an approximate threshold separating the subset_size smallest magnitude elements from the rest.
This routine does not use dynamic parallelism (DP).
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaDoubleComplex array containing the values |
[out] | thrs | double* computed threshold |
[in,out] | tmp_ptr | magma_ptr* pointer to pointer to temporary storage. May be reallocated during execution. |
[in,out] | tmp_size | magma_int_t* pointer to size of temporary storage. May be increased during execution. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmprepare_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix | L, | ||
magma_z_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaDoubleComplex * | trisystems, | ||
magmaDoubleComplex * | rhs, | ||
magma_queue_t | queue ) |
Takes a sparse matrix and generates three arrays: one containing the sizes of the different systems, one containing the indices of the locations in the sparse matrix where the data comes from and goes back to, and one containing all the sparse triangular systems.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_z_matrix Matrix in CSR format |
[in] | LC | magma_z_matrix same matrix, also CSR, but col-major |
[in,out] | sizes | magma_index_t* Number of Elements that are replaced. |
[in,out] | locations | magma_index_t* Array indicating the locations. |
[in,out] | trisystems | magmaDoubleComplex* trisystems |
[in,out] | rhs | magmaDoubleComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmtrisolve_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix | L, | ||
magma_z_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaDoubleComplex * | trisystems, | ||
magmaDoubleComplex * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_z_matrix Matrix in CSR format |
[in] | LC | magma_z_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaDoubleComplex* trisystems |
[out] | rhs | magmaDoubleComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmbackinsert_batched | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaDoubleComplex * | trisystems, | ||
magmaDoubleComplex * | rhs, | ||
magma_queue_t | queue ) |
Inserts the values into the preconditioner matrix.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in,out] | M | magma_z_matrix* SPAI preconditioner CSR col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaDoubleComplex* trisystems |
[out] | rhs | magmaDoubleComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmprepare_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix | L, | ||
magma_z_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaDoubleComplex * | trisystems, | ||
magmaDoubleComplex * | rhs, | ||
magma_queue_t | queue ) |
This routine prepares the batch of small triangular systems that need to be solved for computing the ISAI preconditioner.
[in] | uplotype | magma_uplo_t input matrix |
[in] | transtype | magma_trans_t input matrix |
[in] | diagtype | magma_diag_t input matrix |
[in] | L | magma_z_matrix triangular factor for which the ISAI matrix is computed. Col-Major CSR storage. |
[in] | LC | magma_z_matrix sparsity pattern of the ISAI matrix. Col-Major CSR storage. |
[in,out] | sizes | magma_index_t* array containing the sizes of the small triangular systems |
[in,out] | locations | magma_index_t* array containing the locations in the respective column of L |
[in,out] | trisystems | magmaDoubleComplex* batch of generated small triangular systems. All systems are embedded in uniform memory blocks of size BLOCKSIZE x BLOCKSIZE |
[in,out] | rhs | magmaDoubleComplex* RHS of the small triangular systems |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmtrisolve_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix | L, | ||
magma_z_matrix | LC, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaDoubleComplex * | trisystems, | ||
magmaDoubleComplex * | rhs, | ||
magma_queue_t | queue ) |
Does all triangular solves.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_z_matrix Matrix in CSR format |
[in] | LC | magma_z_matrix same matrix, also CSR, but col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaDoubleComplex* trisystems |
[out] | rhs | magmaDoubleComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmbackinsert_batched_gpu | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix * | M, | ||
magma_index_t * | sizes, | ||
magma_index_t * | locations, | ||
magmaDoubleComplex * | trisystems, | ||
magmaDoubleComplex * | rhs, | ||
magma_queue_t | queue ) |
Inserts the values into the preconditioner matrix.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in,out] | M | magma_z_matrix* SPAI preconditioner CSR col-major |
[out] | sizes | magma_index_t* Number of Elements that are replaced. |
[out] | locations | magma_index_t* Array indicating the locations. |
[out] | trisystems | magmaDoubleComplex* trisystems |
[out] | rhs | magmaDoubleComplex* right-hand sides |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ziluisaisetup_lower | ( | magma_z_matrix | L, |
magma_z_matrix | S, | ||
magma_z_matrix * | ISAIL, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the lower triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | L | magma_z_matrix lower triangular factor |
[in] | S | magma_z_matrix pattern for the ISAI preconditioner for L |
[out] | ISAIL | magma_z_matrix* ISAI preconditioner for L |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ziluisaisetup_upper | ( | magma_z_matrix | U, |
magma_z_matrix | S, | ||
magma_z_matrix * | ISAIU, | ||
magma_queue_t | queue ) |
Prepares Incomplete LU preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This routine only handles the upper triangular part. The return value is 0 in case of success, and Magma_CUSOLVE if the pattern is too large to be handled.
[in] | U | magma_z_matrix upper triangular factor |
[in] | S | magma_z_matrix pattern for the ISAI preconditioner for U |
[out] | ISAIU | magma_z_matrix* ISAI preconditioner for U |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zicisaisetup | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Prepares Incomplete Cholesky preconditioner using a sparse approximate inverse instead of sparse triangular solves.
This is the symmetric variant of zgeisai.cpp.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zisai_l | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
[in] | b | magma_z_matrix input RHS b |
[in,out] | x | magma_z_matrix solution x |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zisai_r | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
[in] | b | magma_z_matrix input RHS b |
[in,out] | x | magma_z_matrix solution x |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zisai_l_t | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Left-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_z_matrix input RHS b |
[in,out] | x | magma_z_matrix solution x |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zisai_r_t | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Right-hand-side application of ISAI preconditioner.
Transpose.
[in] | b | magma_z_matrix input RHS b |
[in,out] | x | magma_z_matrix solution x |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmiluisai_sizecheck | ( | magma_z_matrix | A, |
magma_index_t | batchsize, | ||
magma_index_t * | maxsize, | ||
magma_queue_t | queue ) |
magma_int_t magma_zgeisai_maxblock | ( | magma_z_matrix | L, |
magma_z_matrix * | MT, | ||
magma_queue_t | queue ) |
This routine maximizes the pattern for the ISAI preconditioner.
Precisely, it computes L, L^2, L^3, L^4, L^5 and then selects the columns of M_L such that the nonzeros per column are the maximum that stays below the implementation-specific limit (32).
The input is the original matrix (row-major). The output is already col-major.
[in,out] | L | magma_z_matrix Incomplete factor. |
[in,out] | MT | magma_z_matrix* SPAI preconditioner structure, CSR col-major. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zisai_generator_regs | ( | magma_uplo_t | uplotype, |
magma_trans_t | transtype, | ||
magma_diag_t | diagtype, | ||
magma_z_matrix | L, | ||
magma_z_matrix * | M, | ||
magma_queue_t | queue ) |
This routine is designed to combine all kernels into one.
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in] | transtype | magma_trans_t possibility for transposed matrix |
[in] | diagtype | magma_diag_t unit diagonal or not |
[in] | L | magma_z_matrix triangular factor for which the ISAI matrix is computed. Col-Major CSR storage. |
[in,out] | M | magma_z_matrix* SPAI preconditioner CSR col-major |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmsupernodal | ( | magma_int_t * | max_bs, |
magma_z_matrix | A, | ||
magma_z_matrix * | S, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with block-size bs.
[in,out] | max_bs | magma_int_t* Size of the largest diagonal block. |
[in] | A | magma_z_matrix System matrix. |
[in,out] | S | magma_z_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmvarsizeblockstruct | ( | magma_int_t | n, |
magma_int_t * | bs, | ||
magma_int_t | bsl, | ||
magma_uplo_t | uplotype, | ||
magma_z_matrix * | A, | ||
magma_queue_t | queue ) |
Generates a block-diagonal sparsity pattern with variable block-size.
[in] | n | magma_int_t Size of the matrix. |
[in] | bs | magma_int_t* Vector containing the size of the diagonal blocks. |
[in] | bsl | magma_int_t Size of the vector containing the block sizes. |
[in] | uplotype | magma_uplo_t lower or upper triangular |
[in,out] | A | magma_z_matrix* Generated sparsity pattern matrix. |
[in] | queue | magma_queue_t Queue to execute in. |
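The block-diagonal pattern described above can be sketched on the CPU as follows. This is an illustration only: the name `block_diag_pattern`, the plain `int` CSR arrays, the requirement that the block sizes sum to `n`, and the choice to emit full blocks (ignoring `uplotype`) are assumptions, not MAGMA behavior.

```c
#include <assert.h>

/* Hypothetical helper (not part of MAGMA).
   Builds the CSR pattern of a block-diagonal n-by-n matrix whose k-th
   diagonal block is bs[k]-by-bs[k] and fully populated.
   rowptr must have room for n+1 entries, colidx for sum_k bs[k]^2 entries.
   Returns the number of pattern entries, or -1 if the sizes do not sum to n. */
int block_diag_pattern(int n, const int *bs, int bsl, int *rowptr, int *colidx)
{
    assert(n >= 0 && bsl >= 0);
    int total = 0;
    for (int k = 0; k < bsl; ++k) total += bs[k];
    if (total != n) return -1;

    int nnz = 0, start = 0;              /* start = first row/column of block k */
    for (int k = 0; k < bsl; ++k) {
        for (int i = start; i < start + bs[k]; ++i) {
            rowptr[i] = nnz;
            for (int j = start; j < start + bs[k]; ++j)
                colidx[nnz++] = j;       /* every column inside the block */
        }
        start += bs[k];
    }
    rowptr[n] = nnz;
    return nnz;
}
```

For `n = 3` and block sizes `{2, 1}`, this yields one 2-by-2 block followed by one 1-by-1 block, five pattern entries in total.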
magma_int_t magma_ztfqmr_unrolled | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a complex matrix.
This is a GPU implementation of the transpose-free Quasi-Minimal Residual method (TFQMR).
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbicgstab_merge2 | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. The difference to magma_zbicgstab is that we use specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbicgstab_merge3 | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a general matrix.
This is a GPU implementation of the Biconjugate Gradient Stabilized method. The difference to magma_zbicgstab is that we use specifically designed kernels merging multiple operations into one kernel.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zjacobidomainoverlap | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix.
This is a GPU implementation of the Jacobi method allowing for domain overlap.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbaiter | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_z_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | precond_par | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbaiter_overlap | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_z_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * x = b via the block asynchronous iteration method on GPU.
It uses restricted additive Schwarz overlap in top-down direction.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | precond_par | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zftjacobicontractions | ( | magma_z_matrix | xkm2, |
magma_z_matrix | xkm1, | ||
magma_z_matrix | xk, | ||
magma_z_matrix * | z, | ||
magma_z_matrix * | c, | ||
magma_queue_t | queue ) |
Computes the contraction coefficients c_i:
c_i = z_i^{k-1} / z_i^{k}
= | x_i^{k-1} - x_i^{k-2} | / | x_i^{k} - x_i^{k-1} |
[in] | xkm2 | magma_z_matrix vector x^{k-2} |
[in] | xkm1 | magma_z_matrix vector x^{k-1} |
[in] | xk | magma_z_matrix vector x^{k} |
[out] | z | magma_z_matrix* ratio |
[out] | c | magma_z_matrix* contraction coefficients |
[in] | queue | magma_queue_t Queue to execute in. |
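The formula for the contraction coefficients can be sketched in plain C. This is a real-valued CPU illustration, not the GPU kernel: the name `contraction_coefficients` is hypothetical, and storing z_i = | x_i^{k} - x_i^{k-1} | in `z` is an assumption about what the output vector holds.

```c
#include <assert.h>

/* Absolute value without pulling in the math library. */
static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Hypothetical reference routine (not part of MAGMA).
   For each component i:
     z[i] = | xk[i]   - xkm1[i] |
     c[i] = | xkm1[i] - xkm2[i] | / z[i]
   The caller must make sure that no z[i] is zero. */
void contraction_coefficients(int n, const double *xkm2, const double *xkm1,
                              const double *xk, double *z, double *c)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        double z_prev = abs_d(xkm1[i] - xkm2[i]);  /* z_i^{k-1} */
        z[i] = abs_d(xk[i] - xkm1[i]);             /* z_i^{k}   */
        c[i] = z_prev / z[i];
    }
}
```

With this definition, a coefficient above 1 means the last update was smaller than the one before it, which is what a converging Jacobi iteration is expected to show.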
magma_int_t magma_zftjacobiupdatecheck | ( | double | delta, |
magma_z_matrix * | xold, | ||
magma_z_matrix * | xnew, | ||
magma_z_matrix * | zprev, | ||
magma_z_matrix | c, | ||
magma_int_t * | flag_t, | ||
magma_int_t * | flag_fp, | ||
magma_queue_t | queue ) |
Checks the Jacobi updates according to the condition in the ScaLA'15 paper.
[in] | delta | double threshold |
[in,out] | xold | magma_z_matrix* vector xold |
[in,out] | xnew | magma_z_matrix* vector xnew |
[in,out] | zprev | magma_z_matrix* vector z = | x_k-1 - x_k | |
[in] | c | magma_z_matrix contraction coefficients |
[in,out] | flag_t | magma_int_t* threshold condition |
[in,out] | flag_fp | magma_int_t* false positive condition |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_ziterref | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_z_preconditioner * | precond_par, | ||
magma_queue_t | queue ) |
Solves a system of linear equations A * X = B where A is a complex Hermitian N-by-N positive definite matrix.
This is a GPU implementation of the Iterative Refinement method. The inner solver is passed via the preconditioner argument.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix RHS b |
[in,out] | x | magma_z_matrix* solution approximation |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in,out] | precond_par | magma_z_preconditioner* inner solver |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zjacobiiter_sys | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix | d, | ||
magma_z_matrix | t, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
This routine takes the system matrix A and the RHS b as input.
[in] | A | magma_z_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_z_matrix input RHS b |
[in] | d | magma_z_matrix input matrix diagonal elements diag(A) |
[in] | t | magma_z_matrix temporary vector |
[in,out] | x | magma_z_matrix* iteration vector x |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
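A single sweep of the Jacobi iteration x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k can be sketched as below. This is a dense, real-valued CPU illustration under stated assumptions: the name `jacobi_sweep` and the row-major dense storage are hypothetical and chosen for brevity, whereas the routine above operates on magma_z_matrix data on the GPU.

```c
#include <assert.h>

/* Hypothetical reference routine (not part of MAGMA).
   A is a dense n-by-n matrix in row-major order with nonzero diagonal.
   Computes x_new = D^(-1) * (b - (L+U) * x_old), where D is the diagonal
   of A and L+U holds the off-diagonal entries. */
void jacobi_sweep(int n, const double *A, const double *b,
                  const double *x_old, double *x_new)
{
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        double sum = b[i];
        for (int j = 0; j < n; ++j)
            if (j != i) sum -= A[i * n + j] * x_old[j];  /* off-diagonal part */
        x_new[i] = sum / A[i * n + i];                   /* apply D^(-1) */
    }
}
```

Calling the function repeatedly while swapping `x_old` and `x_new` gives the full iteration; every component of `x_new` depends only on `x_old`, which is why the method parallelizes well.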
magma_int_t magma_zftjacobi | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_z_solver_par * | solver_par, | ||
magma_queue_t | queue ) |
Iterates the solution approximation according to x^(k+1) = D^(-1) * b - D^(-1) * (L+U) * x^k, i.e. x^(k+1) = c - M * x^k with c = D^(-1) * b and M = D^(-1) * (L+U).
This routine takes the system matrix A and the RHS b as input. This is the fault-tolerant version of Jacobi according to ScaLA'15.
[in] | A | magma_z_matrix input matrix M = D^(-1) * (L+U) |
[in] | b | magma_z_matrix input RHS b |
[in,out] | x | magma_z_matrix* iteration vector x |
[in,out] | solver_par | magma_z_solver_par* solver parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zilut_saad | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_zilut_saad_apply | ( | magma_z_matrix | b, |
magma_z_matrix * | x, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
magma_int_t magma_zcustomilusetup | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete LU preconditioner.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zcustomicsetup | ( | magma_z_matrix | A, |
magma_z_matrix | b, | ||
magma_z_preconditioner * | precond, | ||
magma_queue_t | queue ) |
Reads in an Incomplete Cholesky preconditioner.
[in] | A | magma_z_matrix input matrix A |
[in] | b | magma_z_matrix input RHS b |
[in,out] | precond | magma_z_preconditioner* preconditioner parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbajac_csr | ( | magma_int_t | localiters, |
magma_z_matrix | D, | ||
magma_z_matrix | R, | ||
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration performing localiters local Jacobi updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | D | magma_z_matrix input matrix with diagonal blocks |
[in] | R | magma_z_matrix input matrix with non-diagonal parts |
[in] | b | magma_z_matrix RHS |
[in,out] | x | magma_z_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbajac_csr_overlap | ( | magma_int_t | localiters, |
magma_int_t | matrices, | ||
magma_int_t | overlap, | ||
magma_z_matrix * | D, | ||
magma_z_matrix * | R, | ||
magma_z_matrix | b, | ||
magma_z_matrix * | x, | ||
magma_queue_t | queue ) |
This routine is a block-asynchronous Jacobi iteration with directed restricted additive Schwarz overlap (top-down) performing localiters local Jacobi updates within the block.
Input format is two CSR matrices, one containing the diagonal blocks, one containing the rest.
[in] | localiters | magma_int_t number of local Jacobi-like updates |
[in] | matrices | magma_int_t number of sub-matrices |
[in] | overlap | magma_int_t size of the overlap |
[in] | D | magma_z_matrix* set of matrices with diagonal blocks |
[in] | R | magma_z_matrix* set of matrices with non-diagonal parts |
[in] | b | magma_z_matrix RHS |
[in,out] | x | magma_z_matrix* iterate/solution |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zmlumerge | ( | magma_z_matrix | L, |
magma_z_matrix | U, | ||
magma_z_matrix * | A, | ||
magma_queue_t | queue ) |
Takes a strictly lower triangular matrix L and an upper triangular matrix U and merges them into a matrix A containing the upper and lower triangular parts.
[in] | L | magma_z_matrix input strictly lower triangular matrix L |
[in] | U | magma_z_matrix input upper triangular matrix U |
[out] | A | magma_z_matrix* output matrix |
[in] | queue | magma_queue_t Queue to execute in. |
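The merge of a strictly lower triangular L with an upper triangular U can be sketched for CSR storage as follows. This is a real-valued CPU illustration: the name `lu_merge_csr`, the raw-array interface, and the caller-allocated output are assumptions; the actual routine works on magma_z_matrix structures.

```c
#include <assert.h>

/* Hypothetical reference routine (not part of MAGMA).
   L is strictly lower triangular, U is upper triangular, both n-by-n in CSR.
   The caller provides a_rowptr with n+1 entries and a_colidx / a_val with
   room for l_rowptr[n] + u_rowptr[n] entries.
   If each input row is sorted by column, each output row is sorted too,
   because all columns of L in row i are < i and all columns of U are >= i. */
void lu_merge_csr(int n,
                  const int *l_rowptr, const int *l_colidx, const double *l_val,
                  const int *u_rowptr, const int *u_colidx, const double *u_val,
                  int *a_rowptr, int *a_colidx, double *a_val)
{
    assert(n >= 0);
    int nnz = 0;
    for (int i = 0; i < n; ++i) {
        a_rowptr[i] = nnz;
        for (int p = l_rowptr[i]; p < l_rowptr[i + 1]; ++p) {  /* lower part */
            a_colidx[nnz] = l_colidx[p];
            a_val[nnz++] = l_val[p];
        }
        for (int p = u_rowptr[i]; p < u_rowptr[i + 1]; ++p) {  /* upper part */
            a_colidx[nnz] = u_colidx[p];
            a_val[nnz++] = u_val[p];
        }
    }
    a_rowptr[n] = nnz;
}
```

Because L is strictly lower triangular, no diagonal entry appears twice and the output needs no deduplication.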
magma_int_t magma_zgeaxpy | ( | magmaDoubleComplex | alpha, |
magma_z_matrix | X, | ||
magmaDoubleComplex | beta, | ||
magma_z_matrix * | Y, | ||
magma_queue_t | queue ) |
This routine computes Y = alpha * X + beta * Y on the GPU.
The input format is magma_z_matrix. It handles both dense matrices (vector blocks) and CSR matrices. For the latter, it interfaces the cuSPARSE library.
[in] | alpha | magmaDoubleComplex scalar multiplier. |
[in] | X | magma_z_matrix input matrix X. |
[in] | beta | magmaDoubleComplex scalar multiplier. |
[in,out] | Y | magma_z_matrix* input/output matrix Y. |
[in] | queue | magma_queue_t Queue to execute in. |
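For the dense (vector block) case, the operation Y = alpha * X + beta * Y can be sketched element-wise on the CPU. The struct `cplx` is a hypothetical stand-in for magmaDoubleComplex, and `geaxpy_ref` is an illustrative name; the CSR path through cuSPARSE is not covered by this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for a double-precision complex number. */
typedef struct { double re, im; } cplx;

static cplx c_mul(cplx a, cplx b)
{
    cplx r = { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
    return r;
}

static cplx c_add(cplx a, cplx b)
{
    cplx r = { a.re + b.re, a.im + b.im };
    return r;
}

/* Hypothetical reference routine (not part of MAGMA).
   Overwrites y[i] with alpha * x[i] + beta * y[i] for i = 0..n-1. */
void geaxpy_ref(int n, cplx alpha, const cplx *x, cplx beta, cplx *y)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
        y[i] = c_add(c_mul(alpha, x[i]), c_mul(beta, y[i]));
}
```

Note that X is read-only and Y is updated in place, which matches the [in] and [in,out] roles of the two matrix arguments.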
magma_int_t magma_zgecsrreimsplit | ( | magma_z_matrix | A, |
magma_z_matrix * | ReA, | ||
magma_z_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in CSR format and located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_z_matrix input matrix A. |
[out] | ReA | magma_z_matrix* output matrix containing real contributions. |
[out] | ImA | magma_z_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgedensereimsplit | ( | magma_z_matrix | A, |
magma_z_matrix * | ReA, | ||
magma_z_matrix * | ImA, | ||
magma_queue_t | queue ) |
This routine takes an input matrix A in DENSE format and located on the GPU and splits it into two matrices ReA and ImA containing the real and the imaginary contributions of A.
The output matrices are allocated within the routine.
[in] | A | magma_z_matrix input matrix A. |
[out] | ReA | magma_z_matrix* output matrix containing real contributions. |
[out] | ImA | magma_z_matrix* output matrix containing imaginary contributions. |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zgecsr5mv | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magma_int_t | p, | ||
magmaDoubleComplex | alpha, | ||
magma_int_t | sigma, | ||
magma_int_t | bit_y_offset, | ||
magma_int_t | bit_scansum_offset, | ||
magma_int_t | num_packet, | ||
magmaUIndex_ptr | dtile_ptr, | ||
magmaUIndex_ptr | dtile_desc, | ||
magmaIndex_ptr | dtile_desc_offset_ptr, | ||
magmaIndex_ptr | dtile_desc_offset, | ||
magmaDoubleComplex_ptr | dcalibrator, | ||
magma_int_t | tail_tile_start, | ||
magmaDoubleComplex_ptr | dval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
The input format is CSR5 (val (tile-wise column-major), row_pointer, col (tile-wise column-major), tile_pointer, tile_desc).
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | p | magma_int_t number of tiles in A |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | sigma | magma_int_t sigma in A in CSR5 |
[in] | bit_y_offset | magma_int_t bit_y_offset in A in CSR5 |
[in] | bit_scansum_offset | magma_int_t bit_scansum_offset in A in CSR5 |
[in] | num_packet | magma_int_t num_packet in A in CSR5 |
[in] | dtile_ptr | magmaUIndex_ptr tilepointer of A in CSR5 |
[in] | dtile_desc | magmaUIndex_ptr tiledescriptor of A in CSR5 |
[in] | dtile_desc_offset_ptr | magmaIndex_ptr tiledescriptor_offsetpointer of A in CSR5 |
[in] | dtile_desc_offset | magmaIndex_ptr tiledescriptor_offset of A in CSR5 |
[in] | dcalibrator | magmaDoubleComplex_ptr calibrator of A in CSR5 |
[in] | tail_tile_start | magma_int_t start of the last tile in A |
[in] | dval | magmaDoubleComplex_ptr array containing values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zcopyscale | ( | magma_int_t | n, |
magma_int_t | k, | ||
magmaDoubleComplex_ptr | r, | ||
magmaDoubleComplex_ptr | v, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Computes the correction term of the pipelined GMRES according to P. Ghysels, then scales and copies the new search direction.
Returns the vector v = r / ( skp[k] - (sum_{i=1}^k skp[i]^2) ).
[in] | n | int length of v_i |
[in] | k | int |
[in] | r | magmaDoubleComplex_ptr vector of length n |
[in] | v | magmaDoubleComplex_ptr vector of length n |
[in] | skp | magmaDoubleComplex_ptr array of parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_dznrm2scale | ( | magma_int_t | m, |
magmaDoubleComplex_ptr | dr, | ||
magma_int_t | lddr, | ||
magmaDoubleComplex * | drnorm, | ||
magma_queue_t | queue ) |
magma_int_t magma_zjacobispmvupdate_bw | ( | magma_int_t | maxiter, |
magma_z_matrix | A, | ||
magma_z_matrix | t, | ||
magma_z_matrix | b, | ||
magma_z_matrix | d, | ||
magma_z_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - Ax).
This kernel processes the thread blocks in reversed order.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | A | magma_z_matrix system matrix |
[in] | t | magma_z_matrix workspace |
[in] | b | magma_z_matrix RHS b |
[in] | d | magma_z_matrix vector with diagonal entries |
[out] | x | magma_z_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zjacobispmvupdateselect | ( | magma_int_t | maxiter, |
magma_int_t | num_updates, | ||
magma_index_t * | indices, | ||
magma_z_matrix | A, | ||
magma_z_matrix | t, | ||
magma_z_matrix | b, | ||
magma_z_matrix | d, | ||
magma_z_matrix | tmp, | ||
magma_z_matrix * | x, | ||
magma_queue_t | queue ) |
Updates the iteration vector x for the Jacobi iteration according to x = x + d .* (b - Ax).
This kernel allows for overlapping domains: the indices-array contains the locations that are updated. Locations may be repeated to simulate overlapping domains.
[in] | maxiter | magma_int_t number of Jacobi iterations |
[in] | num_updates | magma_int_t number of updates - length of the indices array |
[in] | indices | magma_index_t* indices, which entries of x to update |
[in] | A | magma_z_matrix system matrix |
[in] | t | magma_z_matrix workspace |
[in] | b | magma_z_matrix RHS b |
[in] | d | magma_z_matrix vector with diagonal entries |
[in] | tmp | magma_z_matrix workspace |
[out] | x | magma_z_matrix* iteration vector |
[in] | queue | magma_queue_t Queue to execute in. |
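The selective update can be sketched on the CPU for a real-valued CSR matrix. This is an illustration under several assumptions: the name `jacobi_update_select` is hypothetical, the formula x = x + d .* (b - Ax) is applied literally with `d` holding whatever per-row scaling the caller supplies (for a Jacobi iteration one would expect the reciprocal diagonal of A), and the entries are processed sequentially and in place, which a GPU kernel would not do.

```c
#include <assert.h>

/* Hypothetical reference routine (not part of MAGMA).
   Performs one pass over the index list. For each listed row i it computes
   the residual r_i = b_i - (A x)_i with the current x and applies
   x_i += d_i * r_i. An index that appears more than once is updated
   more than once. */
void jacobi_update_select(int num_updates, const int *indices,
                          const int *rowptr, const int *colidx, const double *val,
                          const double *b, const double *d, double *x)
{
    assert(num_updates >= 0);
    for (int u = 0; u < num_updates; ++u) {
        int i = indices[u];
        double r = b[i];
        for (int p = rowptr[i]; p < rowptr[i + 1]; ++p)
            r -= val[p] * x[colidx[p]];   /* subtract row i of A times x */
        x[i] += d[i] * r;
    }
}
```

Repeating an index in the list is how this sketch mimics an overlapping domain: the same entry of x receives an additional update within one pass.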
magma_int_t magma_zmergeblockkrylov | ( | magma_int_t | num_rows, |
magma_int_t | num_cols, | ||
magmaDoubleComplex_ptr | alpha, | ||
magmaDoubleComplex_ptr | p, | ||
magmaDoubleComplex_ptr | x, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
v = y / rho y = y / rho w = wt / psi z = z / psi
[in] | num_rows | magma_int_t dimension m |
[in] | num_cols | magma_int_t dimension n |
[in] | alpha | magmaDoubleComplex_ptr matrix containing all SKP |
[in] | p | magmaDoubleComplex_ptr search directions |
[in,out] | x | magmaDoubleComplex_ptr approximation vector |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbicgmerge1 | ( | magma_int_t | n, |
magmaDoubleComplex_ptr | skp, | ||
magmaDoubleComplex_ptr | v, | ||
magmaDoubleComplex_ptr | r, | ||
magmaDoubleComplex_ptr | p, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
p = beta*p; p = p - omega*beta*v; p = p + r
-> p = r + beta * ( p - omega * v )
[in] | n | int dimension n |
[in] | skp | magmaDoubleComplex_ptr set of scalar parameters |
[in] | v | magmaDoubleComplex_ptr input vector v |
[in] | r | magmaDoubleComplex_ptr input vector r |
[in,out] | p | magmaDoubleComplex_ptr input/output vector p |
[in] | queue | magma_queue_t queue to execute in. |
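To make the merge concrete, the following sketch compares the three separate vector passes with the single fused pass. Both function names are hypothetical, real `double` values replace `magmaDoubleComplex`, and beta and omega are passed as explicit arguments because the layout of the skp array is not specified here.

```c
/* Three separate passes over p, as listed in the description. */
void p_update_unmerged(int n, double beta, double omega,
                       const double *v, const double *r, double *p)
{
    for (int i = 0; i < n; ++i) p[i] = beta * p[i];
    for (int i = 0; i < n; ++i) p[i] = p[i] - omega * beta * v[i];
    for (int i = 0; i < n; ++i) p[i] = p[i] + r[i];
}

/* One fused pass: p = r + beta * (p - omega * v). */
void p_update_merged(int n, double beta, double omega,
                     const double *v, const double *r, double *p)
{
    for (int i = 0; i < n; ++i)
        p[i] = r[i] + beta * (p[i] - omega * v[i]);
}
```

The fused form reads and writes p once instead of three times, which is the motivation for merging such memory-bound operations into one kernel.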
magma_int_t magma_zbicgmerge2 | ( | magma_int_t | n, |
magmaDoubleComplex_ptr | skp, | ||
magmaDoubleComplex_ptr | r, | ||
magmaDoubleComplex_ptr | v, | ||
magmaDoubleComplex_ptr | s, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
s = r; s = s - alpha*v
-> s = r - alpha * v
[in] | n | int dimension n |
[in] | skp | magmaDoubleComplex_ptr set of scalar parameters |
[in] | r | magmaDoubleComplex_ptr input vector r |
[in] | v | magmaDoubleComplex_ptr input vector v |
[out] | s | magmaDoubleComplex_ptr output vector s |
[in] | queue | magma_queue_t queue to execute in. |
magma_int_t magma_zbicgmerge3 | ( | magma_int_t | n, |
magmaDoubleComplex_ptr | skp, | ||
magmaDoubleComplex_ptr | p, | ||
magmaDoubleComplex_ptr | s, | ||
magmaDoubleComplex_ptr | t, | ||
magmaDoubleComplex_ptr | x, | ||
magmaDoubleComplex_ptr | r, | ||
magma_queue_t | queue ) |
Merges multiple operations into one kernel:
x = x + alpha*p; x = x + omega*s; r = s; r = r - omega*t
-> x = x + alpha * p + omega * s
-> r = s - omega * t
[in] | n | int dimension n |
[in] | skp | magmaDoubleComplex_ptr set of scalar parameters |
[in] | p | magmaDoubleComplex_ptr input p |
[in] | s | magmaDoubleComplex_ptr input s |
[in] | t | magmaDoubleComplex_ptr input t |
[in,out] | x | magmaDoubleComplex_ptr input/output x |
[in,out] | r | magmaDoubleComplex_ptr input/output r |
[in] | queue | magma_queue_t Queue to execute in. |
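The two merged formulas can be evaluated in a single loop, as in this sketch. The function name `xr_update_merged` is hypothetical, real `double` values replace `magmaDoubleComplex`, and alpha and omega are explicit arguments rather than entries of skp (whose layout is not given here).

```c
/* Hypothetical CPU reference for the fused update
 *   x = x + alpha * p + omega * s
 *   r = s - omega * t
 * Each input element is loaded once per index. */
void xr_update_merged(int n, double alpha, double omega,
                      const double *p, const double *s, const double *t,
                      double *x, double *r)
{
    for (int i = 0; i < n; ++i) {
        x[i] = x[i] + alpha * p[i] + omega * s[i];
        r[i] = s[i] - omega * t[i];
    }
}
```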
magma_int_t magma_zbicgmerge4 | ( | magma_int_t | type, |
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Performs some parameter operations for the BiCGSTAB with scalars on GPU.
[in] | type | int kernel type |
[in,out] | skp | magmaDoubleComplex_ptr vector with parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbicgmerge_spmv1 | ( | magma_z_matrix | A, |
magmaDoubleComplex_ptr | d1, | ||
magmaDoubleComplex_ptr | d2, | ||
magmaDoubleComplex_ptr | dp, | ||
magmaDoubleComplex_ptr | dr, | ||
magmaDoubleComplex_ptr | dv, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the first SpmV using CSR with the dot product and the computation of alpha.
[in] | A | magma_z_matrix system matrix |
[in] | d1 | magmaDoubleComplex_ptr temporary vector |
[in] | d2 | magmaDoubleComplex_ptr temporary vector |
[in] | dp | magmaDoubleComplex_ptr input vector p |
[in] | dr | magmaDoubleComplex_ptr input vector r |
[out] | dv | magmaDoubleComplex_ptr output vector v |
[in,out] | skp | magmaDoubleComplex_ptr array for parameters ( skp[0]=alpha ) |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbicgmerge_spmv2 | ( | magma_z_matrix | A, |
magmaDoubleComplex_ptr | d1, | ||
magmaDoubleComplex_ptr | d2, | ||
magmaDoubleComplex_ptr | ds, | ||
magmaDoubleComplex_ptr | dt, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | A | magma_z_matrix input matrix |
[in] | d1 | magmaDoubleComplex_ptr temporary vector |
[in] | d2 | magmaDoubleComplex_ptr temporary vector |
[in] | ds | magmaDoubleComplex_ptr input vector s |
[out] | dt | magmaDoubleComplex_ptr output vector t |
[in,out] | skp | magmaDoubleComplex_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbicgmerge_xrbeta | ( | magma_int_t | n, |
magmaDoubleComplex_ptr | d1, | ||
magmaDoubleComplex_ptr | d2, | ||
magmaDoubleComplex_ptr | rr, | ||
magmaDoubleComplex_ptr | r, | ||
magmaDoubleComplex_ptr | p, | ||
magmaDoubleComplex_ptr | s, | ||
magmaDoubleComplex_ptr | t, | ||
magmaDoubleComplex_ptr | x, | ||
magmaDoubleComplex_ptr | skp, | ||
magma_queue_t | queue ) |
Merges the second SpmV using CSR with the dot product and the computation of omega.
[in] | n | int dimension n |
[in] | d1 | magmaDoubleComplex_ptr temporary vector |
[in] | d2 | magmaDoubleComplex_ptr temporary vector |
[in] | rr | magmaDoubleComplex_ptr input vector rr |
[in,out] | r | magmaDoubleComplex_ptr input/output vector r |
[in] | p | magmaDoubleComplex_ptr input vector p |
[in] | s | magmaDoubleComplex_ptr input vector s |
[in] | t | magmaDoubleComplex_ptr input vector t |
[out] | x | magmaDoubleComplex_ptr output vector x |
[in] | skp | magmaDoubleComplex_ptr array for parameters |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbcsrswp | ( | magma_int_t | n, |
magma_int_t | size_b, | ||
magma_int_t * | ipiv, | ||
magmaDoubleComplex_ptr | dx, | ||
magma_queue_t | queue ) |
magma_int_t magma_zbcsrtrsv | ( | magma_uplo_t | uplo, |
magma_int_t | r_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magmaDoubleComplex_ptr | A, | ||
magma_index_t * | blockinfo, | ||
magmaDoubleComplex_ptr | x, | ||
magma_queue_t | queue ) |
For a Block-CSR ILU factorization, this routine performs the triangular solves.
[in] | uplo | magma_uplo_t upper/lower fill structure |
[in] | r_blocks | magma_int_t number of blocks in row |
[in] | c_blocks | magma_int_t number of blocks in column |
[in] | size_b | magma_int_t blocksize in BCSR |
[in] | A | magmaDoubleComplex_ptr upper/lower factor |
[in] | blockinfo | magma_index_t* array containing matrix information |
[in,out] | x | magmaDoubleComplex_ptr input/output vector x |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbcsrvalcpy | ( | magma_int_t | size_b, |
magma_int_t | num_blocks, | ||
magma_int_t | num_zblocks, | ||
magmaDoubleComplex_ptr * | Aval, | ||
magmaDoubleComplex_ptr * | Bval, | ||
magmaDoubleComplex_ptr * | Bval2, | ||
magma_queue_t | queue ) |
For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.
[in] | size_b | magma_int_t blocksize in BCSR |
[in] | num_blocks | magma_int_t number of nonzero blocks |
[in] | num_zblocks | magma_int_t number of zero-blocks (will later be filled) |
[in] | Aval | magmaDoubleComplex_ptr * pointers to the nonzero blocks in A |
[in] | Bval | magmaDoubleComplex_ptr * pointers to the nonzero blocks in B |
[in] | Bval2 | magmaDoubleComplex_ptr * pointers to the zero blocks in B |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbcsrluegemm | ( | magma_int_t | size_b, |
magma_int_t | num_brows, | ||
magma_int_t | kblocks, | ||
magmaDoubleComplex_ptr * | dA, | ||
magmaDoubleComplex_ptr * | dB, | ||
magmaDoubleComplex_ptr * | dC, | ||
magma_queue_t | queue ) |
For a Block-CSR ILU factorization, this routine updates all blocks in the trailing matrix.
[in] | size_b | magma_int_t blocksize in BCSR |
[in] | num_brows | magma_int_t number of block rows |
[in] | kblocks | magma_int_t number of blocks in row |
[in] | dA | magmaDoubleComplex** input blocks of matrix A |
[in] | dB | magmaDoubleComplex** input blocks of matrix B |
[in] | dC | magmaDoubleComplex** output blocks of matrix C |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zbcsrlupivloc | ( | magma_int_t | size_b, |
magma_int_t | kblocks, | ||
magmaDoubleComplex_ptr * | dA, | ||
magma_int_t * | ipiv, | ||
magma_queue_t | queue ) |
magma_int_t magma_zbcsrblockinfo5 | ( | magma_int_t | lustep, |
magma_int_t | num_blocks, | ||
magma_int_t | c_blocks, | ||
magma_int_t | size_b, | ||
magma_index_t * | blockinfo, | ||
magmaDoubleComplex_ptr | val, | ||
magmaDoubleComplex_ptr * | AII, | ||
magma_queue_t | queue ) |
For a Block-CSR ILU factorization, this routine copies the filled blocks from the original matrix A and initializes the blocks that will later be filled in the factorization process with zeros.
[in] | lustep | magma_int_t lustep |
[in] | num_blocks | magma_int_t number of nonzero blocks |
[in] | c_blocks | magma_int_t number of column-blocks |
[in] | size_b | magma_int_t blocksize |
[in] | blockinfo | magma_index_t* array recording whether a block is filled and its location |
[in] | val | magmaDoubleComplex* pointers to the nonzero blocks in A |
[in] | AII | magmaDoubleComplex** pointers to the respective nonzero blocks in B |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_zthrsholdselect | ( | magma_int_t | sampling, |
magma_int_t | total_size, | ||
magma_int_t | subset_size, | ||
magmaDoubleComplex * | val, | ||
double * | thrs, | ||
magma_queue_t | queue ) |
This routine selects a threshold separating the subset_size smallest magnitude elements from the rest.
Approach: start a number of threads, each with its own pre-defined threshold. Every thread checks for each element whether it is smaller than its threshold. In the end a global reduction identifies the threshold whose count of smaller elements is closest to subset_size.
Assuming all values are in (0,1), the distinct thresholds are defined as:
threshold [ thread ] = thread / num_threads
The spacing between candidate thresholds is 1 / num_threads, so many threads must be launched to obtain a fine resolution.
[in] | sampling | magma_int_t determines how many elements are considered (approximate method) |
[in] | total_size | magma_int_t size of array val |
[in] | subset_size | magma_int_t number of smallest elements to separate |
[in] | val | magmaDoubleComplex* array containing the values |
[out] | thrs | double* computed threshold |
[in] | queue | magma_queue_t Queue to execute in. |
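The candidate-threshold idea can be emulated sequentially, with each loop iteration playing the role of one GPU thread. This is a sketch under stated assumptions: the function name `threshold_select_ref` and the parameter `num_candidates` are hypothetical, the input is an array of magnitudes already known to lie in (0,1), and the sampling parameter is not modelled.

```c
/* Hypothetical sequential emulation: candidate c uses the threshold
 * c / num_candidates, counts the magnitudes below it, and the candidate
 * whose count is closest to subset_size wins (first one on ties). */
double threshold_select_ref(int total_size, int subset_size,
                            const double *mag, int num_candidates)
{
    double best_thrs = 0.0;
    int best_gap = total_size + 1;   /* larger than any possible gap */
    for (int c = 0; c < num_candidates; ++c) {
        double thrs = (double)c / (double)num_candidates;
        int count = 0;
        for (int i = 0; i < total_size; ++i)
            if (mag[i] < thrs)
                ++count;
        int gap = count > subset_size ? count - subset_size
                                      : subset_size - count;
        if (gap < best_gap) {        /* stands in for the global reduction */
            best_gap = gap;
            best_thrs = thrs;
        }
    }
    return best_thrs;
}
```

The result is only as precise as the grid of candidates: two thresholds that fall between the same pair of data values are indistinguishable, and the returned value is always a multiple of 1 / num_candidates.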
magma_int_t magma_vector_zlag2c | ( | magma_z_matrix | x, |
magma_c_vector * | y, | ||
magma_queue_t | queue ) |
Converts a magma_z_matrix vector from Z to C.
[in] | x | magma_z_matrix input vector descriptor |
[out] | y | magma_c_vector* output vector descriptor |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparse_matrix_zlag2c | ( | magma_z_matrix | A, |
magma_c_sparse_matrix * | B, | ||
magma_queue_t | queue ) |
Converts a magma_z_matrix from Z to C.
[in] | A | magma_z_matrix input matrix descriptor |
[out] | B | magma_c_sparse_matrix* output matrix descriptor |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_vector_clag2z | ( | magma_c_vector | x, |
magma_z_matrix * | y, | ||
magma_queue_t | queue ) |
Converts a magma_c_vector from C to Z.
[in] | x | magma_c_vector input vector descriptor |
[out] | y | magma_z_matrix* output vector descriptor |
[in] | queue | magma_queue_t Queue to execute in. |
magma_int_t magma_sparse_matrix_clag2z | ( | magma_c_sparse_matrix | A, |
magma_z_matrix * | B, | ||
magma_queue_t | queue ) |
Converts a magma_c_sparse_matrix from C to Z.
[in] | A | magma_c_sparse_matrix input matrix descriptor |
[out] | B | magma_z_matrix* output matrix descriptor |
[in] | queue | magma_queue_t Queue to execute in. |
void magmablas_zlag2c_sparse | ( | magma_int_t | M, |
magma_int_t | N, | ||
magmaDoubleComplex_const_ptr | dA, | ||
magma_int_t | lda, | ||
magmaFloatComplex_ptr | dSA, | ||
magma_int_t | ldsa, | ||
magma_queue_t | queue, | ||
magma_int_t * | info ) |
void magmablas_clag2z_sparse | ( | magma_int_t | M, |
magma_int_t | N, | ||
magmaFloatComplex_const_ptr | dSA, | ||
magma_int_t | ldsa, | ||
magmaDoubleComplex_ptr | dA, | ||
magma_int_t | lda, | ||
magma_queue_t | queue, | ||
magma_int_t * | info ) |
void magma_zlag2c_CSR_DENSE | ( | magma_z_matrix | A, |
magma_c_matrix * | B, | ||
magma_queue_t | queue ) |
void magma_zlag2c_CSR_DENSE_alloc | ( | magma_z_matrix | A, |
magma_c_matrix * | B, | ||
magma_queue_t | queue ) |
void magma_zlag2c_CSR_DENSE_convert | ( | magma_z_matrix | A, |
magma_c_matrix * | B, | ||
magma_queue_t | queue ) |
magma_int_t magma_zcgecsrmv_mixed_prec | ( | magma_trans_t | transA, |
magma_int_t | m, | ||
magma_int_t | n, | ||
magmaDoubleComplex | alpha, | ||
magmaDoubleComplex_ptr | ddiagval, | ||
magmaFloatComplex_ptr | doffdiagval, | ||
magmaIndex_ptr | drowptr, | ||
magmaIndex_ptr | dcolind, | ||
magmaDoubleComplex_ptr | dx, | ||
magmaDoubleComplex | beta, | ||
magmaDoubleComplex_ptr | dy, | ||
magma_queue_t | queue ) |
This routine computes y = alpha * A * x + beta * y on the GPU.
A is a matrix in mixed precision, i.e. the diagonal values are stored in high precision, the offdiagonal values in low precision. The input format is a CSR (val, row, col) in FloatComplex storing all offdiagonal elements and an array containing the diagonal values in DoubleComplex.
[in] | transA | magma_trans_t transposition parameter for A |
[in] | m | magma_int_t number of rows in A |
[in] | n | magma_int_t number of columns in A |
[in] | alpha | magmaDoubleComplex scalar multiplier |
[in] | ddiagval | magmaDoubleComplex_ptr array containing diagonal values of A in DoubleComplex |
[in] | doffdiagval | magmaFloatComplex_ptr array containing offdiag values of A in CSR |
[in] | drowptr | magmaIndex_ptr rowpointer of A in CSR |
[in] | dcolind | magmaIndex_ptr columnindices of A in CSR |
[in] | dx | magmaDoubleComplex_ptr input vector x |
[in] | beta | magmaDoubleComplex scalar multiplier |
[in,out] | dy | magmaDoubleComplex_ptr input/output vector y |
[in] | queue | magma_queue_t Queue to execute in. |
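A CPU reference for the mixed-precision product may clarify the storage split. The sketch below is not the MAGMA kernel: the function name `csrmv_mixed_ref` is hypothetical, real `double` and `float` stand in for `magmaDoubleComplex` and `magmaFloatComplex`, the matrix is assumed square with exactly one diagonal entry per row, and the transposition parameter is not handled.

```c
/* Hypothetical CPU reference for y = alpha * A * x + beta * y, where the
 * diagonal of A is stored in double (diagval) and the off-diagonal
 * entries in float, in CSR form (offdiagval, rowptr, colind). */
void csrmv_mixed_ref(int m, double alpha,
                     const double *diagval, const float *offdiagval,
                     const int *rowptr, const int *colind,
                     const double *x, double beta, double *y)
{
    for (int i = 0; i < m; ++i) {
        /* diagonal contribution in full double precision */
        double sum = diagval[i] * x[i];
        /* off-diagonal entries are widened to double before accumulating */
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            sum += (double)offdiagval[k] * x[colind[k]];
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Accumulating in double means the only precision loss comes from the stored off-diagonal values themselves, while their memory footprint is halved.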