spmat
spmat
library¶
spmat
library helps the user to construct and manipulate distributed sparse matrices. It uses a distributed version of the CSR format, in which the rows are distributed across all the PEs in a cyclic way.
Suppose we have the following 4x4 matrix:
If there are 2PEs, since spmat
does a cyclic distribution, PE0 owns ROW0 and ROW2, PE1 owns ROW1 and ROW3. Note that the ROW numbers are global and it is convenient to use a local ROW number to access a local portion of the array on each PE. Specifically, a local row number can be computed by doing GLOBAL_ROW / nPEs
. Also, the owner of a global row can be identified by computing GLOBAL_ROW % nPEs
.
// 2PEs
GLOBAL ROW 0, PE0's ROW0: 1 0 1 0
GLOBAL ROW 1, PE1's ROW0: 0 1 0 1
GLOBAL ROW 2, PE0's ROW1: 1 0 1 0
GLOBAL ROW 3, PE1's ROW1: 0 0 1 1
A typical idiom for iterating over non-zeros in a specific local row i
of a sparse matrix A
is as follows:
for (int64_t j = A->loffset[i]; j < A->loffset[i+1]; j++) {
// visit each non-zero
int64_t nonzero = A->lnonzero[j];
}
Also, a typical idiom for iterating over remote non-zeros in a global row is as follows:
int64_t global_row = 1;
int64_t pe = global_row % shmem_n_pes(); // find the owner of "global_row"
int64_t i = global_row / shmem_n_pes(); // compute the local row number
int64_t start = shmem_int64_g(&A->loffset[i], pe);
int64_t end = shmem_int64_g(&A->loffset[i+1], pe);
for(int j = start; j < end; j++) {
int64_t nonzero = shmem_int64_g(&A->lnonzero[j], pe);
}
It is worth noting that, unlike the typical CSR format, nonzero
stores a global column number (0, 1, 2, 3 in our example, see below), which reduces memory size (c.f. the CSR format uses three arrays). For algorithms in which actual values need to be stored, bale3's spmat
allows the user to use A->value[j]
. The type of value[]
is double.
1 0 1 0 our representation 0 - 2 -
0 1 0 1 -----------------> - 1 - 3
1 0 1 0 -----------------> 0 - 2 -
0 0 1 1 - - 2 3
Now let us use a working example that reads the following matrix market format file, which is equivalent to the matrix above, load it into a sparsemat_t
object, and finally prints the contents of it on each PE:
Here is a OpenSHMEM/libget program that iterates over non-zeros on each PE and print it. Notice that the code uses all-to-all barrier (shmem_barrier_all()
or lgp_barrier()
) so PE0 first prints the contents, then PE1 does the printing:
Example output on 2PEs:
[PE0] localrow:0, lnonzero[0]: 0
[PE0] localrow:0, lnonzero[1]: 2
[PE0] localrow:1, lnonzero[2]: 0
[PE0] localrow:1, lnonzero[3]: 2
[PE1] localrow:0, lnonzero[0]: 1
[PE1] localrow:0, lnonzero[1]: 3
[PE1] localrow:1, lnonzero[2]: 2
[PE1] localrow:1, lnonzero[3]: 3
[PE0] non-zero at column 1 in PE1's local row 0
[PE0] non-zero at column 3 in PE1's local row 0
You can also use print_matrix(A)
from spmat
library:
Further Readings¶
- README.md for spmat https://github.com/jdevinney/bale/blob/master/src/bale_classic/spmat/README.md