1

I want to perform matrix multiplication over the GF(2) field. (in other words, '+' maps to XOR, and '×' maps to AND)

I usually do matrix multiplication in the Real field followed by a mod(2) operation.

A simple python example:

import numpy as np

np.random.seed(0)

A = np.random.randint(0, 2, size=(100, 200)) # binary matrix
B = np.random.randint(0, 2, size=(200, 300)) # binary matrix

C = np.mod(np.matmul(A, B), 2) # binary matrix

I have tried TensorFlow as well. It seems to be faster than NumPy for me.

When we consider large binary matrices, is there a faster way for this?

Maybe a python or c/c++ library is required.

Any help is highly appreciated.

Thanks

  • You'll need some type of library with a [carryless multiply](https://en.wikipedia.org/wiki/Carry-less_product). Native python won't be fast. A carryless multiply such as PCLMULQDQ on X86 is fast, but you'd need assembly or C | C++ intrinsics to use it. PCLMULQDQ multiplies two 64 bit operands to produce a 128 bit product (bit 127 will be zero). You'd need to use extended precision math for larger operands. – rcgldr Jul 07 '22 at 11:29

1 Answers1

0

There's a good chance that you can use Numba to your advantage here; see the answers to this question for various useful tricks. In particular, I can shave off some 85% of the total time by doing the following:

import numba
import numpy


@numba.njit(fastmath=True, cache=True, parallel=True)
def matmul_f2(m1, m2):
    mr = np.empty((m1.shape[0], m2.shape[1]), dtype=np.bool_)
    for i in numba.prange(mr.shape[0]):
        for j in range(mr.shape[1]):
            acc = False
            for k in range(m2.shape[0]):
                acc ^= m1[i, k] & m2[k, j]
            mr[i, j] = acc
    return mr
fuglede
  • 17,388
  • 2
  • 54
  • 99