2

I have a project in an embedded system (NodeMCU running MicroPython), where I need to store a very large array of variables, which have values of either 0 or 1. I need to be able to read/write them individually or via loops in a convenient way. For this example, I am filling the array with random integers between 0 and 1:

N = 50
table = [[randInt(0,1) for i in range(N)] for j in range(N)]

On my NodeMCU, even such a small array (2500 items) is enough to exceed the NodeMCU memory limits, crashing my script. I suppose this is because that in Python, int is an object with a lot of overhead. Since in my case I do not need the capacity of int variable - actually, 0 or 1 could be stored as a bit - how can I create and fill an array with the least-memory-consuming variables? Say, like in this example, randomizing between 0 and 1. I reviewed the uctypes, but as I'm new to Python, I couldn't get these to work. Or is there another way? How can create such an array with the least memory usage possible?

juanpa.arrivillaga
  • 88,713
  • 10
  • 131
  • 172

2 Answers2

1

I would use a bytearray to store the data. You can store 8 bits per byte by calculating the index for the byte and the bit in that byte that corresponds to a particular row and column. Once you have the indexes you can use bit shifts and bitwise operators to set, clear or retrieve the value for that row and column.

class bitarray2d():
    """create 2d array of bits cols wide and rows high"""
    def __init__(self, cols, rows):
        self.bits = bytearray(cols * rows >> 3)
        self.cols = cols
        self.rows = rows

    def set(self, column, row):
        """set the bit that corresponds to the column and row given"""
        bit = (row * self.cols) + column
        self.bits[bit >> 3] |= 1 << 7 - (bit % 8)

    def clear(self, column, row):
        """clear the bit that corresponds to the column and row given"""
        bit = (row * self.cols) + column
        self.bits[bit >> 3] &= ~(1 << 7 - (bit % 8))

    def get(self, column, row):
        """get the value of the bit that corresponds to the column and row given"""
        bit = (row * self.cols) + column
        return 1 if self.bits[bit >> 3] & 1 << 7 - (bit % 8) else 0

    def print(self):
        """print the bitarray as a 2d array"""
        for row in range(self.rows):
            for column in range(self.cols):
                print(self.get(column, row), end="")
            print()

    def line(self, x0, y0, x1, y1):
        """draw a line starting at x0, y0 and ending at x1, y1"""
        steep = abs(y1 - y0) > abs(x1 - x0)
        if steep:
            x0, y0 = y0, x0
            x1, y1 = y1, x1
        if x0 > x1:
            x0, x1 = x1, x0
            y0, y1 = y1, y0
        dx = x1 - x0
        dy = abs(y1 - y0)
        err = dx >> 1
        ystep = 1 if y0 < y1 else -1
        while x0 <= x1:
            if steep:
                self.set(y0, x0)
            else:
                self.set(x0, y0)
            err -= dy
            if err < 0:
                y0 += ystep
                err += dx
            x0 += 1

# Test routine
columns = 64
rows = 64

bits = bitarray2d(columns, rows)
bits.line(0, 0, columns-1, rows-1)
bits.line(columns-1, 0, 0, rows-1)
bits.print()
Russ Hughes
  • 175
  • 7
0

You are correct, that int objects have a lot of overhead. However, in this case, these int objects may actually be cached (in CPython they would be), so the only overhead should be just the pointer...

However, a pointer will still require a machine word, to not rely on such things (implementation details) and to pack things more tightly, you could use actual arrays, you are currently using a list of list objects.

The arrays module provides object-oriented wrappers around primitive, C-like numeric arrays. Unfortunately, they do not provide multidimensional arrays.

So you could try something like:

import array
import random

N = 50

def build_table(N):
    rng = range(N)
    result = []
    for _ in rng:
        arr = array.array('B') #unsigned byte
        for _ in rng:
            arr.append(random.randint(0,1))
        result.append(arr)
    return result

table = build_table(N)

If this were CPython, I would suggest the bitarray module for maximum efficiency. I have no idea if that is available for micropython, but you could implement something like that yourself, likely on top of an array.array. There are many examples of this, it is sort of a classic data structure from the times when memory was measured in bytes. Here's just one example from the Python wiki.

juanpa.arrivillaga
  • 88,713
  • 10
  • 131
  • 172
  • Thank you for your suggestion and this code example. Are you sure it is correct? I get a "name 'B' isn't defined" error. – Justinas Rubinovas Mar 31 '20 at 21:23
  • Okay, that works now, but I still don't get it... Shouldn't it produce a 2D array? When I print the "table", console prints: [array('B', [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1])] – Justinas Rubinovas Mar 31 '20 at 21:58
  • Oh, right! I am still getting the hang of Python indentations, didn't notice that one. Alright, this is now much better. But still, I am using a byte to store just 0 or 1. Is it possible to use a bit instead? I can't find bit as a datatype that array() could take as an argument. – Justinas Rubinovas Mar 31 '20 at 22:54
  • @JustinasRubinovas no, as I stated, there is a [bitarray library](https://pypi.org/project/bitarray/) available available for CPython, but I don't know if it will work with micropython, I doubt it though. You could implement your own on top of an `array.array`, though. depending on the complexity o you task, this is relatively achievable, and there's probably lots of guides out there since this is sort of a classic data structure. EDIT: The [python wiki](https://wiki.python.org/moin/BitArrays) even has a basic implementation – juanpa.arrivillaga Mar 31 '20 at 23:05
  • If you're going to post an answer, you should know that the answer will work on Micropython. – Patrick Apr 07 '20 at 17:30
  • @Patrick are you asying `array` isn't available in Micropython? – juanpa.arrivillaga Apr 07 '20 at 17:39
  • Hi @juanpa.arrivillaga in your answer you say you don't know if this will work in Micropython. A good answer would not be "maybe this will work" it would not leave that open to the reader to figure out. As is, your answer is more a comment than an answer. – Patrick Apr 07 '20 at 18:29
  • @Patrick the answer I posted in the question *does* work in Micropython, and does decrease the memory requirements. I said if the OP wants to go even further, they'd need to us a bitarray. – juanpa.arrivillaga Apr 07 '20 at 18:31
  • I suggest deleting the part at the end then – Patrick Apr 07 '20 at 20:15