Suppose I have a 2D array (8x8) of 0's. I would like to fill this array with a predetermined number of 1's, but in a random manner. For example, suppose I want to place exactly 16 1's in the grid at random, resulting in something like this:

[[0, 0, 0, 1, 0, 0, 1, 0],
 [1, 0, 0, 0, 0, 0, 0, 1],
 [0, 0, 1, 1, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0, 0, 0],
 [0, 0, 1, 0, 1, 1, 0, 0],
 [0, 1, 0, 0, 0, 1, 0, 0],
 [0, 1, 1, 0, 0, 0, 0, 1]]

The resulting placement of the 1's does not matter in the slightest, as long as it is random (or as random as Python will allow).

My code technically works, but I imagine it's horrendously inefficient. All I'm doing is setting the probability of each number becoming a 1 to n/s, where n is the number of desired 1's and s is the size of the grid (i.e. number of elements), and then I check to see if the correct number of 1's was added. Here's the code (Python 2.7):

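import random

length = 8
numOnes = 16
while True:
    # Each cell becomes 1 with probability numOnes / length**2.
    board = [[(random.random() < float(numOnes)/(length**2))*1 for x in xrange(length)] for x in xrange(length)]
    # Retry until the board holds exactly numOnes 1's.
    if sum([subarr.count(1) for subarr in board]) == numOnes:
        break
print board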

While this works, it seems like a roundabout method. Is there a better (i.e. more efficient) way of doing this? I foresee running this code many times (hundreds of thousands if not millions), so speed is a concern.
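
To put a rough number on how roundabout the retry loop is, the count of 1's on each attempt follows a binomial distribution, so the chance of accepting a board can be computed directly. The following is a minimal Python 3 sketch (not the Python 2.7 used above); the function name `acceptance_probability` is made up for illustration, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

def acceptance_probability(cells, ones):
    """Chance that independent draws with p = ones/cells yield exactly `ones` 1's."""
    p = ones / cells
    # Binomial probability mass at k = ones.
    return comb(cells, ones) * p**ones * (1 - p)**(cells - ones)

p_accept = acceptance_probability(64, 16)
# The number of attempts is geometric, so its mean is 1 / p_accept.
expected_attempts = 1 / p_accept

print("chance a random board is accepted: %.3f" % p_accept)
print("expected boards generated per accepted board: %.1f" % expected_attempts)
```

For the 8x8 board with 16 ones, roughly one attempt in nine is kept, and the acceptance rate falls further as the board grows.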

dpwilson

2 Answers

10

Either shuffle a list of 16 1s and 48 0s:

import random

board = [1]*16 + 48*[0]
random.shuffle(board)
board = [board[i:i+8] for i in xrange(0, 64, 8)]  # split the flat list into 8 rows

or fill the board with 0s and pick a random sample of 16 positions to put 1s in:

import random

board = [[0]*8 for i in xrange(8)]
for pos in random.sample(xrange(64), 16):  # 16 distinct flat positions
    board[pos//8][pos%8] = 1
user2357112
  • I was not aware of the existence of shuffle; thanks for pointing that out. Looking at your second example, that seems pretty obvious now. Thanks for your help! – dpwilson Nov 19 '15 at 18:37
  • For fun, I just tried timing these algorithms, and the second one is quite a bit faster than the first (31 µs vs 53 µs on my machine), and about the same as my `numpy` method (27 µs). For large boards (1 million squares), it's *much* faster than either (6 ms, vs 857 ms for method 1, and 142 ms for my method). – Matt Hall Nov 19 '15 at 18:43
  • There is a caveat with the second method: if I'm not mistaken it is not guaranteed that 16 *different* positions are selected. Perhaps that's why it is so fast. – Erwin411 Feb 16 '19 at 17:28
  • @Erwin411: It is definitely guaranteed to pick 16 different positions. `random.sample` will not return duplicates. See the [docs](https://docs.python.org/3/library/random.html#random.sample). It most likely outperformed the `shuffle` so drastically on the million-square board because shuffling a million elements is a lot more work than picking 16. – user2357112 Feb 16 '19 at 18:51
  • @user2357112: Thanks for the clarification, I should have read the docs. I was assuming that `sample` returns independent samples from some (uniform) distribution. – Erwin411 Feb 17 '19 at 22:15
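
The point settled in the comments above, that `random.sample` never returns the same position twice, can be checked with a short sketch. This is a Python 3 adaptation of the second method (`range` instead of `xrange`), generalized to any board size; the function name `make_board` and its parameters are chosen here for illustration and are not part of the answer.

```python
import random

def make_board(rows, cols, ones):
    """Return a rows x cols list of lists containing exactly `ones` 1's."""
    board = [[0] * cols for _ in range(rows)]
    # random.sample draws without replacement, so every position is distinct.
    for pos in random.sample(range(rows * cols), ones):
        board[pos // cols][pos % cols] = 1
    return board

board = make_board(8, 8, 16)
total_ones = sum(map(sum, board))
print(total_ones)
```

Because the positions are distinct, the count of 1's always equals the requested number, so no retry loop is needed.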
2

I made the ones, made the zeros, concatenated them, shuffled them, and reshaped.

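import numpy as np
def make_board(shape, ones):
    # np.int and np.product were removed from recent NumPy releases; use int and np.prod.
    o = np.ones(ones, dtype=int)
    z = np.zeros(np.prod(shape) - ones, dtype=int)
    board = np.concatenate([o, z])
    np.random.shuffle(board)  # shuffles the flat array in place
    return board.reshape(shape)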

make_board((8,8), 16)

Edit.

For what it's worth, user2357112's approach with numpy is fast...

def make_board(shape, ones):
    size = np.prod(shape)
    board = np.zeros(size, dtype=int)
    # replace=False is required; the default (replace=True) can pick the same cell twice.
    i = np.random.choice(size, ones, replace=False)
    board[i] = 1
    return board.reshape(shape)
Matt Hall
  • Thanks for providing another alternative! I always like seeing the various ways people go about solving these things. – dpwilson Nov 19 '15 at 19:36
  • Beware of using `np.random.choice` for this; with `replace=True` (the default), it is not guaranteed to pick 16 different cells, and with `replace=False`, it is really slow due to a poor implementation decision that they're stuck with for backward compatibility (it performs a complete shuffle instead of shuffling just enough to produce the sample). – user2357112 Feb 16 '19 at 18:53
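
The idea of "shuffling just enough to produce the sample" mentioned in the last comment can be illustrated in plain Python. This is only a sketch of a partial Fisher-Yates shuffle, not NumPy's actual implementation; the function name `partial_shuffle_sample` is hypothetical, and the code targets Python 3.

```python
import random

def partial_shuffle_sample(n, k):
    """Pick k distinct indices from range(n) using only k swap steps."""
    pool = list(range(n))
    for i in range(k):
        # Swap a uniformly chosen element from the unfixed tail into slot i.
        j = random.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    # The first k slots now hold a uniform sample without replacement.
    return pool[:k]

positions = partial_shuffle_sample(64, 16)
flat = [0] * 64
for pos in positions:
    flat[pos] = 1
board = [flat[i:i + 8] for i in range(0, 64, 8)]
print(board)
```

Building `pool` still takes time proportional to the board size, so this version only cuts the number of swaps from n to k; a full shuffle would run the same loop over all n slots.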