efficiently listing distinctive permutations that meet 'adjecancy requirements'

Question

I'm trying to store distinctive permutations that meet additional (adjecancy) requirements. Below is a figure to illustrate the problem I am trying to solve.

I'm trying to make code that lists all distinctive permutations while taking adjecancy requirements into account (rule 3 in the figure). Two figures that show an acceptable and an unacceptable solution:

I first decided to model this problem as an array or string, where each index corresponds with a tile on the map (see figure on the bottom for the mapping). e.g. for the 7 blue tiles example:

['B1','R','B2','R','G','G','G','R','B2','R','B2','R','B1','R','B3','R','B3','R']

I then considered listing all distinct permutations, and filtering out infeasible solutions afterwards. For this I tried 'distinct_permutations()' from the library more_itertools. However, (unsurprisingly) this results in 183.783.600 candidate solutions that need to be checked for the given example.

from more_itertools import distinct_permutations 
solutions =
list(distinct_permutations(['T', 'R', 'T', 'U', 'R', 'T', 'I', 'T',
'O', 'T', 'U', 'R', 'T', 'I', 'T', 'O', 'T', 'U']

I was wondering whether it would be possible to code this in a more efficient manner; where adjecancy requirements are taken into account to directly exclude all, or at least a 'large proportion' of the infeasible solutions (instead of filtering them out afterwards).

One idea that I had was to use 'pairs' of tiles (e.g. 'B1+R'). Treating such 'pairs' as if they were single tiles would reduce the number of 'elements' considered in the permutations. However, while this might be helpfull on the edges, the centre of the map would mess things up.

Could anyone help me with finding an approach or solution to code this problem more efficiently? A nudge in the right direction would be greatly appreciated!

Start placing blues first, then their red neighbors. This way you elliminate all those failures about blue+2_red_neighbors cases. — huseyin tugrul buyukisik, May 11 '22 at 11:34
Thank you for your reply. That indeed seems like a good idea. Do you have an idea how to code it efficiently though? The best I could come up with, would be a nested for loop that iteratively 'fills in' more tiles. Given that my desired output is a list of strings (or nested list), I don't know if a nested for loop would be the most efficient direction... Anyone any insights on that? — bprsch, May 11 '22 at 14:13
What about numbers? It has 18 tiles. If there are only 3 colors, this makes 2 bits of data per tile. 18 tiles X 2 bits = 36 bits. A long-integer (64bit) variable could hold a whole instance. Perhaps some math or bitwise operations can give speed? For example, to check if a blue tile on least-significant 2 bits has two neighbors, you could check ```mask <-- AND operator --> value```? Blues can have 18 different positions so you would need only 36 mask variables to check. — huseyin tugrul buyukisik, May 11 '22 at 14:28

huseyin tugrul buyukisik · Answer 1 · 2022-05-11T16:51:12.390

If you start generating from blue colors first then place reds next, you don't have to check for collisions and the number of permutations required is decreased. But if you need more performance than that, you can change to another data type. Perhaps 2 bits per tile, 4 tiles per byte, 32 tiles per 64bit integer.

If you don't want to optimize it by generating blues first, then finding collisions on specific indices can be simulated by masked collisions using integers:

separate arrays into three
blue on/off array --> 32 bit integer for whole tile pattern
red on/off array --> 32 bit integer for whole tile pattern
green on/off array --> 32 bit integer for whole tile pattern

start

 blue channel:                    0 0 0 0 1 0 0 0 0 (1 blue tile)
                                            
 mask to count red neighbors:     0 0 0 1 0 1 0 0 0 (2 red tiles)
                                         AND
 get red channel                  1 1 1 0 0 1 0 0 0 (only 1 collides)
                                         =
 result                           0 0 0 0 0 1 0 0 0
                                   int.bit_count()   
 bit count =  1 ==> bad

start (check collisions for 2 blues vs 4 reds at once)

 blue channel               :        0 0 0 0 1 0 0 1 0 (2 blues)
                                            
 mask to count red neighbors:        0 0 0 1 0 1 1 0 1 (4 reds)
                                           AND
 get red channel                     1 1 1 1 0 1 1 0 1 (4 collides)
                                            =
 result                              0 0 0 1 0 1 1 0 1
                                      int.bit_count()   
 bit count =  4 ==> acceptable

So, if 000010010 means 17, then mask[17] from array of masks can be applied on red channel to test the condition.

Something like this:

mask   = masks[blue_channel]
result = (mask and red_channel)
count  = result.bit_count()
count2 = count_masks[blue_channel]
if count == count2:
    print("ok")

Since there are 18 tiles, each constant mask array can have at most 2^18 elements (less than a million). When permutation generator is generating the tile patterns in sorted order, the mask-array access pattern can be cache-friendly.

If this causes too high cache-miss, then each mask can be computed at the cost of few CPU cycles:

blue:                0 0 0 0 1 0 0 0
compute left shift:  0 0 0 1 0 0 0 0
compute right shift: 0 0 0 0 0 1 0 0 
OR the two shifts:   0 0 0 1 0 1 0 0 --> collision mask

blue:                0 0 1 0 1 0 0 0
compute left shift:  0 1 0 1 0 0 0 0
compute right shift: 0 0 0 1 0 1 0 0 
OR the two shifts:   0 1 0 1 0 1 0 0 --> collision mask

but this would work only for the surface tiles. The interior tiles can have 3 neighbors like this:

blue:                0 0 0 0 1 0 0 0 
hash1                1 0 0 0 0 0 0 0
hash2                0 0 1 0 0 0 0 0 
hash3                0 0 0 0 0 1 0 0 
OR                   1 0 1 0 0 1 0 0 --> collision mask

Finding a hash function can be tricky. Perhaps only interior tiles could access a mask array while surface tiles are only computed using bitwise operations. Perhaps, if you reorder tiles in bit positions, you can come up with a pattern that can work with shifting for interior tiles too.

For example, according to your mapping,

tile 12 checks: 11, 13, 17.
- 11: left shift by 1 bit
- 13: right shift by 1 bit
- 17: right shift by 5 bits
tile 14 checks: 13,15, 3.
- 13: left shift 1
- 15: right shift 1
- 3: left shift by 11 --> not similar to tile 12's operation

it doesn't work immediately. It requires some extra mapping. You can just access a bit-pattern array of 2^6 number of elements for interior vs surface and 2^9 elements for surface vs interior. This fits inside L1 cache for faster testing of third bit pattern.

If problem always asks 6 blue tiles, then this masked-collision method computes 6 of them at once using 4-5 bit-wise operations and 6 array accesses. String version would require 12 array accesses. So it should give performance by accessing memory less.

Simpler approach:

tile map looks like 3 hexagons collided
1 byte per hexagon
simple AND between center and surface x3 (and opposite)
simple AND with self x 3

Thank you very much for your answer. It is a bit technical for me (my background is industrial engineering rather than e.g. computer science.). I'll take some time to really dig into it. From a glance from your second approach: do you mean to use a binary 'format/object' with 8 bits to 'express' all different ways 'to turn blue tiles on or off'? From a practical point of view (I'm most comfortable with Python): could I use e.g. a 8*n numpy array for this. (If i understand correctly I can use multiple of such arrays for both blue and red to 'easily identify collisions') — bprsch, May 13 '22 at 14:03
In both versions its basically same thing with different patterns. Bit level parallelism is sometimes useful to check multiple things at once but please first benchmark the first string version that you place blue first maybe it makes fast enough already so bitwise hack may not be required at all. In last part I meant 8bit integer for each color or char as its enough to hold a hexagon. Did you benchmark string version where you start with blue first? — huseyin tugrul buyukisik, May 13 '22 at 15:00

efficiently listing distinctive permutations that meet 'adjecancy requirements'

1 Answers1