
Recently I came across a Twitch streamer who was working on his own deep-learning-based chess engine. I went through the code shown in the video, and one thing I didn't quite understand was why he used logical shifts when preparing the input data (i.e. the chess board representations) for training. Here are the rough steps he followed:

  1. He fetched a dataset of chess games in PGN format
  2. Each move in each game produces a new board state. Each of these new states is serialized in the following way:
  • He creates an 8x8 matrix that represents the 8x8 board after this specific move
  • The matrix stores 8-bit unsigned integers
  • He places all chess pieces on the board (i.e. in the matrix)
  • The white pieces are defined as follows: {"P": 1, "N": 2, "B": 3, "R": 4, "Q": 5, "K": 6}
  • The black pieces are defined as: {"p": 9, "n": 10, "b": 11, "r": 12, "q": 13, "k": 14}
  • This means, for instance, that a white pawn is stored as 1 in the matrix, whereas a black queen is stored as 13
  3. After serializing the board he generates the final board state from the original 8x8 matrix by executing some bitwise operations that I don't quite understand. Also, the newly generated (i.e. final) board state is not 8x8 but 5x8x8:

 import numpy as np

 # Init new board state
 final_boardstate = np.zeros((5, 8, 8), np.uint8)

 # old_boardstate is the initial 8x8 matrix containing uint8 values

 # Bit operations that I don't understand
 final_boardstate[0] = (old_boardstate >> 3) & 1
 final_boardstate[1] = (old_boardstate >> 2) & 1
 final_boardstate[2] = (old_boardstate >> 1) & 1
 final_boardstate[3] = (old_boardstate >> 0) & 1
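For reference, the serialization described in step 2 could be sketched roughly like this (my own minimal reconstruction, not the streamer's actual code; the serialize helper and the hard-coded piece placement are hypothetical):

```python
import numpy as np

# Piece encodings from step 2
PIECES = {"P": 1, "N": 2, "B": 3, "R": 4, "Q": 5, "K": 6,
          "p": 9, "n": 10, "b": 11, "r": 12, "q": 13, "k": 14}

def serialize(placements):
    """Build the 8x8 uint8 board matrix from {(row, col): piece symbol}."""
    board = np.zeros((8, 8), np.uint8)
    for (row, col), symbol in placements.items():
        board[row, col] = PIECES[symbol]
    return board

# Toy position: white king, black king, one white pawn
old_boardstate = serialize({(7, 4): "K", (0, 4): "k", (6, 0): "P"})
print(old_boardstate[7, 4], old_boardstate[0, 4], old_boardstate[6, 0])  # 6 14 1
```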

I was wondering, can anyone help me understand some of the logic behind these operations? As far as I understand, he wants to create five different 8x8 board representations, each based on a different logical shift (3-, 2-, 1-, and 0-bit logical right shifts). However, I am not completely sure that this assumption is correct, and I don't really know what the reasoning is behind running these operations in the context of chess board representations.

cru3lgenius

1 Answer


These are the pieces in binary:

 P: 0001  N: 0010  B: 0011  R: 0100  Q: 0101  K: 0110
 p: 1001  n: 1010  b: 1011  r: 1100  q: 1101  k: 1110

You can see that the leftmost bit of all black pieces is always one, and the leftmost bit of all white pieces is always zero. That's why the values 7 and 8 have been skipped. With

(old_boardstate >> 3) & 1

the color-indicating bit is shifted all the way to the right. The & 1 masks off everything except the wanted bit. So this expression returns 1 if the piece is black, otherwise 0. The three lower bits indicate the piece type independent of the color. The bit operations that you don't understand extract the individual bits from the 8-bit integers so they can be stored in the NumPy array. That array is the input for the neural network and has the 5x8x8 dimensions because five input neurons are used to represent each square of the board.
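To make the bit extraction concrete, here is a small runnable sketch (the toy position is my own, chosen only for illustration):

```python
import numpy as np

# Hypothetical toy position: a white pawn (1) and a black queen (13)
old_boardstate = np.zeros((8, 8), np.uint8)
old_boardstate[6, 0] = 1    # P = 0b0001 (white pawn)
old_boardstate[0, 3] = 13   # q = 0b1101 (black queen)

final_boardstate = np.zeros((5, 8, 8), np.uint8)
for i in range(4):
    # Plane 0 gets bit 3 (the color bit), plane 3 gets bit 0
    final_boardstate[i] = (old_boardstate >> (3 - i)) & 1

# Black queen: 13 = 0b1101 -> planes 0..3 read 1, 1, 0, 1
print(final_boardstate[:, 0, 3])  # [1 1 0 1 0]
# White pawn:   1 = 0b0001 -> planes 0..3 read 0, 0, 0, 1
print(final_boardstate[:, 6, 0])  # [0 0 0 1 0]
```

Note that the color bit of every piece ends up in plane 0, so that plane alone tells the network which side each piece belongs to.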

nikohass
  • I still don't understand why the bits are shifted in this order 0, 1, 2, 3. Why these numbers? – user134 Mar 20 '23 at 21:51
  • You shift from 0 to 3 because you want to read out each of the 4 bits. You want to get the bit at index 0, 1, 2, and 3. 1101 >> 0 & 1 = 1 for the bit at index 0, and 1101 >> 1 & 1 = 0 for the bit at index 1. It doesn't really matter in which order you store the bits in the array, as long as it always stays in the same order. It makes no difference to a neural network, because it will learn the encoding regardless of the ordering of the input neurons. – nikohass Mar 21 '23 at 22:52
  • Thanks, @nikohass. Now it's more clear what these operations do, however, I still don't get what context the bits bring to the neural network. I guess the answer here is more complicated and it can't be expressed in a sentence or two, but still any tip will be helpful. – user134 Mar 22 '23 at 09:30