I am rewriting a chess engine I wrote to run on magic bitboards. I have the magic functions written and they take a square of the piece and an occupancy bitboard as parameters. What I am having debates with myself is which one of these board representation schemes is faster/more practical:
scheme 1: There is a bitboard for each type of piece, 1 for white knights, 1 for black rooks. . . , and in order to generate moves and push them to the move stack, I must serialize them to find the square of the piece and then call the magic function. Then I must serialize that move bitboard and push them. The advantage is is that the attacking and occupancy bitboards are closer at hand.
scheme 2: A simple piece centric array [2][16] or [32] contains the square indices of the pieces. A simply loopthrough and call of the functions is all it takes for the move bitboards. I then serialize those bitboards and push them to the move stack. I also have to maintain an occupancy bitboard. I guess getting an attack bitboard shouldn't be any different: I have to once again generate all the move bitboards and, instead of serializing them, I bitwise operate them in a mashup of magic.
I'm leaning towards scheme 2, but for some reason I think there is some sort of implementation similar to scheme 1 that is standard. For some reason I can't find drawbacks of making a "bitboard" engine without actually using bitboards. I wouldn't even be using bitboards for king and knight data, just a quick array lookup.
I guess my question is more of whether there is a better way to do this board representation, because I remember reading that keeping a bitboard for each type of piece is standard (maybe this is only necessary with rotated bitboards?). I'm relatively new to bitboard engines but I've read a lot and I've implemented the magic method. I certainly like the piece centric array approach - it makes a lot of arbitrary stuff like printing the board to the screen easier, but if there is a better/equal/more standard way can someone please point it out? Thanks in advance - I know this is a fairly specific question and difficult to answer unless you are very familiar with chess programming.
Last minute question: how is the speed of a lookup into a 2D array measure up to using a 1D array and adding 16 * team_side to the normal index to lookup the piece?
edit: I thought I should add that I am valuing speed over almost all else in my chess implementation. Why else would I go with magic bitboards as opposed to simply arrays with slide data?