
I have a question about two cellular automata algorithms I've written in D.

I wrote this one first and was pretty disappointed with the speed. With a size of sizex=64, sizey=27 and an initial random fill of 40%, a single run takes a noticeable delay of about 5 seconds on my fairly modern laptop.

(Position is a struct with two ints, x and y, plus a few methods for different calculations, and DIR is an enum of Positions holding the 8 tile directions):

Position[] runCellAutoma(uint sizex,uint sizey,Position[] poslist){   
Position[] newposlist;
uint sy;
bool alive;
while(sy<=sizey){
    for(uint sx;sx<=sizex;++sx){
        int ncount;
        Position checkpos = Position(sx,sy);
        foreach_reverse(Position pos;poslist){               
            if(checkpos == pos){
                alive = true;
            }
            foreach(POS; [EnumMembers!DIR]){
                if(checkpos+POS == pos){
                    ++ncount;
                }
            }               
        }
        if(alive){
            if(ncount >= 2 && ncount <  4 && sx > 0 && sy>0 && sx<sizex && sy<sizey){
                 newposlist ~= checkpos;
            }
        }
        if(!alive){
            if(ncount >= 2 && ncount < 3&&sx>0&&sy>0&&sx<sizex&&sy<sizey){
                newposlist ~= checkpos;                                               
            } 
        }
    }
    ++sy;
}
return newposlist;
}

I read a bit more about cellular automata and it seemed like a lot of them worked with 2D arrays of bools. So I rewrote my algorithm like this:

Position[] runCellAutoma2(uint sizex,uint sizey,Position[] poslist){

Position[] newposlist;
bool[][] cell_grid = new bool[][](sizey,sizex);
for(int y;y<sizey;++y){
    for(int x;x<sizex;++x){           
        cell_grid[y][x] = false;            
    }
}
foreach(Position pos; poslist){
    if(pos.y < sizey && pos.x < sizex&&pos.y >=0 && pos.x >= 0){
        cell_grid[pos.y][pos.x] = true;
    }       
}
foreach(y,bool[] col; cell_grid){
    foreach(x,bool cell;cell_grid[y]){
        int ncount;           
        foreach(POS; [EnumMembers!DIR]){
            if(y+POS.y < sizey && x+POS.x < sizex&&y+POS.y >=0 && x+POS.x >= 0){                                    
               if(cell_grid[y+POS.y][x+POS.x]){
                    ++ncount;                                                       
                }                   
            }
        }
        if(cell){
            if(ncount >= 2 && ncount <  4 && x > 0 && y>0 && x<sizex && y<sizey){
                newposlist ~= Position(to!int(x),to!int(y));
            }
        }
        if(!cell){
            if(ncount >= 2 && ncount < 3&&x>0&&y>0&&x<sizex&&y<sizey){
                newposlist ~= Position(to!int(x),to!int(y));                                              
            } 
        }            
    }
}
return newposlist;
}

Both give me the same result, but the second version runs instantly with the same arguments as the first. Why does the first one run so much slower than the second one?

Thank you.

Jean-Baptiste Yunès
Grawprog
  • With your first data layout, you need to **search** for the relevant neighbors, with the second, you **look up** the relevant neighbors. It should be clear, why the second variant is much faster. Plus, with the second data layout, you have a smaller memory allocation. But that's a minor issue in your example. – cmaster - reinstate monica Dec 08 '17 at 06:38
  • Is it because I loop through the list of positions for every direction in the first one, while the second one just checks the index for each direction? I'm sorry, I guess that should have been obvious if that's what you mean. I can understand why the second one would be faster; I just don't understand why the difference is so noticeable. – Grawprog Dec 08 '17 at 06:50
  • You have `64*27 = 1728` places in your grid, over which you iterate in the outer two loops in variant 1. In each of these 1728 iterations, you are iterating over `40%*1728 = 691` `Position` objects. That's `1728*691 = 1194048` iterations total. In variant 2, I guess that your compiler is heavily optimizing the `foreach(POS; ...` loop, boiling it down to eight direct accesses to the relevant neighbors, so you only have `1728*8 = 13824` iterations. That's a speedup factor of 86x. Any questions? – cmaster - reinstate monica Dec 08 '17 at 07:06
  • No, thank you, that makes sense. I appreciate you taking the time to explain. – Grawprog Dec 08 '17 at 15:32
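
The iteration counts from the comments above can be checked with a small D sketch. It is only an illustration under stated assumptions: `Pos`, `dirs`, `neighboursByScan` and `neighboursByLookup` are hypothetical names standing in for the question's `Position`, `DIR` and loop bodies, the fill is a fixed pattern of roughly 40% rather than a random one, and it counts loop steps instead of measuring time.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the question's Position struct.
struct Pos { int x, y; }

// The eight neighbour offsets (assumed to match the question's DIR enum).
immutable Pos[8] dirs = [
    Pos(-1, -1), Pos(0, -1), Pos(1, -1),
    Pos(-1,  0),             Pos(1,  0),
    Pos(-1,  1), Pos(0,  1), Pos(1,  1)];

// Variant 1: search the whole list of live positions for each cell.
// `steps` counts how many list entries were visited.
int neighboursByScan(Pos cell, const Pos[] live, ref ulong steps)
{
    int n;
    foreach (p; live)
    {
        ++steps;
        foreach (d; dirs)
            if (cell.x + d.x == p.x && cell.y + d.y == p.y)
                ++n;
    }
    return n;
}

// Variant 2: look the eight neighbours up directly in a grid.
// `steps` counts how many grid lookups were attempted.
int neighboursByLookup(Pos cell, const bool[][] grid, ref ulong steps)
{
    int n;
    foreach (d; dirs)
    {
        ++steps;
        immutable int x = cell.x + d.x;
        immutable int y = cell.y + d.y;
        if (y >= 0 && y < grid.length && x >= 0 && x < grid[y].length && grid[y][x])
            ++n;
    }
    return n;
}

void main()
{
    enum int sizex = 64;
    enum int sizey = 27;

    // Fixed fill of about 40% (2 cells out of every 5), not random.
    Pos[] live;
    auto grid = new bool[][](sizey, sizex);
    foreach (i; 0 .. sizex * sizey)
    {
        if (i % 5 < 2)
        {
            live ~= Pos(i % sizex, i / sizex);
            grid[i / sizex][i % sizex] = true;
        }
    }

    ulong scanSteps, lookupSteps;
    foreach (y; 0 .. sizey)
        foreach (x; 0 .. sizex)
        {
            immutable a = neighboursByScan(Pos(x, y), live, scanSteps);
            immutable b = neighboursByLookup(Pos(x, y), grid, lookupSteps);
            assert(a == b); // both variants count the same neighbours
        }

    writeln("list scan steps:   ", scanSteps);
    writeln("grid lookup steps: ", lookupSteps);
}
```

The scan count grows with cells × live positions, while the lookup count is always cells × 8, which is where the gap described in the comments comes from. The original first version additionally compares each list entry against all eight directions, so its real work is larger still.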
