I am looking for the fastest way to solve the following problem:
Given some volume of lattice points in a 3D grid, some points b_i
(the boundary) satisfy f(b_i)=0
, while another point a_0
satisfies f(a_0)= 1
.
All other points (non-boundary) are some linear combination of the surrounding 26 points. For example, I could want
f(i, j, k)= .05*f(i+1, j, k)+.1*f(i, j+1, k)+.07*f(i, j, k+1)+...
The sum of the coefficients .05+.1+.07+...
will add up to 1
. My objective is to solve for f(x_i)
for all x_i
in the volume.
Currently, I am using the successive over-relaxation (SOR) method, which basically initializes the boundary of the volume, assigns to each point the weighted average of the 26 surrounding points, and repeats. The SOR method just takes a combination of f(x_i)
after the most recent iteration and f(x_i)
after an iteration before.
I was wondering if anyone knows of any faster ways to solve this problem for a 3D grid around the size 102x102x48. SOR currently takes about 500-1000 iterations to converge to the level I want (varying depending on the coefficients used). I am most willing to use matlab, idl, and c++. Does anyone have any idea of how fast SOR is compared to converting the problem into a linear system and using matrix methods (like BCGSTAB)? Also, which method would be most effectively (and easily) parallelized? I have access to a 250 core cluster, and have been trying to make the SOR method parallel using mpi and c++, but haven't seen as much speed increase as I would like (ideal would be on the order of 100-fold). I would greatly appreciate any ideas on speeding up the solution to this problem. Thanks.