Fortran version of python loop giving incorrect answer

Question

I've been working on an international trade model and the model has gotten really slow (sometimes taking weeks at a time to finish). Mostly, there was a big for loop that was slowing the process down, as shown here (in Python):

for s in xrange(self.S-1):
                c_matrix[:,s+1,s+1:self.T+s+1] = ((self.beta * (1-self.MortalityRates[:,s,s:self.T+s]) * (1 + r_path[s+1:self.T+s+1] - self.delta)\
                                                * psi[:,s+1,s+1:self.T+s+1])/psi[:,s,s:self.T+s])**(1/self.sigma) * c_matrix[:,s,s:self.T+s]*np.exp(-self.g_A)

                a_matrix[:,s+1,s+1:self.T+s+1] = (  (we[:,s,s:self.T+s] + (1 + r_path[s:self.T+s] - self.delta)*a_matrix[:,s,s:self.T+s] + bqvec_path[:,s,s:self.T+s])\
                                                -c_matrix[:,s,s:self.T+s]*(1+we[:,s,s:self.T+s]*(self.chi/we[:,s,s:self.T+s])**self.rho)  )*np.exp(-self.g_A)

I won't go into too much detail about each piece, but I decided to make a Fortran module that, in theory, should do the exact same thing but should run significantly faster. I'm using Cython and distutils to wrap the Fortran module. Here is the .f90 file (focusing specifically on the a_matrix part of the for loop above) for your reference:

MODULE FILLA

USE ISO_C_BINDING, ONLY: C_DOUBLE, C_INT

IMPLICIT NONE

CONTAINS

SUBROUTINE      

FILLASSETS(A_MATRIX,WE,R_PATH,BQ_VECPATH,C_MATRIX,I,S,T,DELTA,CHI,RHO,G_A) BIND(C)

  INTEGER(C_INT), INTENT(IN) :: I,S,T
  INTEGER :: X, E, L
  REAL(C_DOUBLE), INTENT(IN) :: DELTA, CHI, RHO, G_A
  REAL(C_DOUBLE), DIMENSION(0:I-1,0:S-1,0:T+S-1),INTENT(IN) :: C_MATRIX, WE, BQ_VECPATH
  REAL(C_DOUBLE), DIMENSION(0:T+S-1), INTENT(IN) :: R_PATH
  REAL(C_DOUBLE), DIMENSION(0:I-1,0:S,0:T+S-1), INTENT(INOUT) :: A_MATRIX
  REAL(C_DOUBLE), DIMENSION(0:I-1,0:S-1,0:T+S-1):: R_PATHM

  DO E = 0, I-1
    DO L = 0, S-1
        R_PATHM(E,L,:)=R_PATH(:)
    ENDDO
  ENDDO

  DO X = 0, S-2
        A_MATRIX(:,X+1,X+1:T+X+1) = ((WE(:,X,X:T+X)+(1.0D0+R_PATHM(:,X,X:T+X)-DELTA)&
                                    &*A_MATRIX(:,X,X:T+X)+BQ_VECPATH(:,X,X:T+X))&
                                    &-C_MATRIX(:,X,X:T+X)*(1.0D0+WE(:,X,X:T+X)*(CHI/WE(:,X,X:T+X))&
                                    &**RHO))*EXP(-G_A)
  ENDDO
END SUBROUTINE FILLASSETS

END MODULE

And this .pyx file is how the arrays from python get passed into Fortran:

from numpy cimport ndarray as ar
cdef extern from "filla.h":
        void fillassets(double* A_Matrix, double* we, double* r_path, double* bq_vecpath, double* c_matrix, int* I, int* S, int* T, double* delta, double* chi, double* rho, double* g_A)

cpdef f_filla(ar[double, ndim=3] A_Mat, ar[double, ndim=3] w_e, ar[double, ndim=1] r_path, ar[double, ndim=3] bq_vecpath, ar[double,ndim=3] c_mat, double delta, double chi, double rho, double g_A):
        cdef int I,S,T,Q
        if A_Mat.flags['C_CONTIGUOUS'] and w_e.flags['C_CONTIGUOUS'] and r_path.flags['C_CONTIGUOUS'] and bq_vecpath.flags['C_CONTIGUOUS'] and c_mat.flags['C_CONTIGUOUS']:
                I=c_mat.shape[0]
                S=c_mat.shape[1]
                Q=c_mat.shape[2]
                T=Q-S
                fillassets(&A_Mat[0,0,0],&w_e[0,0,0],&r_path[0],&bq_vecpath[0,0,0],&c_mat[0,0,0],&I,&S,&T,&delta,&chi,&rho,&g_A)
        else:
                raise ValueError("Input array U is not C-contiguous")

So I've set it up without any errors, but when I run them side by side, the Fortran module returns incorrect values in the asset matrix, for some reason. I think it's something relating to how Fortran and Python handle arrays, but I haven't found a straightforward explanation and hope that I could find some help here. I'm very unfamiliar with Fortran and so it's probably something simple that I'm missing. I tried to keep this as brief as possible, so if there are details I've left out, let me know.

EDIT: I'm using Mac OS X El Capitan and the gfortran complier.

Fortran is column major, Python et al row major. Perhaps that's the problem. — karakfa, Jan 30 '16 at 20:13
Always use tag [tag:fortran] for Fortran questions. Add a version i necessary to distinguish. Moreover, your code isn't Fortran 90, but Fortran 2003. — Vladimir F Героям слава, Jan 30 '16 at 20:22
Note you can use Fortran-ordered arrays in Python without any problem http://stackoverflow.com/questions/7688304/how-to-force-numpy-array-order-to-fortran-style — Vladimir F Героям слава, Jan 30 '16 at 20:24
However nice is Fortran for this sort of thing, at this level of complexity you're really better off just writing the code in cython using typed memoryviews. — ev-br, Feb 02 '16 at 20:54

score 1 · Answer 1 · answered Jan 30 '16 at 20:10

Fortran uses the opposite layout of array elements: if you have an array of dimensions (m,n) then elements would be laid out column by column, that is, the second dimension varies slowest. C (and maybe python?) do it the other way around, laying out row after row, so the first dimension varies slowest. Thus, if you have a C matrix of size (m,n) and pass it to Fortran, you'd need to tell it the dimensions are (n,m).

Fortran version of python loop giving incorrect answer

1 Answers1