0

I have been tried to do the Morris Pratt table and the code is basically this one in C:

void preMp(char *x, int m, int mpNext[]) {
int i, j;

i = 0;
j = mpNext[0] = -1;
while (i < m) {
   while (j > -1 && x[i] != x[j])
      j = mpNext[j];
   mpNext[++i] = ++j;
 }
}

and here is where i get so far in Fortran

program MP_ALGORITHM
implicit none
integer, parameter :: m=4
character(LEN=m) :: x='abac'
integer, dimension(4) :: T
integer :: i, j

i=0
T(1)=-1
j=-1

do while(i < m)
  do while((j > -1) .AND. (x(i+1:i+1) /= (x(j+i+1:j+i+1))))
    j=T(j)
  end do
  i=i+1
  j=j+1
  T(i)=j
end do
print *, T(1:)
end  program MP_ALGORITHM

and the problem is i think i am having the wrong output.

for x=abac it should be (?):

a   b  a  c

-1  0  1  0

and my code is returning 0 1 1 1

so, what i've done wrong?

wdarking
  • 159
  • 9
  • The question is whether the Morris-Pratt table should look the same in Fortran and C, because it is essentially an array of indces, so instead of the zero-based `-1, 0, 1, 0` the one-based table in Fortran might be `0, 1, 2, 1`. – M Oehm Apr 04 '16 at 07:07

2 Answers2

1

this site isn't really a debugging site. Normally I would suggest you have a look at how to debug code. It didn't take me very long to go through your code with a pen and paper and verify that that is indeed the table it produces.

Still, here are a few pointers:

  • The C code compares x[i] and x[j], but you compare x[i] and x[i+j] in your Fortran code, more or less.

  • Integer arrays usually also start at index 1 in Fortran. So just like adding one to the index in the x String, you also need to add 1 every time you access T anywhere.

chw21
  • 7,970
  • 1
  • 16
  • 31
1

The problem here is that C indices start from zero, but Fortran indices start from one. You can try to adjust the index for every array acces by one, but this will get unwieldy.

The Morris-Pratt table itself is an array of indices, so it should look different in C and Fortran: The Fortran array should have one-based indices and it should use zero as invalid index.

Together with the error that chw21 pointed out, your function might look like this:

subroutine kmp_table(x, t)
    implicit none

    character(*), intent(in) :: x
    integer, dimension(:), intent(out) :: t

    integer m
    integer :: i, j

    m = len(x)

    i = 1
    t(1) = 0
    j = 0

    do while (i < m)
        do while(j > 0 .and. x(i:i) /= x(j:j))
            j = t(j)
        end do
        i = i + 1
        j = j + 1
        t(i) = j
    end do
end subroutine 

You can then use it in the Morris-Pratt algorithm as taken straight from the Wikipedia page with adjustment for Fortran indices:

function kmp_index(S, W) result(res)
    implicit none

    integer :: res

    character(*), intent(in) :: S       ! text to search
    character(*), intent(in) :: W       ! word to find

    integer :: m                        ! zero-based offset in S
    integer :: i                        ! one-based offset in W and T

    integer, dimension(len(W)) :: T     ! KMP table



    call kmp_table(W, T)

    i = 1
    m = 0

    do while (m + i <= len(S))
        if (W(i:i) == S(m + i:m + i)) then
            if (i == len(W)) then
                res = m + 1
                return
            end if
            i = i + 1
        else
            if (T(i) > 0) then
                m = m + i - T(i)
                i = T(i)
            else
                i = 1
                m = m + 1
            end if
        end if
    end do

    res = 0

end function

(The index m is zero-based here, because t is only ever used in conjunction with i in S(m + i:m + i). Adding two one-based indices will yield an offset of one, whereas keeping m zero-based makes this a neutral addition. m is a local variable that isn't exposed to code from the outside.)

Alternatively, you could make your Fortran arrays zero-based by specifying a lower bound of zero for your string and array. That will clash with the useful character(*) notation, though, which always uses one-based indexing. In my opinion, it is better to think about the whole algorithm in the typical one-based indexing scheme of Fortran.

M Oehm
  • 28,726
  • 3
  • 31
  • 42
  • "a lower bound of zero for your string"? – francescalus Apr 04 '16 at 08:55
  • @francescalus: Well, I can say `Character, dimension(0:8)`, but I realise that this isn't really a string, but an array of characters which isn't really compatible with strings. Anyway, I think I've made it clear hat the proper solution should make use of the Fortran way of indexing. – M Oehm Apr 04 '16 at 09:10