Is there an implementation of the compressed suffix array Psi in Python? I understand how suffix arrays work and how to get Psi from a given suffix array, but is there a way to compute it using Python? I searched for a library or some other kind of implementation but didn't come across anything that can be used from Python. Here is an example:

offset    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15   # indexes
Text      a  b  b  a  a  b  b  a  a  a  b  a  b  b  b  $
SA       15  7  8  3  9  4  0 11 14  6  2 10 13  5  1 12   # suffix array
Psi       $  2  4  5 11 13 14 15  0  1  3  7  8  9 10 12   # Psi

Each entry of the Psi array is obtained by a lookup in SA. For example, for index 1 in Psi we take the value at index 1 in SA (it's 7), add 1 to it (7+1 = 8), and find the index in SA that holds 8 (in this case 2), so Psi[1] = 2. For index 2 in Psi we take the value at index 2 in SA (8), add 1 (8+1 = 9), and find the index in SA that holds 9, which turns out to be 4, so Psi[2] = 4, and so on.

Eric Darchis
Steve Jade

1 Answer

You don't really need a library:

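Text = "abbaabbaaababbb"

# Suffix array: suffix start positions sorted by suffix. The extra index
# len(Text) is the empty suffix, which sorts first and stands in for "$".
SA = sorted(range(len(Text) + 1), key=lambda i: Text[i:])

# Inverse suffix array: SAINV[p] is the index in SA where position p appears.
# The extra trailing None slot keeps SAINV[pos + 1] valid for the last position.
SAINV = [None] * (len(SA) + 1)
for i in range(len(SA)):
    SAINV[SA[i]] = i

# Psi[i] is the index in SA of the suffix starting one position after SA[i].
Psi = [SAINV[pos + 1] for pos in SA]

print(SA)
print(SAINV)
print(Psi)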

Yields:

[15, 7, 8, 3, 9, 4, 0, 11, 14, 6, 2, 10, 13, 5, 1, 12]
[6, 14, 10, 3, 5, 13, 9, 1, 2, 4, 11, 7, 15, 12, 8, 0, None]
[None, 2, 4, 5, 11, 13, 14, 15, 0, 1, 3, 7, 8, 9, 10, 12]
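
If you want the same idea as a reusable function, a minimal sketch might look like the one below. The name `psi_array` is made up for illustration, and it keeps the naive suffix sort from above, which is fine for short strings but slows down quickly on long texts. It also checks the rule described in the question: wherever Psi[i] is defined, SA[Psi[i]] should equal SA[i] + 1.

```python
def psi_array(text):
    """Return (sa, psi) for `text` followed by an implicit end marker.

    Hypothetical helper: naive suffix sorting, intended for small inputs only.
    """
    n = len(text) + 1  # one extra position for the end marker
    # The empty suffix at index len(text) sorts first and plays the role of "$".
    sa = sorted(range(n), key=lambda i: text[i:])
    # Inverse suffix array: rank[pos] is the row of sa that holds pos.
    rank = [0] * n
    for row, pos in enumerate(sa):
        rank[pos] = row
    # psi[row] is the row holding sa[row] + 1; the end-marker row has no successor.
    psi = [rank[pos + 1] if pos + 1 < n else None for pos in sa]
    return sa, psi


sa, psi = psi_array("abbaabbaaababbb")
# Check the defining property on every row that has a successor.
assert all(sa[psi[i]] == sa[i] + 1 for i in range(len(sa)) if psi[i] is not None)
print(psi)
```

For the example text this should print the same Psi list as above, with None in the row that corresponds to "$".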
Matt Timmermans