2

I want to write a function that outputs a suffix array. This is what I have so far:

def suffixArray(s):
    sa = []
    for i in range(len(s)):
        suffix= sorted([s[i:]])
        sa = [len(s)-len(suffix[i:])
    return list(sa)

This outputs an error because I think I'm missing an additional if statement but I'm not really sure how to go about it. And yes, I know that there are probably easier ways to obtain a suffix array but I'm a beginner in python and there are few functions that I can use. Any help is appreciated. Thanks

Also here's an example of what I want my input and output to be: input --> suffixArray('banana') output--> [5, 3, 1, 0, 4, 2]

  • 1
    what does each number in output reprsent? – sam46 Feb 12 '19 at 03:27
  • 1
    Your question isn't entirely clear. Apparently you're looking to output the indices of a suffix array, but not the actual suffix array? That would look like `['a', 'ana', 'anana', 'banana', 'na', 'nana']. – Grismar Feb 12 '19 at 03:27

4 Answers4

5

Apparently you want the index of each suffix after lexicographicly sorting them

s = 'banana'
>>> [t[1] for t in sorted((s[i:],i) for i in range(len(s)))]
[5, 3, 1, 0, 4, 2]

or another way:

>>> sorted(range(len(s)), key=lambda i: s[i:])
[5, 3, 1, 0, 4, 2]
sam46
  • 1,273
  • 9
  • 12
1

For a simple suffix array:

s = 'banana'
sa = sorted([s[i:] for i in range(len(s))])

For an array of suffix indices:

s = 'banana'
usd = {i: s[i:] for i in range(len(s))
sai = [x for x, _ in sorted(d.items(), key=lambda x: x[1])]
Grismar
  • 27,561
  • 4
  • 31
  • 54
1

First, generate an array with suffix pairs: the suffix string, and its number:

suffixes = [(s[i:], i) for i in range(len(s))]

Next, sort this list by the suffix string:

suffixes.sort(key=lambda x: x[0])

Now you can return just the numbers:

return [s[1] for s in suffixes]

Putting it together:

def suffixArray(s):
    suffixes = [(s[i:], i) for i in range(len(s))]
    suffixes.sort(key=lambda x: x[0])

    return [s[1] for s in suffixes]
Nikolas Stevenson-Molnar
  • 4,235
  • 1
  • 22
  • 31
0
def get_suffix_array(str_sample):
    lis = list(str_sample)
    suffix_array = {v:k for k,v in enumerate(["".join(trim_elem) for trim_elem in [lis[-len(str_sample)+idx:] for idx in range(len(str_sample))]])}
    return [suffix_array.get(k) for k in sorted(list(suffix_array.keys()))]

print(get_suffix_array('banana'))

Result: [5, 3, 1, 0, 4, 2]