
Today I am trying to create suffix arrays in Scala. I managed to do it with a large amount of code, but then I heard that it can be done in only a few lines using zipping and sorting.

My problem at the moment is the very first step. I tried using binary search and zipWithIndex to create the following "tree", but so far I haven't been able to produce anything. I don't even know whether it is possible in a single line, but I bet it is.

What I want is to get the following Seq from the word "cheesecake":

 Seq((cheesecake, 0),
     (heesecake, 1),
     (eesecake, 2),
     (esecake, 3),
     (secake, 4),
     (ecake, 5),
     (cake, 6),
     (ake, 7),
     (ke, 8),
     (e, 9))

Could someone nudge me onto the right path?

Machavity
Duzzz
  • Thanks a lot to all of you. My code looks much better now :) Till the next time I am stuck – Duzzz May 06 '15 at 13:52
  • You might find this Haskell implementation to be interesting - http://codereview.stackexchange.com/questions/66952/create-suffixes-function-on-list – Kevin Meredith May 07 '15 at 15:35

4 Answers


To generate all the suffixes of a String (or of any other scala.collection.TraversableLike), you can simply use the tails method. Note that the result ends with the empty string:

scala> "cheesecake".tails.toList
res25: List[String] = List(cheesecake, heesecake, eesecake, esecake, secake, ecake, cake, ake, ke, e, "")

If you need the indexes, then you can use GenIterable.zipWithIndex:

scala> "cheesecake".tails.toList.zipWithIndex
res0: List[(String, Int)] = List((cheesecake,0), (heesecake,1), (eesecake,2), (esecake,3), (secake,4), (ecake,5), (cake,6), (ake,7), (ke,8), (e,9), ("",10))
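
Building on the two snippets above, here is a minimal sketch of how the "zipping and sorting" idea from the question might look. The object and method names (SuffixArraySketch, suffixes, suffixArray) are made up for illustration, and dropping the trailing empty suffix with init is an assumption based on the output the question asks for.

```scala
object SuffixArraySketch {
  // All non-empty suffixes paired with their start index.
  // `tails` always ends with "", so `init` drops that last element.
  def suffixes(s: String): List[(String, Int)] =
    s.tails.toList.init.zipWithIndex

  // Sorting the pairs by suffix and keeping only the indices gives
  // the start positions in lexicographic order of their suffixes.
  def suffixArray(s: String): List[Int] =
    suffixes(s).sortBy(_._1).map(_._2)
}
```

For example, SuffixArraySketch.suffixArray("banana") should give List(5, 3, 1, 0, 4, 2). This naive version materializes every suffix as its own String, so it is fine for short inputs but not meant for large ones.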
mutantacule

You're looking for the .scan methods, specifically .scanRight, since you want to start building from the end (i.e. the right-hand side) of the string, prepending the next character at each step (read your pyramid from bottom to top).

Quoting the documentation:

Produces a collection containing cumulative results of applying the operator going right to left.

Here the operator is:

  • Prepend the current character
  • Decrement the counter (the first element built gets "cheesecake".length - 1, and the count goes down from there)

So:

scala> val s = "cheesecake"
s: String = cheesecake

scala> s.scanRight (List[(String, Int)]())
                   { case (char, (stringAcc, count)::tl) => (char + stringAcc, count-1)::tl
                     case (c, Nil) => List((c.toString, s.length-1))
                   }
        .dropRight(1)
        .map(_.head)
res12: scala.collection.immutable.IndexedSeq[(String, Int)] =
           Vector((cheesecake,0),
                  (heesecake,1),
                  (eesecake,2),
                  (esecake,3),
                  (secake,4),
                  (ecake,5),
                  (cake,6),
                  (ake,7),
                  (ke,8),
                  (e,9)
                )

The dropRight(1) at the end removes the empty List[(String, Int)]() (the first argument), which is the seed the scan starts building from and therefore ends up as the last element of the result. (You could instead pass the last e of your string as the seed and iterate over cheesecak, but I find it easier to do it this way.)
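
If the accumulator only needs to carry the suffix itself, the same scanRight idea can be sketched more simply, with the indices attached afterwards. This is a simplified variant rather than the code above; ScanRightSketch and suffixes are illustrative names, and converting to a List first is just a choice that keeps the result type predictable.

```scala
object ScanRightSketch {
  def suffixes(s: String): List[(String, Int)] =
    s.toList
      // Going right to left, prepend each character to the suffix built so far.
      // For "cake" this yields List("cake", "ake", "ke", "e", "").
      .scanRight("")((c, acc) => s"$c$acc")
      // The seed "" ends up as the last element, so drop it.
      .init
      // Suffixes are already in left-to-right order, so the index is the start position.
      .zipWithIndex
}
```

Here the seed is the empty string rather than an empty list, which removes the need for the pattern match and the .map(_.head) step.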

Marth
  • Well yeah, I did see the answer (and upvoted it, because it is much clearer than mine for this problem). But seeing the 'pyramid' in the question made me think of `.scan`, which is a general solution for this type of problem (folding while accumulating the intermediate results), although that isn't required here. – Marth May 06 '15 at 14:43
  • sorry i had posted that under the wrong answer. that is i stole your recommendation for the dropRight and plopped it under kossi's where i meant to plop it. good work :> – Drew May 06 '15 at 14:50

One approach,

"cheesecake".reverse.inits.map(_.reverse).zipWithIndex.toArray

Scala strings are equipped with ordered collection methods such as reverse and inits; the latter delivers an iterator over strings in which each string drops the last character of the previous one. Note that the result also includes the empty string, paired with index 10.
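
If the empty suffix is not wanted, as in the output the question asks for, one possible tweak is to filter it out before zipping. This is only a sketch; InitsSketch and suffixes are names chosen for illustration.

```scala
object InitsSketch {
  def suffixes(s: String): Array[(String, Int)] =
    s.reverse.inits          // prefixes of the reversed string, longest first
      .map(_.reverse)        // reversing each one turns it into a suffix of s
      .filter(_.nonEmpty)    // drop the trailing empty string
      .zipWithIndex          // pair each suffix with its start index
      .toArray
}
```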

elm

EDIT - From a previous suffix question that I posted (from a Purely Functional Data Structures exercise), I believe that suffix should/may include the empty list too, i.e. "" for String. The version below leaves it out.

scala> def suffix(x: String): List[String] = x.toList match {
     |    case Nil             => Nil
     |    case xxs @ (_ :: xs) => xxs.mkString :: suffix(xs.mkString)
     | }
suffix: (x: String)List[String]

scala> def f(x: String): List[(String, Int)] = suffix(x).zipWithIndex
f: (x: String)List[(String, Int)]

Test

scala> f("cheesecake")
res10: List[(String, Int)] = List((cheesecake,0), (heesecake,1), (eesecake,2), 
            (esecake,3), (secake,4), (ecake,5), (cake,6), (ake,7), (ke,8), (e,9))
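
For completeness, a variant that does include the empty suffix mentioned in the EDIT note could be sketched as follows. RecursiveSuffixSketch and suffixesWithEmpty are hypothetical names, and working on the String directly instead of converting to a List is just one way to write it.

```scala
object RecursiveSuffixSketch {
  // Same recursion as above, but the base case returns List("")
  // so that the empty suffix is part of the result.
  def suffixesWithEmpty(x: String): List[String] =
    if (x.isEmpty) List("")
    else x :: suffixesWithEmpty(x.tail)
}
```

This is not tail-recursive, so very long strings could overflow the stack; for those, the tails method is probably the safer choice.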
Community
Kevin Meredith