What is the best way to generate a `persistenceId` from a complex key `(String, String)` when neither part of the key is controlled (each may contain any symbol)?

  1. If decoding is required
  2. If decoding is not required (original key stored in actor)
andrey.ladniy

1 Answer

One possibility is to encode the length of the first string of the key as the first characters of the persistenceId. An example in Scala:

import scala.util.Try

type ComplexKey = (String, String)

def persistenceIdFor(key: ComplexKey): String = {
  val (first, second) = key
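  val firstLen = first.length
  // No separator between the two parts: the length prefix alone marks where `first` ends,
  // so splitting the content at `firstLen` recovers both parts exactly.
  s"${firstLen};${first}${second}"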
}

// (?s) lets '.' match line terminators too, since key parts may contain any character
val PersistenceIdRegex = """(?s)^(\d+);(.*)$""".r

// You might not necessarily ever need to get a key for a given persistenceId, but to show the above is invertible
def keyForPersistenceId(persistenceId: String): Option[ComplexKey] =
  persistenceId match {
    case PersistenceIdRegex(firstLenStr, content) =>
      val firstLenTry = Try(firstLenStr.toInt)

      firstLenTry
        .filter(_ <= content.length)
        .toOption
        .map(firstLen => content.splitAt(firstLen))

    case _ => None
  }

Another would be to use escaping, but that can have a lot of subtleties, despite its initially apparent simplicity.
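
To illustrate where the subtleties come from, here is a minimal sketch of the escaping approach. The object name `EscapedKey`, the choice of `|` as separator and `\` as escape character are assumptions made for this example, not anything prescribed by Akka. The decoder has to track escape state character by character, reject dangling or unknown escapes, and insist on exactly one unescaped separator.

```scala
object EscapedKey {
  type ComplexKey = (String, String)

  // Illustrative choices; any two distinct characters would do.
  private val Escape = '\\'
  private val Separator = '|'

  // Prefix every escape or separator character in a part with the escape character.
  private def escape(part: String): String = {
    val sb = new StringBuilder
    part.foreach { c =>
      if (c == Escape || c == Separator) sb.append(Escape)
      sb.append(c)
    }
    sb.toString
  }

  def encode(key: ComplexKey): String =
    escape(key._1) + Separator + escape(key._2)

  // Returns None for anything that encode could not have produced.
  def decode(id: String): Option[ComplexKey] = {
    val parts = scala.collection.mutable.ArrayBuffer(new StringBuilder)
    var escaped = false
    var valid = true
    id.foreach { c =>
      if (escaped) {
        // Only the escape and separator characters may follow an escape.
        if (c == Escape || c == Separator) parts.last.append(c) else valid = false
        escaped = false
      } else if (c == Escape) {
        escaped = true
      } else if (c == Separator) {
        parts += new StringBuilder // an unescaped separator starts the next part
      } else {
        parts.last.append(c)
      }
    }
    if (valid && !escaped && parts.size == 2) Some((parts(0).toString, parts(1).toString))
    else None
  }
}
```

Compared with the length prefix, this roughly doubles the code on the decoding side, and every key character that collides with the separator or escape character lengthens the resulting ID.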

The specific Akka Persistence backend in use may enforce restrictions on the persistenceId (e.g. length of IDs).

Levi Ramsey
  • Thank you for the example and the comment about `the specific Akka Persistence backend`. What about "key is not controlled (may contain any symbol)"? I think these parts must be encoded, but how? Is `Base64URL` the right choice? The persistenceId will depend on Cassandra's PK length restrictions. – andrey.ladniy Sep 30 '20 at 03:34
  • I also thought about using a hash as the persistenceId, but then the actor would hold a list of original PKs, and each should have its own state. – andrey.ladniy Sep 30 '20 at 03:41
  • To my knowledge, anything you can put in a string is a valid character in the common Akka Persistence backends, so there's no real reason IMO to base64 it (which would tend to lengthen the strings if nothing else). By default, Akka Persistence Cassandra uses `text` as the column type for `persistenceId`, so that's a 2GB limit (a little less than that since the primary key includes some smaller columns). – Levi Ramsey Sep 30 '20 at 12:24
  • As for using a hash, you could do that, especially if you're storing what's being hashed in the actor's persistent state. Just be careful to not use the `hashCode` or `##` methods, as those are not guaranteed to be consistent: explicitly choosing something like murmur3 as a hashing algorithm and sticking with it for nearly all time is what you'd need to do in that situation. – Levi Ramsey Sep 30 '20 at 12:28
  • Of the common Akka Persistence backends, the tightest `persistenceId` length limit I know of is JDBC on Postgres, which has a 255 character limit. – Levi Ramsey Sep 30 '20 at 12:31
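
For the case where decoding is not required, the hashing idea from the comments could be sketched roughly as follows. This sketch swaps in SHA-256 from the JDK's `MessageDigest` rather than murmur3, purely because it ships with the JVM and its output is fixed by a standard; the object name `HashedPersistenceId` and the length-prefixed framing of each part are assumptions of this example. The result is always 64 hex characters, which fits under the 255 character limit mentioned above.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

object HashedPersistenceId {
  // Hypothetical helper: derives a fixed-length, stable ID from both key parts.
  def apply(first: String, second: String): String = {
    val digest = MessageDigest.getInstance("SHA-256")

    // Feed each part as (4-byte length, UTF-8 bytes) so that
    // ("ab", "c") and ("a", "bc") produce different digests.
    def update(part: String): Unit = {
      val bytes = part.getBytes(UTF_8)
      digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array())
      digest.update(bytes)
    }

    update(first)
    update(second)

    // Render the 32 digest bytes as lowercase hex.
    digest.digest().map(b => "%02x".format(b & 0xff)).mkString
  }
}
```

Because the hash cannot be inverted, the original key would need to be stored in the actor's state, as the question's second case already assumes.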