
A page on author with a UUID (jcr:uuid) is activated and its content is replicated to the 3 associated publish servers. The replicated content has a different UUID on each of the 3 publish servers. So, given the same content across all 4 AEM instances (1 author + 3 publish), how can I associate it with something unique?

I'm implementing a solution where I need to associate a unique id that can be mapped to the individual content across all the instances.

Approaches that I've tried till now:

  1. Used the content path to generate a unique id by removing the '/' and '-' characters. The issue: for some paths the result can exceed 128 chars, which is the limit the service accepts for a unique id.

  2. Generating a unique id programmatically would work, but how can I use it to trace back to the content? I cannot store this programmatically created id on jcr:content and activate the page, because replicating the page also changes the activation date, which is important metadata for the content.

What can be the most feasible solution for the use case? Kindly help with suggestions and possible solutions.

1 Answer


You could use a hash of the content path. The easiest way to get a hash is hashCode(). For compactness, use the Base64 representation of the hash bytes and truncate it after a predetermined number of chars.
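A minimal sketch of the idea, under stated assumptions: it swaps hashCode() (a 32-bit int on a Java String, so collisions are comparatively likely) for a SHA-256 digest, and encodes the digest bytes as URL-safe Base64 without padding. The class name PathId, the method idForPath, and the example paths are hypothetical and not part of any AEM API. Because the id is derived only from the path, it can be recomputed on author and on every publish instance without storing anything on jcr:content.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper: derives a fixed-length id from a JCR content path.
class PathId {

    // Returns the SHA-256 digest of the path as URL-safe Base64 without padding.
    // A 32-byte digest always encodes to 43 characters, well under a 128-char limit.
    static String idForPath(String path) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(path.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Calling PathId.idForPath with the same path (for example the made-up path "/content/site/en/home") returns the same 43-character string on any instance. Truncating the result is possible but raises the collision probability, so it is probably better to keep the full length when the limit allows it.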

Raphael Schweikert
  • If we create a unique id for each page by truncating the hash of its path, could 2 different page paths end up with the same unique id? If the result is always going to be less than 128 chars by default, that would be fine to implement. – Abhishek Sinha Jul 02 '21 at 03:33
  • From what I've read online, 2 different string values (page paths) can have the same hash value. This solution may fail in that case, as one page's metadata mapping could be overwritten by another page with the same hash code. – Abhishek Sinha Jul 02 '21 at 04:05
  • Many hash implementations give you 128 bits, which is much less than 128 chars. Even if you take SHA-512, which gives you 512 bits (64 bytes), you’ll have plenty of space left over. Base64 has an overhead of 33%, so the SHA-512 hash will be 86 chars in length (without padding). Also, yes, hash collisions are a thing, but unless you specifically worry about attacks (and are hashing attacker-controlled data), they are unlikely enough (especially for reasonably short strings like JCR paths) that you shouldn’t need to concern yourself with them. – Raphael Schweikert Jul 02 '21 at 13:21