I have an application that manages a large number of strings. Strings are in a path-like format and have many common parts, but without a clear rule. They are not paths on the file-system but can be considered like so. I clearly need to optimize memory consumption but without a big performance sacrifice.
I am considering 2 options:
- implement a compressed_string
class that stores data zipped, but i need a fixed dictionary and i cant find a library for this right now. I don't want a Huffman on bytes, I want it on words.
- implement some kind of flyweight
pattern on string parts.
The problem looks like a common one and I'm wonder what is the best solution to it or if someone knows a library that targets this issue.
thanks