Given a set of strings, I would like to automatically compress each string into a minimum-length regular expression that matches only that string. Two strings should get the same regular expression only if they are identical.
For example:
String 1: ABCCCCCCCCABCCCCCCCCCCBBC = (AB[C]{8}){2}CCBBC
String 2: ABCCCCABCCCCCCCCCCBBC = (AB[C]{4}){2}C{6}BBC
*This is an example of the compression I mean even though it may not be the shortest way of doing it.
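To make the requirement concrete, here is a minimal sketch of how I would check a candidate expression: it must match its own string in full, and it must not match a different string. The class name `RegexRoundTrip` and the helper `describes` are names I made up for illustration; `Pattern.matches` is the standard `java.util.regex` call, which only succeeds if the entire input matches.

```java
import java.util.regex.Pattern;

public class RegexRoundTrip {
    // Hypothetical helper: true only when the regex matches the whole input,
    // because Pattern.matches requires a full-string match.
    static boolean describes(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String s1 = "ABCCCCCCCCABCCCCCCCCCCBBC";
        String r1 = "(AB[C]{8}){2}CCBBC";
        String s2 = "ABCCCCABCCCCCCCCCCBBC";
        String r2 = "(AB[C]{4}){2}C{6}BBC";

        // Each expression should describe its own string...
        System.out.println("r1 describes s1: " + describes(r1, s1));
        System.out.println("r2 describes s2: " + describes(r2, s2));
        // ...and should not describe the other one.
        System.out.println("r1 describes s2: " + describes(r1, s2));
        System.out.println("r2 describes s1: " + describes(r2, s1));
    }
}
```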
Note that the length of the expression matters: there is no need to use B{2} to represent the string BB, as the expression is longer than the string itself.
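The length rule can be sketched as follows. This is only a naive baseline under my own assumptions, not a solution: it collapses runs of a single repeated character and uses the `X{n}` form only when that is strictly shorter than writing the run out. It does not find repeated groups such as `(AB...){2}`, and it assumes the input contains only characters with no special meaning in a regular expression (for example A-Z). The class name `RunLengthRegex` and the method `compress` are made up for illustration.

```java
public class RunLengthRegex {
    // Naive baseline: collapse single-character runs, but only when the
    // counted form X{n} is strictly shorter than the literal run.
    // Assumes the input has no regex metacharacters (e.g. only A-Z).
    static String compress(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            int j = i;
            // Advance j to the end of the current run of c.
            while (j < s.length() && s.charAt(j) == c) {
                j++;
            }
            int run = j - i;
            String literal = s.substring(i, j);
            String counted = c + "{" + run + "}";
            // Keep whichever representation uses fewer characters.
            out.append(counted.length() < literal.length() ? counted : literal);
            i = j;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("BB"));    // prints BB, since B{2} would be longer
        System.out.println(compress("CCCCC")); // prints C{5}
    }
}
```

With this rule a run has to be at least five characters long before the counted form pays off, since `C{4}` is no shorter than `CCCC`.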
Is there an established method for doing this?
A good answer would point to academic work on this problem, with an explanation and/or a solution, whether theoretical or as an implementation. If it is an implementation, I would prefer it to be in Java.