I am trying to construct an artificial intelligence unit. I plan to do this by first collecting sensory input ('observations') into a short-term working-memory list, continually forming patterns found in this list ('ideas'), and committing those ideas to long-term storage memory when they reach a substantial size, perhaps seven chained observations. For any philosophy folks: this is similar to Locke's An Essay Concerning Human Understanding, but it won't be a tabula rasa — there needs to be an encoded underlying structure.
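To make the pipeline concrete, here is a minimal sketch of the buffer-and-commit loop described above. All names (`Memory`, `observe`, `CHUNK_SIZE`) are placeholders I invented for illustration; the threshold of seven comes straight from the "seven chained observations" idea:

```python
from collections import deque

CHUNK_SIZE = 7  # "perhaps seven chained observations"

class Memory:
    """Sketch of the observation pipeline: a short-term working-memory
    list that commits an 'idea' to long-term storage once it reaches a
    substantial size."""

    def __init__(self):
        self.working = deque()   # short-term working memory
        self.long_term = []      # long-term storage of committed ideas

    def observe(self, obs):
        self.working.append(obs)
        # Once the chained observations reach the threshold, commit
        # them as a single idea and clear the working buffer.
        if len(self.working) >= CHUNK_SIZE:
            self.long_term.append(tuple(self.working))
            self.working.clear()

m = Memory()
for ch in "ABCDABCABCD":
    m.observe(ch)
```

This deliberately ignores the pattern-finding step — it only shows where the consolidation algorithm asked about below would plug in.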
Thus, my question is:
Is there a good algorithm (and where can I find it) for dynamically consolidating, or 'pattern-izing', the longest repeated substrings of this constantly growing observation string? For example: if I have thus far been given ABCDABCABC, I want an ABC idea, a D, and two other ABC ideas; then, if another D is observed and added to short-term memory, I want an ABCD idea, an ABC idea, and another ABCD idea. I don't want a batch algorithm such as longest-repeated-substring, because I would need to rerun it after an arbitrary number of character additions. I think I'd prefer some easily searchable/modifiable tree structure.
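For comparison, the closest well-known incremental scheme I'm aware of is an LZW-style growing dictionary: patterns learned so far are kept in a set (or trie), each new observation extends the current match as far as possible, and when the match breaks, the longest known pattern is emitted and a one-character-longer pattern is learned. This is a sketch of the incremental-consolidation idea, not the exact tokenization described above (its tokens converge toward ABC/ABCD only as repeats accumulate), and the function name and alphabet seeding are my own assumptions:

```python
def incremental_tokenize(stream, alphabet="ABCD"):
    """LZW-style consolidation: the pattern dictionary grows as the
    stream is consumed, so repeated substrings are emitted as
    progressively longer tokens without re-running a batch algorithm
    over the whole history."""
    known = set(alphabet)  # seed with the single observations
    tokens = []
    current = ""
    for c in stream:
        if current + c in known:
            current += c                # extend the current match
        else:
            tokens.append(current)      # emit the longest known pattern
            known.add(current + c)      # learn the extended pattern
            current = c
    if current:
        tokens.append(current)
    return tokens

print(incremental_tokenize("ABCDABCABCD"))
# → ['A', 'B', 'C', 'D', 'AB', 'C', 'ABC', 'D']
```

The `known` set here is exactly the kind of structure a trie or suffix tree would make searchable and extensible in O(1) per character; Ukkonen's algorithm builds a suffix tree online, one appended character at a time, which matches the "arbitrary number of character additions" requirement.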
Does this look like a decent enough solution? http://www.cs.ucsb.edu/~foschini/files/licenza_spec_thesis.pdf. If nothing else, I think other data miners may enjoy it.