A Markov chain is composed of a set of states which can transition to other states with a certain probability.
A Markov chain can be easily represented in Neo4J by creating a node for each state, a relationship for each transition, and then annotating the transition relationships with the appropriate probability.
But can you simulate the Markov chain using Neo4J? For instance, can Neo4J be made to start in a given state and then transition from state to state according to the stored probabilities? Can it return a printout of the path that it took through this state space?
Perhaps this is easier to understand with a simple example. Let's say I want to make a 2-gram model of English based upon the text of my company's tech blog. I spin up a script which does the following:
- It pulls down the text of the blog.
- It iterates over every pair of adjacent letters and creates a node in Neo4J for each distinct pair.
- It iterates again over every 3-tuple of adjacent letters and then creates a Neo4J directed relationship between the node represented by the first two letters and the node represented by the last two letters. It initializes a counter on this relationship to 1. If the relationship already exists, then the counter is incremented.
- Finally, it iterates through each node, counts how many total outgoing transitions have occurred, and then creates a new annotation on each relationship of a particular node equal to count/totalcount. This is the transition probability.
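
To make the steps above concrete, here is a minimal in-memory sketch of the same bookkeeping in Python. It is not Neo4J code: a nested dict stands in for the graph (outer keys play the role of nodes, inner keys the role of relationships), and the function name `build_transitions` and the sample string are made up for illustration.

```python
from collections import Counter, defaultdict

def build_transitions(text):
    """Return {state: {next_state: probability}} for 2-letter states.

    Hypothetical helper: the nested dict is a stand-in for the graph,
    not a Neo4J API.
    """
    counts = defaultdict(Counter)
    # Each 3-letter window "abc" is one observed transition "ab" -> "bc".
    for i in range(len(text) - 2):
        src, dst = text[i:i + 2], text[i + 1:i + 3]
        counts[src][dst] += 1  # the per-relationship counter
    # Normalise each state's outgoing counts: count / totalcount.
    probs = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        probs[src] = {dst: n / total for dst, n in outgoing.items()}
    return probs

# Tiny stand-in for the blog text.
table = build_transitions("THE THEME")
for state, outgoing in table.items():
    print(repr(state), outgoing)
```

Note that a pair which only ever appears at the very end of the text has no outgoing transitions, so it gets no entry in the table.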
Now that the Neo4J graph is complete, how do I make it create a "sentence" from my 2-gram model of English? Here is what the output might look like:
IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID PONDENOME OF DEMONSTURES OF THE REPTAGIN IS REGOACTIONA OF CRE.
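
For reference, this is roughly the behaviour I am after, sketched in plain Python outside the database. It is only an illustration of the random walk, under the assumption that the probabilities are available as a nested dict; the names `simulate` and `path_to_text` and the toy table are hypothetical. What I do not know is whether the same walk can be done inside Neo4J itself.

```python
import random

def simulate(transitions, start, steps, rng=random):
    """Random-walk the chain and return the list of visited states."""
    path = [start]
    state = start
    for _ in range(steps):
        outgoing = transitions.get(state)
        if not outgoing:
            break  # dead end: this state has no outgoing transitions
        next_states = list(outgoing)
        weights = list(outgoing.values())
        # Pick the next state in proportion to its transition probability.
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path

def path_to_text(path):
    """Collapse overlapping 2-letter states back into a string."""
    return path[0] + "".join(state[-1] for state in path[1:])

# Hand-written toy table (an assumption, not real blog data).
toy = {
    "TH": {"HE": 1.0},
    "HE": {"E ": 0.5, "EM": 0.5},
    "E ": {" T": 1.0},
    " T": {"TH": 1.0},
    "EM": {"ME": 1.0},
}

path = simulate(toy, "TH", 20, random.Random(0))
print(path)
print(path_to_text(path))
```

Passing a seeded `random.Random` makes a run repeatable; leaving `rng` at its default uses the module-level generator.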