I have been working on various implementations of Markov chains for a while, and I just want to clarify a generalisation of the chains.
Generation
If I want to generate a sequence of length n, do I simply sample from the initial probabilities, then use the state just generated to pick a row of the transition matrix, and repeat this n-1 times? So if the sample from the initial distribution gives state "A", do I use the "A" row of the transition matrix as the distribution for the next sample?
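That sampling scheme can be sketched as follows; the initial vector and transition matrix here are made-up values for illustration, not taken from any real data:

```python
import random

states = ["A", "C", "G", "T"]

# Hypothetical initial probabilities and transition matrix (each row sums to 1).
initial = [0.25, 0.25, 0.25, 0.25]
transition = {
    "A": [0.1, 0.4, 0.4, 0.1],
    "C": [0.3, 0.2, 0.2, 0.3],
    "G": [0.25, 0.25, 0.25, 0.25],
    "T": [0.4, 0.1, 0.1, 0.4],
}

def generate(n):
    # Sample the first state from the initial distribution...
    seq = random.choices(states, weights=initial, k=1)
    # ...then sample each of the remaining n-1 states from the row of the
    # transition matrix indexed by the previously generated state.
    for _ in range(n - 1):
        seq += random.choices(states, weights=transition[seq[-1]], k=1)
    return "".join(seq)

print(generate(10))
```

So yes: one draw from the initial distribution, then n-1 draws where each draw's weights are the row keyed by the previous state.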
{I have an implementation of Markov chains in R in which, at each iteration, the initial vector is multiplied by the transition matrix, and the transition matrix by itself. Where or when does one apply this matrix multiplication in chain generation? I have been told these products are used for determining the distribution over states after some number of repetitions... but repetitions of what? I just want to generate states for a sequence of a particular length; where does this repetition come in if I am sampling from the transition matrix, based on nucleotide frequencies in the original/input sequence?} - sorted by biziclop below
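For the record, the standard reading of that repeated multiplication is: it is not part of generating a sequence at all. Multiplying the initial row vector by the transition matrix n times yields the distribution over states after n steps of the chain (the "repetitions" are chain steps), which answers "where is the chain likely to be after n steps?" rather than "what sequence did it take?". A minimal sketch in plain Python, again with hypothetical matrices:

```python
# Hypothetical initial distribution and transition matrix (rows sum to 1).
initial = [0.25, 0.25, 0.25, 0.25]
P = [
    [0.1, 0.4, 0.4, 0.1],
    [0.3, 0.2, 0.2, 0.3],
    [0.25, 0.25, 0.25, 0.25],
    [0.4, 0.1, 0.1, 0.4],
]

def step(dist, P):
    # One vector-matrix multiplication: dist @ P.
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# The "repetitions" are chain steps: after n multiplications, the vector
# holds the probability of being in each state at step n.
dist = initial
for _ in range(3):
    dist = step(dist, P)

print(dist)  # still a probability distribution over A, C, G, T
```

Sequence generation only ever needs one row of the matrix at a time, so the two computations serve different purposes.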
Probability of a user-entered sequence
I have seen several implementations here.
Input - "ACGT"
P(ACGT) = P(A) * P(C|A) * P(G|C) * P(T|G)
Does this imply that P(A) comes from the initial/start probabilities, and that the conditional probabilities (P(C|A) etc.) come from the transition matrix?
Or does this imply maximum likelihood estimation, where P(A) = #A's/#nucleotides? And therefore P(C|A) = #(A followed by C) / #A's?
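The two readings coincide when the initial probabilities and transition matrix are themselves maximum likelihood estimates from a training sequence. A hedged sketch (the training string is an assumption for illustration): P(A) is the single-nucleotide frequency, and P(C|A) counts how often A is followed by C, divided by the number of A's that have a successor.

```python
from collections import Counter

train = "ACGTACGGTCA"  # hypothetical training sequence

# Initial/marginal probabilities: #X / #nucleotides.
counts = Counter(train)
initial = {s: counts[s] / len(train) for s in "ACGT"}

# Transition probabilities by MLE: #(X followed by Y) / #(X with a successor).
pair_counts = Counter(zip(train, train[1:]))
from_counts = Counter(train[:-1])
trans = {x: {y: pair_counts[(x, y)] / from_counts[x] for y in "ACGT"}
         for x in "ACGT"}

def seq_prob(seq):
    # P(s1) * P(s2|s1) * P(s3|s2) * ... as in P(ACGT) above.
    p = initial[seq[0]]
    for a, b in zip(seq, seq[1:]):
        p *= trans[a][b]
    return p

print(seq_prob("ACGT"))
```

Note the denominator for P(C|A) is the count of A's that are followed by something, not the raw count of C's.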
If entries in the transition matrix are zero, do we use Laplacian estimates or other forms of pseudocounts to combat this?
If so, where does one apply the pseudocounts? Does each entry of the transition matrix get an extra count? If we use the transition matrix to generate the probabilities, then the pseudocounts would have to be added there... no?
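The usual convention (an assumption about your setup, but standard add-one/Laplace smoothing) is to add the pseudocount to every cell of the raw count matrix before normalising the rows, not to the already-normalised probabilities. A sketch with a hypothetical training string that never exhibits certain transitions:

```python
from collections import Counter

train = "AACCA"  # hypothetical: e.g. A->G is never observed
states = "ACGT"

pair_counts = Counter(zip(train, train[1:]))

def smoothed_trans(alpha=1.0):
    # Add the pseudocount alpha to EVERY cell of the count matrix,
    # then renormalise each row; zero counts become small but nonzero.
    trans = {}
    for x in states:
        row = {y: pair_counts[(x, y)] + alpha for y in states}
        total = sum(row.values())
        trans[x] = {y: row[y] / total for y in states}
    return trans

trans = smoothed_trans()
print(trans["A"]["G"])  # nonzero despite A->G never being observed
```

Because the smoothing happens at the count stage, the same smoothed matrix then serves both purposes consistently: scoring user-entered sequences (no zero factors in the product) and generating new sequences (every transition has some chance of being sampled).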
A discussion would be helpful; no code or mathematics needs to be given.