I want to train a second-order Markov model for a nucleotide sequence using Biopython's Bio.MarkovModel.train_visible(). That is, alphabet=["A","T","G","C"], states=["AA","AT","TT"...]
However, I get the following error:
    474     states_indexes = itemindex(states)
    475     outputs_indexes = itemindex(alphabet)
--> 476     for toutputs, tstates in training_data:
    477         if len(tstates) != len(toutputs):
    478             raise ValueError("states and outputs not aligned")
ValueError: too many values to unpack (expected 2)
This suggests that I am probably passing training_data in the wrong format. I've tried giving my training_data as a pair of lists:
training_data=(['A','T'...],['AA','AT'...])
and as a zipped list of that pair of lists:
training_data=[('A','AA'),('T','AT')...]
but to no avail.
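To narrow this down, here is a minimal sketch in plain Python (no Biopython needed) that mimics only the loop shown in the traceback and feeds it the shapes I tried. The helper names check_training_data and try_format and the toy sequences are made up for illustration, and I am assuming the loop at lines 476-478 is the only constraint on the format, which may not be true of the rest of train_visible().

```python
def check_training_data(training_data):
    """Mimic only the unpacking loop from the traceback (lines 476-478)."""
    for toutputs, tstates in training_data:
        if len(tstates) != len(toutputs):
            raise ValueError("states and outputs not aligned")
    return True


def try_format(training_data):
    """Return True if the loop accepts the data, else the error message."""
    try:
        return check_training_data(training_data)
    except ValueError as err:
        return str(err)


# Toy data, purely illustrative: one output symbol per state label.
outputs = ["A", "T", "G"]
states = ["AA", "AT", "TG"]

pair_of_lists = (outputs, states)          # first format I tried
zipped_pairs = list(zip(outputs, states))  # second format I tried
list_of_pairs = [(outputs, states)]        # one (outputs, states) pair per sequence

results = {
    "pair_of_lists": try_format(pair_of_lists),
    "zipped_pairs": try_format(zipped_pairs),
    "list_of_pairs": try_format(list_of_pairs),
}

for name, outcome in results.items():
    print(name, "->", outcome)
```

Judging from this loop alone, each element of training_data seems to need to be a two-item pair of equal-length sequences (outputs first, states second), but that is only my reading of the traceback.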
What is the proper format for training_data?
Thanks!