I am trying to develop a reading evaluator (a subset of a reading tutor such as http://www.cs.cmu.edu/~listen/, which is based on the CMU Sphinx speech recognizer). My evaluator would primarily be used to test prosody (or just fluency for now) in English, but for Indian-accented speech, for which no such tool is available yet.
Specifically, I would present the reader with a reading test consisting of a fixed story of, say, 500-1000 words. The speech would be recorded and analysed for pauses, breaks, pitch, intensity, etc., and a score would then be assigned to the reader based on that analysis.
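To make the pause analysis concrete, here is a minimal sketch of what I have in mind, not a tested method. It assumes the recording is already loaded as a list of float samples in the range -1.0 to 1.0, and the names `frame_rms` and `find_pauses`, the 20 ms frame size, the energy threshold and the 200 ms minimum pause length are all placeholders I made up for illustration. Real speech would probably need a threshold tuned to the microphone and background noise.

```python
import math

def frame_rms(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame of frame_len samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies

def find_pauses(samples, sample_rate, frame_ms=20, threshold=0.02, min_pause_ms=200):
    """Return (start_s, end_s) for each run of low-energy frames that lasts
    at least min_pause_ms. All default values are illustrative assumptions."""
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_rms(samples, frame_len)
    pauses = []
    run_start = None
    # The trailing infinity is a sentinel so that a pause at the very end is closed.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if (i - run_start) * frame_ms >= min_pause_ms:
                pauses.append((run_start * frame_ms / 1000, i * frame_ms / 1000))
            run_start = None
    return pauses

if __name__ == "__main__":
    # Synthetic check: half a second of tone, half a second of silence, then tone again.
    rate = 8000
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate // 2)]
    signal = tone + [0.0] * (rate // 2) + tone
    print(find_pauses(signal, rate))
```

The number and total length of the detected pauses could then feed into the fluency score, alongside pitch and intensity measures.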
Now, here's the catch. For a new language, HTK requires (1) a grammar, (2) a pronunciation model and (3) an acoustic model (which needs training) to be specified beforehand. In my case, since the story is fixed and very small compared to the vast vocabulary of English, I think it might not be necessary to do all of that.
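As a rough illustration of how small the problem is, the vocabulary that the grammar and pronunciation model would have to cover can be pulled straight from the story text. This is only a sketch: `story_vocabulary` is a hypothetical helper name, and the tokenisation rule (lowercase letters with an optional internal apostrophe) is my own assumption, not a format required by HTK or CMU Sphinx.

```python
import re
from collections import Counter

def story_vocabulary(text):
    """Count each distinct word in the story.

    Words are lowercased; an internal apostrophe (as in "don't") is kept.
    This tokenisation rule is an illustrative assumption.
    """
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)

if __name__ == "__main__":
    story = "The cat sat on the mat. The cat didn't move!"
    vocab = story_vocabulary(story)
    # One word per line, sorted, as a starting point for a pronunciation lexicon.
    for word in sorted(vocab):
        print(word, vocab[word])
```

Only the words in this list would need pronunciations, and the grammar could be restricted to the known word order of the story.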
I am a complete beginner in this field, so could somebody point me to: (a) the easiest, lowest-effort way to test this on my own for a quick demo (a skeleton)? (b) Of the three models mentioned above, which ones need to be changed, and how do I develop a reliable, testable prototype for 2-3 fixed stories? (c) Any other help to get me started on this project. Suggestions and criticism would also be highly appreciated.
P.S. Again, please note that we are only going to use English, but with Indian speakers, and our total set of distinct words will be small, about 100-200. I therefore feel that recognition accuracy could be better than with general-purpose tools, at much lower effort (training, grammar models, etc.).
Many thanks.