Algorithm for piecing together a sequence from multiple fragments

Question

I am working on a real-time embedded system. I am trying to create a detailed timing analysis. I have collected runtime data, recording the start and stop time of each interrupt. Each burst of data looks something like this:

 ISR#  time
 -----  ----
  1     34
  end   44
  4     74
  3     80
  end   93
  end   97
  ...

My output channel has limited bandwidth, and my high-precision timer overflows a word very quickly, so I am collecting data in ~150 microsecond bursts, then trickling it out over time. From this data I have been able to collect the time spent in each interrupt, and the number of calls and pre-emptions.

What I would like to do is put together the complete execution sequence for a typical frame, which is ~2 ms long.

It occurs to me that this is almost like a gene-sequencing problem. I have a few thousand fragments, each covering 7% of the total frame. I should be able to line them up - match the portions which cover the same part of the frame - in such a way that I can construct a single sequence of events for the whole period. There will be some frame-to-frame variations, but I am hoping that these can be accounted for in a best-match type of algorithm.

So my question is: What algorithms exist to do this kind of sequencing? Are there any existing tools not targeted at DNA or proteins?

Another angle to pursue would be compressing the record such that you can fit 10x more data in your log. Delta compression for the timestamps, fold the addresses into indexes, use variable length records so that 'end' can be expressed more concisely, etc. — Ben Jackson, Oct 29 '10 at 00:00
That's not a bad idea, but it is pretty well compressed already. Each record uses 4 bits for the ID, and 12 bits for the timestamp. I can only reduce the number of bits per stamp by decreasing the resolution of the timer. Getting it down to 14 bits is possible, but that creates its own data manipulation hassle shipping it through a 16-bit-oriented serial channel. — AShelly, Oct 29 '10 at 17:14
I don't fully understand yet -- are you performing multiple independent runs, each around 2 ms, of which you capture around 150 µs each time? Is the sequence of interrupt starts and stops deterministic (the same for each run) or very nearly so? If so then yes, it's just like DNA sequence assembly; if not I don't see how you could hope to match them up. — j_random_hacker, Oct 30 '10 at 07:46
I am performing 1 long run of a system which has a fixed 2 ms processing frame. The same sequence of events (with minor timing variations due to external interfaces) happens every frame. While it is running, I am capturing hundreds of 150 µs samples of interrupt activity. I am hoping the variations are minor enough that a best-match algorithm can still piece together the fragments. — AShelly, Nov 01 '10 at 20:50

Gintautas Miliauskas · Answer 1 · 2010-10-28T23:54:30.767

Your data seems fairly application-specific, so you may just have to experiment. First see if the order of ISR invocations with interrupt numbers (without timing information) discriminates sufficiently; just take the final N calls of each burst and do a search to find any other bursts with similar fragments near the beginning. You can use any string search algorithm for this task. If too few matches are returned, try a fuzzy search algorithm. If too many matches are returned, try a smarter matching algorithm that also weighs each match by the similarity of timings. Overall this shouldn't be too complicated, since a complete chain is just about 15 bursts, whereas for example in DNA sequencing you need to match up millions of very short fragments.
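
As a rough illustration of the first two steps above (exact matching on the ISR order, then a fuzzy fallback), here is a minimal Python sketch. It assumes each burst has already been decoded into a list of (event, time) pairs, where the event is an ISR number or the string "end". The function name find_successors and the parameters n, max_offset and threshold are made up for this example, and difflib's similarity ratio stands in for whatever fuzzy search you end up choosing.

```python
from difflib import SequenceMatcher


def find_successors(burst, others, n=4, max_offset=2, threshold=1.0):
    """Find bursts whose opening events resemble the final n events of `burst`.

    Timestamps are ignored; only the order of events is compared.
    threshold=1.0 demands an exact match, lower values allow fuzzy matches.
    Returns a list of (index_in_others, offset, score) tuples, keeping the
    best-scoring offset for each candidate burst.
    """
    tail = [event for event, _ in burst[-n:]]
    matches = []
    for index, other in enumerate(others):
        head = [event for event, _ in other]
        best = None
        # Slide an n-event window over the first few positions of the candidate.
        # If the candidate is shorter than n events, the range is empty.
        for offset in range(min(max_offset, len(head) - n) + 1):
            window = head[offset:offset + n]
            score = SequenceMatcher(None, tail, window, autojunk=False).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (index, offset, score)
        if best is not None:
            matches.append(best)
    return matches
```

Calling it with threshold=1.0 gives the plain string-search behaviour; if that returns too few candidates, lowering the threshold makes it tolerant of an extra or missing interrupt. If it returns too many, the ignored timestamps are the obvious next thing to fold into the score, for example by comparing the intervals between matched events.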
