How to do Fast Cross-Correlation in Arduino (Part of Real Time Instant Signal Recognizer)

Question

I'm having a hard time to write a code which samples a signal (in pin A0) and does a Real-Time Instant CrossCorrelation with another known signal (which is saved in the flash memory of the Arduino Uno).

My problem is that my code (my equation) includes lots of loops which makes it hard on the arduino to do it in real time and instantly after sampling a signal (using ISR(TIMER1_COMPA_vect)).

Is there any idea of how to implement the cross correlation so it wont take lots of time each sample?

Cross Correlation Equation:

for the record, here is what I have done already:

#define NUM_OF_SAMPLES 64
#define NUM_OF_ETALONS 6

const double* const Etalon_Signals[NUM_OF_ETALONS] PROGMEM = {sinWave1, sinWave2, sinWave3, rectWave4, rectWave5, triWave6};
const double Etalon_Signals_Mean[NUM_OF_ETALONS] PROGMEM = {sinWaveMean1, sinWaveMean2, sinWaveMean3, rectWaveMean4, rectWaveMean5, triWaveMean6};
const double Etalon_Signals_Variance[NUM_OF_ETALONS] PROGMEM = {sinWaveVariance1, sinWaveVariance2, sinWaveVariance3, rectWaveVariance4, rectWaveVariance5, triWaveVariance6};
//-------------- SAMPLED SIGNAL ---------------------
volatile double SignalSamples[NUM_OF_SAMPLES];
volatile int idx = 0;
//-------------- CORRELATION VARS -------------------
// Correlations[NUM_OF_ETALONS] -         array of 6 values which calculate the correlation for particular 'd' (delay) value.
// MaxCorrelationValues[NUM_OF_ETALONS] - array which holds the maximum value of a praticular Etalon signal,
//                                        which I use later to decide which wave is most simmilar to the sampled wave.
//                                        MaxCorrelationValues[k] = the maximum correlation value out of all the 'd' iterations, 
//                                        of Etalon wave number k (look at the formula for better understanding).
// MostSimillarCorrVal -                  Holds the Maximum value of correlation out of 6 Etalons (every Etalon has a maximum correlation point out of d correlations)
//                                        MostSimillarCorrVal will hold the Absulote Maximum out of the 6 Etalons.
// SignalSamplesVar -                     Variance of the Sampled Signal -> Eq: (sum(y[i] - my))
// SignalSamplesMean -                    Mean of the Sampled Signal -> Eq: (my)
// counter -                              used for smart Circular scanning over the ring buffer "SignalSamples".
// maxCorrWave -                          Holds the 2^index of the most simmilar wave out of the etalons.
// SignalSamplesVarTemp -                 a particular substraction of the signal with its mean value -> Eq: (numerator) -> (y(i-d)-my).
volatile double Correlations[NUM_OF_ETALONS] = {0}, MaxCorrelationValues[NUM_OF_ETALONS] = {0}, MostSimillarCorrVal = 0;
volatile double SignalSamplesVar = 0, SignalSamplesMean = 0;
volatile int counter = 0, maxCorrWave = 0;
double EtalonVarianceVal[NUM_OF_ETALONS], SignalSamplesVarTemp;


void setup()
{
  SetTimer1CTCforFreq(SAMPLING_FREQUENCY);
  DDRB |= WAVE_1 | WAVE_2 | WAVE_3 | WAVE_4 | WAVE_5 | WAVE_6;
  pinMode(A0, INPUT);
  for(int i = 0; i<NUM_OF_ETALONS;i++)
    EtalonVarianceVal[i] = pgm_read_float(&Etalon_Signals_Variance[i]);
}

ISR(TIMER1_COMPA_vect)
{
  SignalSamplesVar -= (SignalSamples[idx] - SignalSamplesMean)*(SignalSamples[idx] - SignalSamplesMean);
  SignalSamplesMean = SignalSamplesMean * NUM_OF_SAMPLES - SignalSamples[idx];
  SignalSamples[idx] = analogRead(ANALOG_INPUT_PIN);                        // sampling
  SignalSamplesMean = (SignalSamplesMean + SignalSamples[idx])/NUM_OF_SAMPLES;
  SignalSamplesVar +=  (SignalSamples[idx] - SignalSamplesMean)*(SignalSamples[idx] - SignalSamplesMean);
  // SignalSamplesMean = Mean(SignalSamples, NUM_OF_SAMPLES);                  // Mean of current state calculations.
  // SignalSamplesVar = VarianceSquared(SignalSamples, NUM_OF_SAMPLES);        // Variance of current state calculations.
  for(int delay = -NUM_OF_SAMPLES; delay < NUM_OF_SAMPLES; delay++)         // "shifting" array loop.
  {
    counter = 0;                                                            // counter - an index which runs on SignalSamples array.
    while(counter < NUM_OF_SAMPLES)                                         // loop running on the SignalSamples in circle.
    {
      // SignalSamplesVarTemp = Signal's "Variance" semi-calculation.
      SignalSamplesVarTemp = SignalSamples[(idx + counter - delay)%NUM_OF_SAMPLES] - SignalSamplesMean;
      // Calculating all Cross Correlations with all the six Etalons.
      for(int k = 0; k < NUM_OF_ETALONS; k++)
        Correlations[k] += pgm_read_float(&Etalon_Signals[k][counter]) * SignalSamplesVarTemp; // eMean = 0 always.
      counter++;
    }
    for(int k = 0; k < NUM_OF_ETALONS; k++) // updating all the Etalons.
    {
      Correlations[k] *= Correlations[k];
      Correlations[k] /= SignalSamplesVar * pgm_read_float(&EtalonVarianceVal[k]);        // The Correlation for particular delay.
      // MaxCorrelationValues[k] - Max value of the correlation with Etalon[k]. 
      // At the end we compare which etalon got the biggest correlation value (which is positive).
      MaxCorrelationValues[k] = MaxCorrelationValues[k] < Correlations[k] ? Correlations[k] : MaxCorrelationValues[k]; 
    }
  }
  for(int k = 0; k < NUM_OF_ETALONS; k++)
  {
    if(MaxCorrelationValues[k] > MostSimillarCorrVal)
    {
      MostSimillarCorrVal = Correlations[k];
      maxCorrWave = k;
    }
  }
  PORTB = 1 << maxCorrWave;
  idx = (idx+1)%NUM_OF_SAMPLES;
}

Thanks for those who help

As far as I know, it is always a bad idea to do heavy calculations inside an ISR. In the ISR you just save the critical data, and then you handle that data outside of the ISR. How? It depends on the architecture / design chosen. — virolino, Jul 20 '20 at 06:43
If the numbers are small enough and there is no risk of overflow, you can extend the square-root over the product - and that saves you a lot of processing power and time. — virolino, Jul 20 '20 at 06:46
For sure, that's why there is no sqrt() function in the code. It's still not working well. I dont know how to minimize the amount of loops. — Jhon Margalit, Jul 20 '20 at 08:51
The problem of doing heavy calculations inside the ISR still remains. Anyway, the thing is that I do not see how the code you provided implements the formula above - possibly because correlations are not what I use every day (read: not at all in the last 20+ years). If you explain a bit what you did, I might be more helpful. — virolino, Jul 20 '20 at 09:07

virolino · Accepted Answer · 2020-07-22T11:54:04.783

1

I will try to be generic - the solution still being applicable to your problem.

NOTE: I look to optimize the equation, not the code. I do not understand how the code relates to the equation.

Let's assume you need to only find the sum of the last N samples of the signal.

Let the samples be

x0, x1, ..., x(N-1)

Let the first sum be

S0 = x0 + x1 + ... + x(N-1)

Then the new sample arrives:

xN

The sum being

S1 = x1 + x2 + ... + x(N-1) + xN

But this sum can be rewritten as:

S1 = S0 - x0 + xN

Thus, you only need to remember 3 numbers:

previous / current sum (S0);
oldest sample value (x0);
newest sample value (xN).

Therefore, instead of calculating (N-2) additions, you calculate only 2 (actually only an addition and a subtraction).

If you need to calculate the sum of

x1 - xm, x2 - xm ...

you need to notice that when calculation S1 you will have

S1 = S0 - (x0-xm) + (xN-xm)

which translates to

S1 = S0 - x0 + xm

which saves you adding and subtracting xm unnecessarily.

With the proper modifications, you can have the entire equation calculated in just a few steps, every time a sample arrives.

Except the sqrt() - I have no idea how to help there.

Please note that there will be no heavy work even at the beginning. You start with all samples at value zero. As the real samples come, "the first" S0 is built - which will be used subsequently.

Actually, "the first" S0 will be

S0 = x0

Another optimization would be to calculate

y(i-d) - my

once, and use it twice. It is not much, but still it will help. Also, if you are sure that mx will not change, take it out of the formulas completely. Extracting zero is just time consuming for no benefit.

Another trick, if the application allows, would be decrease the sampling rate. In that way, the samples (and therefore the interrupts) come slower, and the processor has more time to do the job.

edited Jul 22 '20 at 11:54

answered Jul 20 '20 at 09:29

virolino

2,073
5
21

You are correct, but there is nothing new by now. the very first lines in the ISR are doing exactly what you just have said. The sqrt() can be removed since Im squaring the whole equation, so there is no need for sqrt(). my problem is that I dont know if I can remove the first loop which goes over the 'delay' variable (in the equation it is the d). – Jhon Margalit Jul 20 '20 at 10:49
See? You finally started to do the right thing. Please edit the question and add this information there. Additionally, **describe** carefully: the **equations** you intend to use, the **pseudo-code** of what you want to do (without optimizations), the **optimizations** that you have in mind, and the **real-life code**, including the optimizations. After this, we can analyze what you can still do, if anything can still be done. If they are not obvious, also describe how the things above relate to each other. E.g., which variable in the code relates to `mx` in the original equation. – virolino Jul 20 '20 at 11:07
Don't the values of `mx` and `my` change with each new sample? – the busybee Jul 20 '20 at 13:10
Maybe they change, but I do not know how, maybe OP can help... – virolino Jul 20 '20 at 13:49
'''my''' does change, '''mx''' is known (It is the Mean of an Etalon signal which is saved in the Flash memory (and in my case it is 0 too). Since X[] is an Etalon, it is constant, so mx Does Not Change. – Jhon Margalit Jul 21 '20 at 14:12
@JhonMargalit: BTW, are you sure that your processor has enough power to actually do what you want? – virolino Jul 22 '20 at 05:44
@JhonMargalit: what about `my`? what is the formula for it (if it is not constant)? – virolino Jul 22 '20 at 05:46
@virolino `my` is Mean of input signal. The formula is: `SUM(y)/LENGTH_OF_Y`, just an average. but I wont do a loop over that because it is precious time, so what I do is every ISR (interrupt) I'm subtracting the oldest value of `y` from the `my`, and then when I sample at that iteration, I add the sample to the average. this is way faster than doing a loop over the samples of `y` each ISR iteration (which is `O(n)`). And about the processor power, yes, I'm sure it is possible. – Jhon Margalit Jul 22 '20 at 09:50

How to do Fast Cross-Correlation in Arduino (Part of Real Time Instant Signal Recognizer)

1 Answers1